The Digital Transformation Divide

DataOps is the alignment of people, process, and technology to enable the rapid, automated, and secure management of data.

A few days ago at the NASDAQ center in San Francisco, I caught up with MRE CIO Ken Piddington, who also serves as an Executive Advisor to CIOs. “Top of mind with CIOs and IT shops I’m talking to,” said Ken, “is Data Transformation.” In fact, he often hears key players tell him, “I’m part of the Data Transformation Group.” The problem is that Data Transformation has come to mean so many different things to CIOs that it’s hard to define, and even harder to relate new data innovations to their journey.

Digital transformation is a data-driven interconnectedness that impels hyper-awareness, well-informed decision-making, and rapid execution. Within this context, three key innovations are changing the Data Transformation Journey for CIOs:

  • Data is free to roam
    • Applying the principles of DataOps* to Thin/Sparse clones has effectively decoupled Database Servers from their Content. It used to be that moving data (like a 5 TB ERP app) was torturous, requiring lots of time and expertise. But DataOps solutions give Data Scientists, Analysts, Developers and Testers the power to provision fresh, personalized and secure copies of such environments in minutes. The kicker is that these copies are mobile and untethered from the Data Producer. Moving my 5 TB ERP from Amazon to Azure can be accomplished in 10 minutes. In fact, such solutions make it simple both to cross the cloud boundary and to move it. That’s powerful.
  • Data Encapsulation amps up our velocity
    • We’re realizing in the data community what developers knew all along: just as encapsulation untangled code and made it far easier to scale, encapsulating data together with the controls it needs is delivering massive scale for Data Consumers. By embedding data controls at “dataset creation time”, Data Operators (who want to make sure secure data never gets out) can set access, portability, masking, and a whole host of other controls that persist with the dataset. This untethers Data Operators from Data Consumers. With security in place and persistent, Data Consumers use the data where they want, move it where they want (within the rules), and never have to go back for permission. It seems simple, but the request-to-provision step of our Data Supply Chain is often the slowest, most cumbersome, and most bottleneck-prone part of the application delivery cycle for almost everyone who builds applications.
  • Data Synchronicity is a lot less expensive
    • Many make a distinction between “physical” transformations (like converting from Unix to Linux) and “logical” transformations (such as you might do with your ETL). But the dirty little secret of ETL (and of MDM, for that matter) is that a huge chunk of the time spent goes to time logic (e.g., how can I put data from sources A, B, and C in the right order when they arrive out of order?). DataOps solutions also contain features that place the entire histories of many datasets at your fingertips. Yes, you can ask for the content of Sources A, B, and C as it looked at the same point in time (not the time you received the file). All the effort to massage data so that it matches up in time is simply unnecessary if you control the time pointer. Again, it seems simple, but the reset-to-a-common-point step of our Data Supply Chain is another cumbersome, slow, and involved process that drags on our application delivery cycle.
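To make the "control the time pointer" idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular DataOps product works: the `VersionedDataset` class, its `record` and `as_of` methods, and the `synchronized_view` helper are all hypothetical names invented for this example, and timestamps are plain integers.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class VersionedDataset:
    """Hypothetical dataset that keeps every snapshot with its timestamp."""
    name: str
    times: list = field(default_factory=list)      # snapshot times, kept sorted
    snapshots: list = field(default_factory=list)  # content captured at each time

    def record(self, timestamp, content):
        # Insert in timestamp order, so snapshots that arrive late still land
        # in the right place in the history.
        i = bisect_right(self.times, timestamp)
        self.times.insert(i, timestamp)
        self.snapshots.insert(i, content)

    def as_of(self, timestamp):
        # Latest snapshot taken at or before the requested time, else None.
        i = bisect_right(self.times, timestamp)
        return self.snapshots[i - 1] if i else None


def synchronized_view(datasets, timestamp):
    """Return each dataset's content as it looked at one common point in time."""
    return {d.name: d.as_of(timestamp) for d in datasets}


# Assumed example data: source A arrives out of order, C has no history yet at t=3.
a, b, c = VersionedDataset("A"), VersionedDataset("B"), VersionedDataset("C")
a.record(3, "a@3")
a.record(1, "a@1")
b.record(2, "b@2")
b.record(5, "b@5")
c.record(4, "c@4")

view = synchronized_view([a, b, c], timestamp=3)
```

Because each source is queried by event time rather than arrival time, the consumer never has to reorder the feeds by hand; asking for a different moment is just a different `timestamp` argument.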

Data Interconnectedness poses challenges we don’t yet understand. What we do know is that 84% of companies fail at digital transformation. They fail because they believe data mobility is still hard. They fail because they still operate as though data is anchored to and bounded by the vendor’s server on which it is stored, or because they fear data leakage from security controls that are only loosely coupled to the data. And they have yet to take advantage of the simplification DataOps solutions can bring to complex, composite applications. The old adage is still true: when you don’t know what to manage, you manage what you know.

New Destinations for your Data Journey

For CIOs just learning about DataOps, there are clear benefits for their journey to digital:

  • DataOps solutions give you the power to commoditize cloud providers, and make the cloud boundary fluid.
    • Since your dataset is mobile and secure and decoupled, there’s no reason you can’t move it seamlessly and quickly from Amazon to Azure in minutes. Moreover, you can decide to move a dataset from your prem up to the cloud or from the cloud back to prem in minutes. Switching costs have fallen dramatically, and cloud vendor lock-in can be a thing of the past.
  • DataOps solutions kill the friction between Data Producers and Data Consumers, making App Development and tasks like Cloud Migration much faster
    • The security and process bottlenecks your Developers, Testers, Analysts and Data Scientists experience accessing the data they need will diminish dramatically. Setting masking and access controls at creation time keeps Data Consumers in a safe space. Giving Data Consumers direct control over all of the usual operations they want to do (rollback, refresh, bookmark, etc.) cuts those requests to your infrastructure team to near zero. Applications move forward at the speed of developers and testers, not the speed of your control process. Longitudinal studies show this can result in a 30-50% increase in application delivery velocity.
  • DataOps solutions also amp up the velocity of composite applications.
    • A lot of times, it doesn’t matter how fast you can deliver one app; it’s how fast you can deliver them all. By giving you time-synchronized access to not just one but many datasets, DataOps solutions make all sorts of problems disappear. You can create an end-to-end test environment for your 40 applications and it can be up in hours, not months. You can roll the whole thing back. You can have all the fresh data you need to feed your ETL or your MDM or your data lake on command. Data Virtualization makes those datasets not only fast and mobile but cheap too.
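The benefit of "setting masking and access controls at creation time" can be sketched as follows. This is a hypothetical Python illustration, not the API of any real DataOps product: `DataPolicy`, `GovernedDataset`, and the user, location, and field names are all assumptions made up for the example. The point it shows is that the policy is fixed once by the Data Operator and then travels with every copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPolicy:
    """Hypothetical controls fixed by the Data Operator at creation time."""
    allowed_users: frozenset
    allowed_locations: frozenset
    masked_fields: frozenset


@dataclass(frozen=True)
class GovernedDataset:
    """Hypothetical dataset whose policy is embedded and cannot be reassigned."""
    rows: tuple
    policy: DataPolicy
    location: str

    def read(self, user):
        if user not in self.policy.allowed_users:
            raise PermissionError(f"{user} may not read this dataset")
        # Mask sensitive fields on the way out; the stored rows are unchanged.
        return [
            {k: ("***" if k in self.policy.masked_fields else v)
             for k, v in row.items()}
            for row in self.rows
        ]

    def move(self, user, destination):
        if user not in self.policy.allowed_users:
            raise PermissionError(f"{user} may not move this dataset")
        if destination not in self.policy.allowed_locations:
            raise PermissionError(f"{destination} is outside the allowed locations")
        # The copy carries the very same policy, so no new approval is needed.
        return GovernedDataset(self.rows, self.policy, destination)


# Assumed example: the operator sets the rules once, at creation time.
policy = DataPolicy(
    allowed_users=frozenset({"dev"}),
    allowed_locations=frozenset({"aws", "azure", "on-prem"}),
    masked_fields=frozenset({"ssn"}),
)
erp = GovernedDataset(({"name": "Ann", "ssn": "123-45-6789"},), policy, "aws")

# The consumer moves and reads the data without going back for permission.
erp_on_azure = erp.move("dev", "azure")
masked_rows = erp_on_azure.read("dev")
```

Moves within the rules succeed without a ticket to the infrastructure team, while a move to a location outside `allowed_locations` raises `PermissionError`, which is the "within the rules" boundary in code form.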

DataOps is disrupting our assumptions and our approaches to Data Transformation. And it’s the right concept to help those folks in the “Data Transformation Group” cross the digital divide.

* DataOps is the alignment of people, process, and technology to enable the rapid, automated, and secure management of data. Its goal is to improve outcomes by bringing together those that need data with those that provide it, eliminating friction throughout the data lifecycle.