Programmable Data Infrastructure
Nov 16, 2018
This article originally appeared on Forbes.com as part of Delphix CTO Eric Schrock’s ongoing column.
It’s no secret that companies are struggling with digital transformation. Companies know they need to innovate to win, yet 72% of executives feel they are being out-innovated by their competitors, according to PwC’s Innovation Benchmark report. Just when they get their arms wrapped around becoming a software company, they need to start a new journey to become a data-driven enterprise.
Whether you’re a data scientist, software developer or business analyst, you need access to relevant high-quality data. But the industry’s focus on individual systems and applications has spawned a cacophony of voices, silos and processes that inhibit access to that data. Success requires a new approach, and Gartner’s September 2018 Data Management Hype Cycle featured a new entrant rising on the “Innovation Trigger” curve: DataOps. Here's how Gartner defines the term:
"DataOps is a collaborative data management practice focused on improving the communication, integration, and automation of data flows between data managers and consumers across an organization. The goal of DataOps is to create predictable delivery and change management of data, data models and related artifacts. DataOps uses technology to automate data delivery with the appropriate levels of security, quality, and metadata to improve the use and value of data in a dynamic environment_."_
Central to DataOps is the need to align people, processes and technology around the flow of data in the enterprise. Through organizational and technological change, DataOps promises to accelerate innovation by providing everyone ready access to quality data where they need it while maintaining appropriate security and privacy controls.
Early momentum for DataOps has focused heavily on data science and analytics, and for good reason. Data-driven insights are a key way of leveraging data to drive differentiation. But it’s not solely about analytics. A year ago, I wrote about the different areas of friction that DataOps can address, a sentiment echoed in Gartner’s conclusion, which makes no mention of any specific persona, domain or application of DataOps principles.
Whether they know it or not, today every company is a data company. They face an increasingly fast-moving business landscape filled with more data-driven competitors. Data science is the key to real-time insights, but companies are struggling to make the right data available fast enough. Modern enterprises need agile, flexible and responsive data pipelines that can deliver fresh data while adapting to the ever-changing data landscape. Solving this requires more than just transforming and delivering data. It also requires automated tools that make it easy to discover relevant datasets, track and version machine learning data models, manage data preparation and cleansing, and share analytics queries.
Applications are becoming more intelligent and more data-intensive, processing ever-increasing amounts of data, incorporating machine learning and predictive engines and creating highly personalized consumer experiences. But despite the advances in DevOps and cloud that have accelerated software delivery, time to market and the ability to deliver innovation to the business remain hampered by bottlenecks associated with manual, ad hoc mechanisms to secure, copy and move data. DataOps helps automate the delivery of realistic data, creating readily accessible catalogs of test and production data that can be used wherever they are needed during development, resulting in greater velocity and higher quality.
With weekly data breaches, emerging General Data Protection Regulation (GDPR)-inspired regulations such as the California Consumer Privacy Act (CCPA), and the next Cambridge Analytica around the corner, companies cannot risk jeopardizing customer trust with lax data security and privacy. Data privacy and security are now top imperatives for businesses, and meeting regulatory mandates is a key hurdle to overcome.
But the easiest solution, preventing access, is also the most harmful. Teams that can’t get access to the data they need have to either stop innovating or use suboptimal data that puts their project at risk. Comprehensive DataOps helps identify risk as data flows into, across and out of companies and leverages technology approaches like data masking, differential privacy and homomorphic encryption to mitigate risk without losing business value.
There is great synergy around the high-level concepts of enabling enterprise data flow. For example, Harvard Business Review wrote about five key elements of a successful data strategy that closely mirror DataOps, without using the term itself. But there is little commonality at the technology and practice level. Each instantiation uses different language, techniques, tools and principles to explain how to apply DataOps.
This shifting landscape can be daunting for executives. Waiting may seem like the prudent choice, but waiting won’t win the innovation race. Instead, leaders can focus today on organizational readiness: Who understands where data is needed to meet business objectives? How is information flowing across the business today? How can teams collaborate to ensure availability and privacy of data for all consumers? If answering these questions requires getting a dozen people together for a months-long project, you’re not ready for DataOps. Start the conversation today to find the leaders and teams that can orient your company around the flow of information and are willing to break the constraints of legacy systems and processes.
A truly transformational movement doesn’t start with a single company or person articulating the complete answer on day one. It takes many different voices, vendors and practitioners exploring the space and building proven approaches, not just theories. Because DataOps is as much a people and organizational shift as it is a technology one, the problem only gets more complicated. Over the next year, we’ll likely continue to see the expansion of DataOps principles and approaches that further blur the term, ultimately starting to coalesce as proven practices emerge in the industry.
DataOps does not appear to be a passing fad. There is clear value and need, as seen by the dozens of vendors and companies adopting the ideas and analysts predicting its rise as a durable part of the enterprise landscape. The exciting part is that it’s still evolving, and those who lend their voices and ideas to the movement will have an enduring effect on a major industry transformation.