Our Top 4 Data-Related DevOps Predictions for 2019
DevOps is no longer just a trendy new idea. It’s now a familiar and proven approach to building applications, one that organizations practice widely and that continues to evolve as enterprises start to adopt DevOps at industrial scale.
With the latest IDC study estimating that the DevOps software market will reach $6.6 billion in 2022, our DevOps experts at Delphix predict the top four data-related trends set to shape the future of DevOps in 2019 and beyond.
Everyone is trying to break the monolith, and it’s not just in the application. As organizations begin to break large monolithic applications into microservices-based architectures, they are seeing the need to re-architect their data repositories too. Data repositories need to move from large, highly normalized databases to datastores that are repositories for the specific subset of data required by a microservice.
In any microservice-based architecture, you end up moving from a classical monolithic data store to several fit-for-purpose datastores, with each microservice team using its own datastore because it needs a specific subset of data.
In order to achieve this, organizations need to better categorize and segment their data to determine what goes into which datastore and what data should be allowed where. Both the data architecture and the governance of the data therefore need to evolve to meet the needs of modern applications.
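To make the idea of categorizing and segmenting data a little more concrete, here is a minimal Python sketch. It is only an illustration: the field names, the sensitivity levels, and the "orders" datastore are all hypothetical, and a real data catalog would be far richer. The point is simply that each datastore receives only the subset of fields its microservice needs, and only if governance rules allow those fields to live there.

```python
from dataclasses import dataclass

# Hypothetical sensitivity levels, ordered from least to most restricted.
LEVELS = {"public": 0, "internal": 1, "restricted": 2}

# Hypothetical catalog: every field from the old monolith gets a classification.
CATALOG = {
    "order_id": "internal",
    "order_total": "internal",
    "email": "restricted",
    "product_name": "public",
}


@dataclass(frozen=True)
class Datastore:
    name: str
    max_level: str     # highest sensitivity this store is allowed to hold
    fields: frozenset  # subset of fields the owning microservice needs


def subset_for(store, record):
    """Return only the fields this datastore needs and is allowed to hold."""
    out = {}
    for field in store.fields:
        level = CATALOG[field]
        if LEVELS[level] > LEVELS[store.max_level]:
            raise PermissionError(f"{field} ({level}) is not allowed in {store.name}")
        if field in record:
            out[field] = record[field]
    return out


orders_store = Datastore("orders", "internal", frozenset({"order_id", "order_total"}))
record = {
    "order_id": 7,
    "order_total": 19.99,
    "email": "a@example.com",
    "product_name": "Widget",
}
orders_subset = subset_for(orders_store, record)
```

In this sketch the governance decision (what data is allowed where) lives in the catalog and the store definition rather than in each microservice, so a request to place a restricted field in a less restricted store fails loudly instead of silently copying the data.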
DevOps for Data Science: Applying Software Engineering Principles to Data
Managing data models is hard. How do you version different models? How do you then correlate each version with the data it was originally trained on? What we’re talking about is managing data as code, particularly in the world of data science. Bringing data models and training data into the picture becomes really interesting because companies are wrestling with difficult problems as these systems become less understandable and less observable. As a result, being able to track, test and version these things effectively becomes much more important.
From a DevOps perspective, you can do this by managing, changing and collaborating around data the way you do with code. Many times, when you’re working with people or teams who are building machine learning projects, they may not know basic software engineering practices. However, applying those principles of managing data the way you do with code can be incredibly valuable.
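As a rough illustration of what managing data the way you manage code might look like, the Python sketch below fingerprints a training dataset with a content hash and records that fingerprint next to the model version. Everything here is an assumption made for the example: the `fingerprint` and `register_model` helpers, the in-memory registry list, and the "churn" model name are hypothetical and do not describe any specific tool.

```python
import hashlib
import json


def fingerprint(rows):
    """Content hash of a dataset: the same rows in the same order give the same id."""
    digest = hashlib.sha256()
    for row in rows:
        # sort_keys keeps the serialization stable regardless of dict key order
        digest.update(json.dumps(row, sort_keys=True).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def register_model(registry, model_name, model_version, training_rows):
    """Record which data version a given model version was trained on."""
    entry = {
        "model": model_name,
        "version": model_version,
        "data_version": fingerprint(training_rows),
    }
    registry.append(entry)
    return entry


registry = []  # stand-in for a real model registry
rows_v1 = [{"x": 1, "y": 0}, {"x": 2, "y": 1}]
rows_v2 = rows_v1 + [{"x": 3, "y": 1}]

first = register_model(registry, "churn", "1.0", rows_v1)
second = register_model(registry, "churn", "1.1", rows_v2)
```

Because the data version is derived from the content itself, any change to the training data produces a new identifier, much like a commit hash does for code, and each model version can be traced back to the exact data it was trained on.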
Moving From DevOps to DevSecOps
While DevOps is all about applying lean principles to accelerate feedback and improve time to value, there are three important dimensions of security in a DevOps enterprise, an approach now being referred to as DevSecOps. The first has to do with securing the perimeter, such as controlling access to your environments, both production and non-production. The second is securing the delivery pipeline itself, which includes eliminating vulnerabilities related to the software supply chain, insider attacks, errors within the development project, and weaknesses tied to the design, code, and integration.
Your security practices must ensure that anyone who has access to the delivery pipeline cannot insert malicious code or maliciously access production data. The third dimension has to do with securing the application itself, ensuring that there is proper identity management to control access to the application and associated data when running in production.
Growing Prominence of Data Privacy
While there has been a lot of focus on integrating security processes within DevOps, data privacy has taken a back seat. Data privacy is all about mitigating risk within the data, independent of who accesses it, whereas security has to do with ensuring that only the right people have access to the data.
For example, consider what happens when a developer decides to collect new location data from a mobile app. What’s the feedback loop in the process, so that customers understand why the app is gathering that data? What are the privacy implications? How do you communicate to users that you’re doing that? What controls do you give them over it? If you do decide to gather and store the data in production, how do you then manage privacy as the data flows into non-production? More importantly, how do you ensure that data is used for the right purposes?
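One common way to manage privacy as data flows into non-production is to mask sensitive fields before the data leaves production. The Python sketch below shows the general idea using a keyed hash from the standard library. It is a simplified assumption-laden example, not a description of any product: the key, the list of sensitive fields, and the record layout are hypothetical, and a truncated keyed hash is pseudonymization rather than full anonymization.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would come from a secrets manager, not source code.
MASKING_KEY = b"example-key-not-for-real-use"

# Hypothetical set of fields treated as sensitive, including the new location data.
SENSITIVE_FIELDS = {"email", "latitude", "longitude"}


def mask_value(value):
    """Replace a value with a keyed hash: consistent across runs, but not readable."""
    mac = hmac.new(MASKING_KEY, str(value).encode("utf-8"), hashlib.sha256)
    return mac.hexdigest()[:16]


def mask_record(record):
    """Return a copy of the record intended for non-production use."""
    return {
        key: mask_value(value) if key in SENSITIVE_FIELDS else value
        for key, value in record.items()
    }


prod_row = {"user_id": 42, "email": "a@example.com", "latitude": 37.77, "longitude": -122.42}
masked_row = mask_record(prod_row)
```

Because the masking is deterministic, the same production value always maps to the same masked value, so joins and test scenarios in non-production still behave consistently while the original values are never copied out of production.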
As DevOps evolves and scales in a world driven by application modernization, the democratization of data for analytics and data science, and growing security and privacy demands, the process by which organizations manage and govern data also needs to shift. Data needs to be freely available in a secure and compliant manner in both production and non-production environments.
Developers, testers and other practitioners or stakeholders in the delivery pipeline need to be able to get the data they need, when they need it and in the form they need it, while that data remains secure and compliant. This is no easy task and will require the focus of organizations as they continue down the DevOps adoption path in 2019 and beyond.