Why I Joined Delphix: Opening up the Opportunity of Data
The world of IT continues to undergo massive change. In my opinion, there are three main forces driving change today:
- Cloud Adoption
- Application Modernization
- AI and Machine Learning
As I see these changes (or really, transformations) take hold, begin to mature, and get adopted at scale in large enterprises, the core impediments dampening speed and scale are all in the area of accessing data. How can I, whether I am a data producer or a data consumer, get access to the (right) data I need, when I need it, in the format and location I need it, in a secure and compliant manner, all at minimal cost and lag?
The challenges are mainly due to the fact that enterprises have their data locked inside legacy systems, and data is not easy to move. Unlike code, you cannot just start afresh (even with code, starting afresh is always the last resort). Data is expensive to store, expensive to move and even when the cost can be justified, it’s difficult to move or duplicate.
Data also has tremendous inertia. The root causes of the inertia are not just technology-driven but are also process and culture-driven. Who has access to data, how can one access data and how is that data managed, governed and secured, are all process and culture-driven. Changing culture, as anyone who has led any transformation will tell you, is not easy. Data transformation, like a DevOps transformation, is all about people.
Let's look at each of these forces of change in more detail.
Cloud Adoption
As cloud services become commoditized and democratized, organizations are seeing the value in migrating their applications to the cloud to enhance and optimize their workloads. Massive server farms in data centers, expensive middleware and physical appliances are no longer necessary to deliver the capabilities your applications need. With cloud services, one can acquire the requisite capabilities to deliver non-functional requirements (NFRs) as well as most generic functionality of applications.
This allows developers to focus on the core software code in the application: the organization’s IP that delivers differentiated business value to clients. I am defining cloud adoption not just as using someone else’s servers to turn CapEx into OpEx, but as consuming cloud services in one’s applications, going beyond a lift-and-shift to true cloud transformation. This requires not only re-architecting applications to make them cloud-enabled or cloud-native, but also changing the IT organization’s structure and processes to become a cloud services consumer.
The data access challenge presented here has to do with data gravity. Applications cannot be far from the data. The right subsets of data need to be made available to the application in the cloud, when the application needs them, with minimal latency and time lag. The ideal scenario would be to move all the data to the cloud where your applications live, but that is not realistic.
Not all your apps live in the cloud; it is a hybrid world. The cost of moving data is high, and the cost of storing large volumes of data can be even higher over time. Remember, you are renting storage in the cloud. Securing data in the cloud in a compliant manner presents its own set of challenges.
Application Modernization
With or without cloud adoption in the mix, applications are being modernized. Older legacy systems, or as one of my former clients called them, “heritage” systems, need to be updated or replaced to deliver the business value demanded by today’s mobile-social, always-connected consumers. Disruptors are “Uber-ing” every industry. Every organization is now a software-driven organization.
By application modernization, I am not limiting the scope to refactoring legacy monoliths into microservices or re-architecting applications into cloud-native applications that leverage the cloud services referenced above. I mean modernizing the very capability to deliver applications. That means adopting application delivery life cycles that incorporate Agile and DevOps, moving to a modern, polyglot development platform, reorganizing siloed teams into a Squads-Guilds-Tribe model, and yes, delivering application components as microservices (deployed to containers).
Most of these changes are underway in the majority of organizations. Even laggards are now adopting DevOps, and everyone is looking at microservices and containers. What they are now struggling with is scaling their DevOps adoption beyond the proverbial “two-pizza” teams. They are struggling with the organizational change that needs to be in place to be a truly modern application delivery organization that delivers cloud-native, 12-factor applications at scale. And yes, they’re also struggling with data accessibility, data management, and governance at scale in this new world.
I recently had the opportunity to review the architecture of a massive, complex and highly distributed, cloud-native application that had several hundred microservices orchestrated to deliver fairly sophisticated business services. The application had six different datastores. Even more surprising was that the application architects were able to provide solid justification for each – from one datastore for app-generated transient data caching, to another for storing index data. This is not unique.
However, it does present complex data management and governance challenges. The core system of record data still lives in a large relational database. Data from there needs to be classified, segmented (and micro-segmented) and made available in the right datastore at the right time. Then it needs to be written back to the backend RDBMS, while allowing for the need to address any conflicts that arise between all the duplicate data instances, and of course minimize latency at all times. And did I mention, all of this must be done in a secure, compliant manner?
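One common way to surface the write-back conflicts described above is optimistic versioning: every copy remembers which version of the record it was taken from, and the system of record rejects a write that is based on a stale version. The sketch below is only an illustration under that assumption. The `SystemOfRecord` class, its methods, and the sample key are hypothetical, and an in-memory dictionary stands in for the backend RDBMS.

```python
from dataclasses import dataclass


@dataclass
class Record:
    value: str
    version: int  # incremented on every accepted write to the system of record


class SystemOfRecord:
    """Hypothetical in-memory stand-in for the backend RDBMS."""

    def __init__(self):
        self._rows = {}

    def seed(self, key, value):
        # Load an initial row at version 1.
        self._rows[key] = Record(value, 1)

    def read(self, key):
        # Return a copy, the way a downstream datastore would hold one.
        row = self._rows[key]
        return Record(row.value, row.version)

    def write_back(self, key, new_value, based_on_version):
        """Accept the write only if the caller's copy is still current."""
        current = self._rows[key]
        if current.version != based_on_version:
            return False  # conflict: another copy was written back first
        self._rows[key] = Record(new_value, current.version + 1)
        return True


sor = SystemOfRecord()
sor.seed("customer:42", "alice@example.com")

# Two downstream datastores each take a copy of the same record.
cache_copy = sor.read("customer:42")
index_copy = sor.read("customer:42")

# Both try to write a change back; only the first is based on a current version.
first = sor.write_back("customer:42", "alice@new.example.com", cache_copy.version)
second = sor.write_back("customer:42", "a.smith@example.com", index_copy.version)
```

Here the second write is rejected instead of silently overwriting the first. What to do with a rejected write (retry, merge, or escalate to a person) is a governance decision, which is exactly where process and culture come back in.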
AI and Machine Learning
I leave AI and machine learning, or the all-encompassing buzzword “Big Data,” as the last driving factor to address because this one is the least mature of all three forces of change. At the same time, it’s the one with the most transformative potential. We are headed into a world where models can be trained to make business decisions at the edge, where business rules are not hardcoded, but learned and relearned over time. Machines can gain experience and codify it.
This presents even more unique problems when it comes to data. Of course, the data needs to be secured and masked to prevent wholesale data exposure to the models being trained, which could then be misused (think Cambridge Analytica), but the data also needs to be refined (data is the new oil, after all). Unrefined data can result in biased or even poisoned models. Remember Amazon’s experimental AI recruiting tool, which reportedly learned to penalize résumés from women, or Microsoft’s Tay chatbot, which started posting racist messages after users purposely fed it racist messages and curse words to mess with its learning model? Experience, including bad experience, gets codified.
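To make the masking step concrete, here is a minimal sketch of deterministic masking applied to a record before it reaches a training pipeline. It is an illustration only, not a description of how Delphix or any particular product masks data. The field names, the `SECRET_KEY` value, and the `mask_value` and `mask_record` helpers are all assumptions made for the example.

```python
import hashlib
import hmac

# Assumption: a demo key. A real key would come from a secrets manager.
SECRET_KEY = b"demo-key-not-for-production"

# Assumption: which fields count as sensitive is decided by data governance.
SENSITIVE_FIELDS = {"name", "email"}


def mask_value(value: str) -> str:
    """Replace a value with a keyed hash, truncated for readability."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:12]


def mask_record(record: dict) -> dict:
    """Mask only the sensitive fields and leave the rest untouched."""
    return {
        field: mask_value(str(value)) if field in SENSITIVE_FIELDS else value
        for field, value in record.items()
    }


row = {"name": "Alice Smith", "email": "alice@example.com", "purchases": 7}
masked = mask_record(row)
```

Because the hash is keyed and deterministic, the same input always maps to the same masked token, so masked records can still be joined and counted without exposing the original values to the model.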
As I worked with clients at my prior employer, IBM, the data challenges in these three areas kept rearing their heads. The realization that one needs to expand IT transformation efforts into the data transformation space is dawning upon most organizations as they mature. Data friction, data gravity, or their superset, data inertia (with apologies to Sir Isaac Newton), is becoming the next impediment that needs to be addressed to make any of these transformations successful at scale.
Delphix has solutions that I truly believe can address all these areas. We (ahem) at Delphix call this space DataOps. Ah yes, yes, another spin on the DevOps buzzword. But bringing business value to end-users and clients is the eventual goal, and it requires Dev+Data+Sec+Ops all to be transformed.
This week marks my start at Delphix as the Global Practice Director of Data Transformation. I was at IBM for over 15 years and 8 more before that at Rational Software. I have had the opportunity to be at the cutting edge of transforming application development and delivery for over 20 years. That run started with UML and the Rational Unified Process (RUP), then moved to ClearCase and ClearQuest, the Rational Jazz ALM tools, DevOps with UrbanCode, cloud-native applications with IBM Bluemix and, of late, containers with IBM Kubernetes Service and IBM Cloud Private. It has been a great run. I now head into what I believe is the next cutting-edge transformational area to address: DataOps.