Programmable Data Infrastructure

Tapping the Corporate Data Well

Brett Stevens

May 10, 2017

There's much chatter these days about data being the new oil, most of it focused on the deep wells of crude, unrefined data housed by the likes of Google, Facebook, and Amazon. While that framing rightly casts data as the new frontier, the prevailing dialogue unintentionally ignores the significant - and largely untapped - opportunity to innovate with the data that lives within existing legacy systems.

Traditional approaches to providing data to development and testing teams--sometimes called "Test Data Management" (TDM)--have had little success in facilitating the flow of data between legacy systems and systems of innovation. Most IT organizations still rely on slow, indirect processes, which has left supply untapped (inaccessible data) and demand suppressed (developers who resign themselves to the belief that this is just the way things are).

The data supply chain usually looks something like this: application teams define their requirements, a TDM team manufactures data to meet those requirements (sometimes using tools to subset, synthesize, and mask data), and then a team of IT administrators delivers that data to a non-production system. Finally, application teams consume that data, and the process repeats.
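To make those hand-offs concrete, here is a minimal, runnable Python sketch of the supply chain just described. Every name in it (the SOURCE tables, the team functions) is an illustrative placeholder, not the API of any real TDM tool.

    from dataclasses import dataclass, field

    @dataclass
    class Request:
        tables: list                 # what the application team asked for
        sensitive_columns: list      # what must be masked before delivery
        status: str = "filed"
        handoffs: list = field(default_factory=list)

    SOURCE = {  # stand-in for a production database
        "orders":    [{"id": 1, "customer": "Ada", "ssn": "123-45-6789"}],
        "customers": [{"id": 7, "name": "Ada",     "ssn": "123-45-6789"}],
    }

    def tdm_manufacture(req):
        """TDM team: subset the source to the requested tables, then mask."""
        req.handoffs.append("app team -> TDM team")
        data = {t: [dict(row) for row in SOURCE[t]] for t in req.tables}
        for rows in data.values():
            for row in rows:
                for col in req.sensitive_columns:
                    if col in row:
                        row[col] = "***masked***"
        return data

    def admin_deliver(req, data):
        """IT admins: load the dataset into a non-production system.
        In practice this hand-off is a ticket queue measured in days."""
        req.handoffs.append("TDM team -> IT admins")
        req.status = "delivered to non-prod"

    req = Request(tables=["orders", "customers"], sensitive_columns=["ssn"])
    admin_deliver(req, tdm_manufacture(req))
    print(req.status, "| hand-offs:", req.handoffs)

The detail that matters is the handoffs list: each entry is a queue where a request can sit for hours or days, which is exactly the delivery bottleneck discussed next.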

The biggest bottleneck in the data supply chain is data delivery. Because of the multiple manual hand-offs between administrators, the process can take days to complete and, in some cases, results in downtime for development teams.

There's a huge opportunity to free up an illiquid supply of legacy data to fuel front-end development and analytics by attacking the data delivery constraint. Airbnb, Uber, and Netflix all innovated through disintermediation--by removing the middleman between hosts and renters (hotels), freelance drivers and riders (taxis), and movies and movie-watchers (video rental shops), respectively. Similarly, IT organizations can remove the middleman between data and developers (manual data delivery) with a modern data platform that allows for automated, self-service access to data.
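What might "removing the middleman" look like in code? The sketch below assumes a hypothetical data platform that exposes a REST endpoint for provisioning masked copies on demand; the URL, route, and payload fields are invented for illustration and do not correspond to any specific product's API.

    import json
    from urllib import request

    # Hypothetical self-service endpoint; substitute your platform's real API.
    PLATFORM = "https://data-platform.example.com/api"

    def provision_environment(source, masked=True):
        """Developer self-serves a masked copy: no ticket, no admin queue."""
        payload = json.dumps({"source": source, "masked": masked}).encode()
        req = request.Request(
            f"{PLATFORM}/environments",
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with request.urlopen(req) as resp:  # returns in minutes, not days
            return json.load(resp)

    # Example call (commented out because the endpoint above is fictional):
    # env = provision_environment("orders_db")
    # print(env["connection_string"])

The design point is the shape of the interaction: a single API call replaces the entire requirements-ticket-manufacture-deliver loop, which is what makes the access both automated and self-service.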

Bloor suggests that the trend toward greater adoption of DevOps principles, in particular, is driving the need for a mechanism to manage data without the intervention of operations personnel. Fannie Mae is a great example: removing the middleman in data delivery as part of its DevOps approach allowed one development team to cut its release cycles from twelve weeks to just two.

Don't forget about the data in your legacy applications and databases. It's as much a part of the new world as it is of the old. Companies that reshape the TDM function into a broader platform for data will have a significant opportunity to innovate faster than those that don't.