Delphix Continuous Data Integration

"You never develop code without version control, why do you develop your database without it?"

– Alex Soto

Applications are the nexus of the modern enterprise. They simplify operations, speed execution, and drive competitive advantage. Accelerating the application lifecycle means accelerating the business. Organizations depend on rapid iteration as the foundation of agile development. This has led to a rich ecosystem of source code management tools that enable developers to work quickly within a private sandbox and push changes to a shared branch, where continuous integration tools build and validate those changes on an ongoing basis.

This source code management agility has historically been stymied by data dependencies that are not easily moved, refreshed, or managed. This forces shared sandboxes, manual schema management, and stale data that degrades quality and slows development. Over the last decade, momentum has built around concepts such as Evolutionary Database Design and Database Continuous Integration, and an increasingly rich toolchain has emerged. While implementations vary, they share the same basic idea:

Structural (DDL) and data (DML) changes must be managed in concert with source code and applied in a rigorous and automated fashion.

These tools (Flyway and Liquibase being the most prominent) connect to arbitrary databases and manage the contents of the database, but make no attempt to manage the databases themselves. While they are capable of upgrading production data at any version, users typically operate on an empty database and populate it with synthetic data. While this is useful for unit tests that require repeatable results, it is a poor solution for functional, regression, and system test environments. Running those tests on synthetic, partial, or stale data can lead to bugs caught late in the development cycle, requiring costly process resets that could have been avoided if the developer had been using real data in their own sandbox. The result is longer application development cycles, an expensive process to manually push real data earlier into the lifecycle, and slower velocity for the business.

Continuous Integration with Delphix

Delphix is the engine that can accelerate these tools and provide a robust foundation for a new generation of continuous data integration processes. Database continuous integration tools provide the framework for DDL and DML change management; Delphix efficiently provides fresh real data to developers at the point they need it. Imagine a developer writing a DML transformation based on an assumption about the format of some data. All of their manual, unit, and integration tests pass because the synthetic data was built on those same assumptions (or pulled from stale production data), but the application fails in UAT when confronted with an unknown format in the latest production data.

Figure: Bug Impact to Test Cycle Before Delphix

Figure: Bug Impact to Test Cycle After Delphix

Before Delphix, these bugs force the project team into an unpleasant situation: re-run the entire test cycle and lose weeks of development time, or run only a subset of tests after fixing the issue and jeopardize quality by risking bugs that might only be caught through a full test cycle. After Delphix, no one ever notices a bug that never escapes the developer environment.

Database continuous integration tools like Flyway and Liquibase are a perfect match for Delphix-driven provisioning and refresh. Each of these tools applies transformations in a defined order and records which changes have already been applied in a table within the database itself. This allows the software to bring any database up to the current developer version. There are no additional hooks required on the Delphix side, but there are ways Delphix can provide benefits beyond just fresh real data.
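To make the tracking-table idea concrete, here is a minimal sketch in Python using SQLite. It is not how Flyway or Liquibase are implemented; the `schema_history` table name, the `MIGRATIONS` list, and the `migrate` function are all hypothetical names chosen for illustration. The point is only that a database which records its own applied versions can be brought up to date from any starting point, including a freshly provisioned copy of production.

```python
import sqlite3

# Hypothetical ordered transformations: (version, description, SQL).
MIGRATIONS = [
    (1, "create customers",
     "CREATE TABLE customers (id INTEGER PRIMARY KEY, phone TEXT)"),
    (2, "add email column",
     "ALTER TABLE customers ADD COLUMN email TEXT"),
]


def migrate(conn):
    """Apply every migration not yet recorded; return the versions applied."""
    # The tracking table lives inside the database it describes.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_history ("
        "version INTEGER PRIMARY KEY, description TEXT NOT NULL)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_history")}

    newly_applied = []
    for version, description, sql in sorted(MIGRATIONS):
        if version in applied:
            continue  # already run against this database
        conn.execute(sql)
        conn.execute(
            "INSERT INTO schema_history (version, description) VALUES (?, ?)",
            (version, description),
        )
        newly_applied.append(version)
    conn.commit()
    return newly_applied
```

Running `migrate` twice against the same connection applies both changes the first time and nothing the second time, which is the property that lets the same tooling run unchanged against an empty database or a refreshed copy of real data.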

Developer Reset

These DDL and DML transformations are not foolproof. In some cases, it is necessary to write custom product code to run transformations that cannot be accomplished via SQL (such as updating BLOBs), or assemble a particularly tricky set of SQL statements. The developer should be able to test and validate these transformations on real data prior to integrating the changes. With Delphix, developers can tag their branch after all the known transformations have been run, then rapidly iterate on validating new transformations with real data, resetting on failure without needing to refresh or get a new copy of real data.
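The tag, try, and reset loop can be illustrated with a small local stand-in. The sketch below does not use any Delphix API: it assumes an in-memory SQLite database and uses Python's `sqlite3` backup facility to play the role of the tagged state. The function names (`snapshot`, `try_transformation`, `split_phone_naive`, `split_phone_safe`) and the `customers` table are hypothetical.

```python
import sqlite3


def snapshot(conn):
    """Copy a database into a fresh in-memory connection (stand-in for a tag)."""
    copy = sqlite3.connect(":memory:")
    conn.backup(copy)
    return copy


def try_transformation(tagged, transform):
    """Run `transform` on a working copy of the tagged state.

    Returns (working_copy, None) on success or (None, error) on failure.
    The tagged state is never modified, so a failure is an implicit reset.
    """
    work = snapshot(tagged)
    try:
        transform(work)
        work.commit()
        return work, None
    except Exception as exc:
        work.close()  # discard the half-transformed copy
        return None, exc


def split_phone_naive(conn):
    # Assumes every phone number contains exactly one dash.
    conn.execute("ALTER TABLE customers ADD COLUMN area TEXT")
    for cid, phone in conn.execute("SELECT id, phone FROM customers").fetchall():
        area, _number = phone.split("-")  # ValueError on e.g. "555-0100-12"
        conn.execute("UPDATE customers SET area = ? WHERE id = ?", (area, cid))


def split_phone_safe(conn):
    # Second attempt: tolerate extra dashes found in the real data.
    conn.execute("ALTER TABLE customers ADD COLUMN area TEXT")
    for cid, phone in conn.execute("SELECT id, phone FROM customers").fetchall():
        area = phone.split("-", 1)[0]
        conn.execute("UPDATE customers SET area = ? WHERE id = ?", (area, cid))
```

If the tagged data contains a value such as `555-0100-12`, the naive attempt fails partway through, the working copy is thrown away, and the second attempt starts again from the untouched tagged state. With full-size production data, making that reset cheap is the part this sketch cannot show and the part the paragraph above attributes to Delphix.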

Figure: Developer reset

Project Branches

Developers rarely work in isolation. Project or release branches allow developers to share code changes by pushing to a common source code repository. With Delphix-enabled continuous data integration, it's possible for developers to share not just a source code repository, but a data repository as well. While continuous integration tools can apply changes within each developer copy, these transformations may take time to run and consume additional storage, depending on the size of the database and the nature of the transformations. These changes can be automated within a project branch such that developers always get the latest transformed data whenever they refresh.

On refresh of a project branch, a script is invoked that transforms the production data using the latest DDL and DML changes pulled from the project source code repository. Combined with a refresh policy, this can keep the project branch up to date with fresh data without each developer needing to run the same transformations. This process can also be run through hooks whenever a DDL or DML change is pushed to the source code repository.
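As a rough sketch of what such a refresh script might do, the Python below builds and runs two commands: a fast-forward `git pull` of the project repository, followed by the Flyway command-line `migrate` against the refreshed database. Everything specific here is an assumption for illustration: that `git` and `flyway` are installed on the host running the script, that migrations live in a `sql` subdirectory, that credentials are supplied through Flyway's own configuration, and the function names `build_refresh_commands` and `run_refresh_hook`. How the script gets wired into a refresh is not covered here.

```python
import subprocess
from pathlib import Path


def build_refresh_commands(repo_dir, jdbc_url, migrations_subdir="sql"):
    """Return the commands a post-refresh script would run, in order."""
    repo = Path(repo_dir)
    migrations = (repo / migrations_subdir).as_posix()
    return [
        # 1. Pull the latest DDL and DML changes from the project repository.
        ["git", "-C", repo.as_posix(), "pull", "--ff-only"],
        # 2. Apply any pending migrations to the freshly refreshed database.
        ["flyway", f"-url={jdbc_url}",
         f"-locations=filesystem:{migrations}", "migrate"],
    ]


def run_refresh_hook(repo_dir, jdbc_url):
    """Run each command, stopping at the first failure."""
    for command in build_refresh_commands(repo_dir, jdbc_url):
        subprocess.run(command, check=True)
```

Separating command construction from execution keeps the script easy to inspect and dry-run, and the same `run_refresh_hook` call could be triggered from a repository hook when a DDL or DML change is pushed.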

Figure: Project branches

Summary

Agile development requires agile data, and Delphix is the agile data engine of the future. You can sync to your production or source databases, instantly provision virtual databases where they are needed using a minuscule amount of space, and provide each developer an independent copy of the data that can be refreshed and reset on demand. Combined with database continuous integration tools like Flyway and Liquibase, Delphix simplifies the development and testing of structural and data transformations. Providing efficient full data copies through Delphix that are always in sync with source code changes reduces bugs and accelerates application delivery while reducing IT and infrastructure overhead.