Resumable Virtual to Physical

One of the unique features of the Delphix Engine is our Virtual to Physical (V2P) functionality. Most of our features have to do with sucking data into Delphix and then manipulating and delivering the specific data you want, from any point in time, rapidly and easily. V2P is a bit different: it allows customers to copy or 'export' data out of Delphix, from any point in time, and then bring up a physical database.

Why is this useful? Since Delphix stays in-sync with the production database, V2P can be used as part of a continuous data protection strategy for production environments. More concretely: imagine you are one of our many customers who use Delphix to accelerate application development by delivering virtual copies of production environments quickly & easily.

One of the last pieces of the development process is performance testing. V2P is used to create a physical database for benchmarking during UAT, before the new application code is deployed into production. This all sounds great, so what's the problem? The reality is that exporting a large, multi-terabyte database across a network takes time and is at the mercy of all manner of unpredictable outages and hiccups, especially in the network itself.

The chances of an export of a 50TB database over an unreliable network completing without interruption become vanishingly small. Customers can't control the minute-by-minute reliability of their often large networks -- or other hardware resources -- so we turned our attention to improvements in our product to compensate for shortcomings or instability in the environment.

As described earlier, in Delphix Engine 4.0 we re-implemented V2P on top of DSP -- Delphix Session Protocol -- to enable significantly faster data transfer, up to 4x faster in fact. (Shameless plug: DSP is our purpose-built, open source network protocol. Download the code, check it out, and see if it can help you with any network-intensive or distributed applications you are building.)

Moving the data faster across the network certainly helps -- it shrinks the window of vulnerability. But network flakiness or outages and problems with other resources -- the target server, the hardware Delphix itself runs on, etc. -- can still upend an otherwise perfectly good export. What if the target server runs out of disk space or is rebooted? What if the Delphix Engine itself or the VM it resides on is rebooted?
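To make the "window of vulnerability" idea concrete, here is a back-of-the-envelope sketch. It assumes outages arrive independently at a constant average rate (a Poisson model), which is a simplification and not something measured on any real network; the function name and the numbers in the usage note are purely illustrative.

```python
import math


def p_uninterrupted(transfer_hours, outages_per_hour):
    """Probability that no outage occurs while the transfer is running.

    Assumes outages follow a Poisson process with a constant rate, so the
    chance of zero events in a window of length T is exp(-rate * T).
    """
    return math.exp(-outages_per_hour * transfer_hours)


# Illustrative numbers only: the same export at two transfer speeds.
slow = p_uninterrupted(transfer_hours=40, outages_per_hour=0.05)
fast = p_uninterrupted(transfer_hours=10, outages_per_hour=0.05)
print(f"40-hour export: {slow:.0%} chance of no interruption")
print(f"10-hour export: {fast:.0%} chance of no interruption")
```

With these made-up inputs, a 4x faster transfer raises the odds of an uninterrupted export from roughly 14% to roughly 61%. That is a big improvement, but it still leaves a substantial chance of failure.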

Our customers asked if we could do more to guard V2P against the whims of the environment. We listened. In the recently released Delphix Engine 4.2, I improved V2P by adding a degree of robustness to address common failure modes seen by our customers. We call it Resumable V2P.

First, I built a checkpointing mechanism to continually track how much of each database file has been transferred; the checkpoints are stored on-disk on the Delphix Engine. This enabled V2P to track what data it had transmitted already. Second, I built new infrastructure to enable the V2P logic to flag what it considers recoverable errors -- errors from which the export could resume without needing to restart the time-consuming data transfer from the beginning.
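The checkpointing idea can be sketched in a few lines. This is a minimal illustration, not the Delphix implementation: the JSON checkpoint file, the function names, and the `fail_after` test hook are all assumptions made for the example, and a plain local file copy stands in for the network transfer.

```python
import json
import os


def load_checkpoints(path):
    """Return {file name: bytes already transferred}, or {} if none saved."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)


def save_checkpoints(path, checkpoints):
    """Write checkpoints atomically so a crash never leaves a torn file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(checkpoints, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def export_file(src, dst, name, checkpoint_path, chunk_size=4, fail_after=None):
    """Copy src to dst in chunks, recording progress after every chunk.

    Returns the number of chunks sent by this call. fail_after is a
    hypothetical test hook that simulates an outage after that many chunks.
    """
    checkpoints = load_checkpoints(checkpoint_path)
    offset = checkpoints.get(name, 0)
    if not os.path.exists(dst):
        offset = 0  # nothing on the target yet, so start from the beginning
    mode = "r+b" if os.path.exists(dst) else "wb"
    sent = 0
    with open(src, "rb") as fin, open(dst, mode) as fout:
        fin.seek(offset)
        fout.seek(offset)
        fout.truncate()  # drop anything written past the last checkpoint
        while True:
            if fail_after is not None and sent >= fail_after:
                raise ConnectionError("simulated network outage")
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            fout.flush()
            os.fsync(fout.fileno())
            offset += len(chunk)
            checkpoints[name] = offset
            save_checkpoints(checkpoint_path, checkpoints)
            sent += 1
    return sent
```

Calling `export_file` again after a failure picks up at the last recorded offset instead of byte zero. The checkpoint is only advanced after the data has been flushed to the target, so a resume may rewrite at most one chunk but should never skip data.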

For V2P, recoverable errors are network outages (the most common outage type experienced by our customers), running out of disk space on the target, and Delphix Engine or target host outages. When one of these errors is detected at run-time, the infrastructure automatically suspends the V2P operation and displays feedback in the UI as to what error condition was hit. When the user sees a suspended V2P in the UI, they can address the error condition, e.g., plug in the loose network cable or free up disk space.
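One way to picture the recoverable-versus-fatal split is a small state machine around the transfer. The sketch below is hypothetical throughout: `JobState`, `ExportJob`, and `is_recoverable` are invented names, and mapping the error classes above onto Python's built-in exceptions is an assumption made for illustration, not how the Delphix Engine classifies errors.

```python
import errno
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class JobState(Enum):
    RUNNING = "running"
    SUSPENDED = "suspended"
    FAILED = "failed"
    COMPLETED = "completed"


def is_recoverable(exc: BaseException) -> bool:
    """Hypothetical policy: errors after which the export can be resumed."""
    if isinstance(exc, (ConnectionError, TimeoutError)):
        return True  # network outage or an unreachable host
    if isinstance(exc, OSError) and exc.errno == errno.ENOSPC:
        return True  # target ran out of disk space
    return False


@dataclass
class ExportJob:
    transfer: Callable[[], None]  # does the actual data movement
    state: JobState = JobState.RUNNING
    message: Optional[str] = None  # what a UI could show the user

    def run(self) -> JobState:
        try:
            self.transfer()
        except Exception as exc:
            # Recoverable errors suspend the job; anything else fails it.
            if is_recoverable(exc):
                self.state = JobState.SUSPENDED
            else:
                self.state = JobState.FAILED
            self.message = f"{type(exc).__name__}: {exc}"
        else:
            self.state = JobState.COMPLETED
            self.message = None
        return self.state

    def resume(self) -> JobState:
        if self.state is not JobState.SUSPENDED:
            raise RuntimeError("only a suspended job can be resumed")
        self.state = JobState.RUNNING
        return self.run()
```

The point of the design is that a suspended job keeps its state and an explanatory message, so the operator can fix the underlying condition and then continue, while errors outside the recoverable set still fail outright.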

The UI also provides a new "resume" button: once the error has been fixed, the user can click the button, and the V2P logic will read the checkpoints to determine where the data transfer left off and continue from there, without re-transmitting data previously sent.

For me, this work highlights a couple of things we strive for in engineering @Delphix: build products that solve real-world problems and be responsive to our customers' needs.

Our customers asked what we could do to compensate for instability in their environments when running an export. Our first step was to improve performance dramatically by rebuilding V2P to use DSP. Our next step is Resumable V2P, which we feel is a major step forward in this area and one that will increase the value our customers realize when using Delphix.

The feature is in the 4.2 release, which we just announced, so if you are already a customer, all you need to do to use it is upgrade.