Docker replication using Delphix Application Data

The open-source Docker platform has recently been making waves in the virtualization and cloud world for its write-once, run-anywhere style of distribution. Docker gives users the ability to create a specific container (environment) for an application and then easily transfer that container to other machines. This transfer can typically be done in two ways: the user can either commit the container to a Docker registry and download it onto the new machine, or move the current container directly between machines (tar/scp/load).
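As a rough illustration of the two transfer paths, the Python sketch below only builds the command lines involved; it does not run them. It assumes the standard `docker commit`, `docker push`, `docker pull`, `docker save`, and `docker load` subcommands along with `scp` and `ssh`. The container, image, archive, and host names are hypothetical placeholders, and the direct path assumes the container has already been committed to an image.

```python
from typing import List


def registry_transfer(container: str, image: str) -> List[List[str]]:
    """Commands for moving a container through a Docker registry."""
    return [
        ["docker", "commit", container, image],  # source: freeze the container as an image
        ["docker", "push", image],               # source: upload the image to the registry
        ["docker", "pull", image],               # target: download the image
    ]


def direct_transfer(image: str, archive: str, target_host: str) -> List[List[str]]:
    """Commands for moving an image between machines as a tar archive."""
    remote_path = f"{target_host}:{archive}"
    return [
        ["docker", "save", "-o", archive, image],               # source: write the image to a tar file
        ["scp", archive, remote_path],                          # copy the tar file to the target
        ["ssh", target_host, "docker", "load", "-i", archive],  # target: load the image from the tar file
    ]


if __name__ == "__main__":
    # Hypothetical names, used only to show the shape of the commands.
    for cmd in registry_transfer("web1", "registry.example.com/team/web:1.0"):
        print(" ".join(cmd))
    for cmd in direct_transfer("team/web:1.0", "/tmp/web.tar", "target.example.com"):
        print(" ".join(cmd))
```

In both cases the full image content crosses the network at least once, which is the cost the rest of this post tries to avoid.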

While these methods are effective, the time it takes to build a new image and/or transfer the existing image can significantly slow the deployment process, especially if bandwidth is constrained. Given this limitation, I decided to investigate whether Delphix could be used to streamline the deployment of Docker containers.  

Delphix Application Data  

One of the new capabilities added in Delphix 4.0 is the Application Data feature, which expands its data management functionality beyond the database. Similar to how the Delphix Engine manages database data, you can now use the Application Data feature to link an arbitrary directory of files and provision a virtual copy of the directory to a target environment.

Each virtual copy uses almost no additional space on the Delphix Engine and requires minimal time to provision. Delphix also provides hooks to orchestrate customizations during both snapshotting and provisioning. This rich feature set made Delphix Application Data the obvious starting point for an integration with Docker.  

Linking an Unstructured Files dSource  

The first step of the integration was to snapshot the Docker image/container store in its entirety. In order to do this I set the Docker storage location (typically /var/lib/docker) to be an Unstructured Files Application Data dSource within the Delphix Engine. This allowed me to both take manual snapshots and set the SnapSync policy to take as many automated snapshots as needed.

Since Docker stores its entire state inside the storage location, copying the entire folder saves the entire environment state, including images, containers, links, layers, etc. One caveat specific to the devicemapper storage backend (the Docker default on Red Hat Enterprise Linux) is that the snapshot can take a few minutes due to the way data is stored. Devicemapper provisions a single 100GB sparse file to serve as the "pool" that stores the individual containers and images. While Delphix is intelligent about not saving the entire pool, it still needs to inspect the file, which may take a little longer than expected.  
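To make the sparse file point concrete, here is a small Python sketch that creates a sparse file and compares its apparent size with the space actually allocated on disk. It is only an illustration of sparse files in general, not of how devicemapper or Delphix handle the pool. It assumes a Linux filesystem that supports sparse files and that `st_blocks` is reported in 512-byte units; the file name `pool` is a placeholder.

```python
import os
import tempfile
from typing import Tuple


def sparse_file_sizes(size_bytes: int) -> Tuple[int, int]:
    """Create a sparse file and return (apparent_size, allocated_bytes)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "pool")  # placeholder name
        with open(path, "wb") as f:
            # Extending the file with truncate() writes no data blocks
            # on filesystems that support sparse files.
            f.truncate(size_bytes)
        st = os.stat(path)
        # Assumption: st_blocks counts 512-byte units (true on Linux).
        return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    apparent, allocated = sparse_file_sizes(100 * 1024 * 1024)
    print(f"apparent size: {apparent} bytes, allocated: {allocated} bytes")
```

A tool that copies such a file naively reads the full apparent size, while a tool that is aware of the holes only needs to store the allocated portion but still has to walk the file to find it.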

Running Directly on Delphix vFiles  

I also tried an alternative method of snapshotting by installing and running Docker containers directly on Delphix vFiles (the filesystem equivalent of a Delphix Virtual DB). I began by creating an empty folder on the source machine and linking it as a dSource inside Delphix using the Unstructured Files feature of Application Data. I then provisioned a vFiles from the empty dSource into the /var/lib/docker location on the same host machine.

This created what is essentially an empty NFS mount on the Delphix Engine. I then started the Docker daemon with the new vFiles location set as the storage location for Docker. This meant that any images and containers created by the Docker daemon would be stored directly on the Delphix Engine. Installing Docker directly on Delphix vFiles gives me several advantages over the first method.
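For readers who mount the vFiles somewhere other than the default location, the sketch below shows one way the storage location could be set. It assumes a recent Docker release, where the daemon reads the `data-root` key from `/etc/docker/daemon.json`; releases from the time of this experiment used the `-g`/`--graph` flag instead. The mount path is hypothetical, and no setting is needed if the vFiles is mounted at `/var/lib/docker` itself.

```python
import json


def with_data_root(daemon_config: dict, data_root: str) -> dict:
    """Return a copy of a daemon.json mapping with the storage location set."""
    updated = dict(daemon_config)      # leave the caller's mapping untouched
    updated["data-root"] = data_root   # directory the daemon uses for images and containers
    return updated


if __name__ == "__main__":
    # "/mnt/delphix/docker" is a hypothetical vFiles mount point.
    config = with_data_root({"debug": False}, "/mnt/delphix/docker")
    print(json.dumps(config, indent=2))
```

The daemon has to be restarted after the configuration changes so that it picks up the new location.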

The primary advantage is the ability of the Delphix File System (DxFS) to take instantaneous snapshots of Docker regardless of the specific storage driver used by Docker. Additional advantages include zero network overhead during the snapshot process, reduced storage needs on the source machine, and minimal storage overhead on Delphix, since the snapshot data would have to be saved to Delphix regardless.  

Provisioning into Target Machine  

The next step was to provision the newly created snapshot from the source into the target machine. I started by locating the correct source snapshot inside the Delphix TimeFlow UI and provisioning it to the /var/lib/docker location on the target machine. I could now start the Docker daemon on the target machine, and it was able to access the images and containers directly off the snapshot without requiring any additional loading steps.

I then restarted the containers, which exactly replicated the Docker instance on the source machine, including the same port numbers, external volumes and inter-container linking. The orchestration hooks provided by Delphix made the process even smoother by allowing me to automatically stop and start the Docker daemon during the refresh. This allowed the daemon to automatically restart the previously running containers and further minimized the downtime during refresh.  
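The ordering that the hooks enforce can be sketched as follows. This is a simplified Python model, not Delphix's hook mechanism: in practice the hooks are scripts the engine runs around the refresh. The `refresh` callable stands in for the Delphix operation, and the `service docker stop`/`start` commands are an assumption about how the daemon is managed on the target host.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], None]


def run_checked(cmd: Sequence[str]) -> None:
    """Default runner: execute a command and raise if it fails."""
    subprocess.run(list(cmd), check=True)


def refresh_with_hooks(refresh: Callable[[], None],
                       run: Runner = run_checked,
                       service: str = "docker") -> None:
    """Stop the daemon, perform the refresh, then start the daemon again."""
    run(["service", service, "stop"])       # pre-refresh step
    try:
        refresh()                           # placeholder for the Delphix refresh
    finally:
        # Start the daemon even if the refresh raised, to limit downtime.
        run(["service", service, "start"])  # post-refresh step
```

Passing the runner in as a parameter makes the sequence easy to dry-run, for example with `run=print`, before pointing it at a real host.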

Root Squashing  

One issue I did run into while running Docker from the vFiles is that the Docker daemon must be run as root, which is not possible due to the default root squashing behaviour on the Delphix Engine. I was able to turn this off in a development version of Delphix, but it can also be done manually. Customers using the current Delphix Engine can work with Delphix Professional Services to have root squash disabled on their vFiles.  
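A quick way to check whether a mount squashes root is to create a file as root and see who ends up owning it. The Python sketch below follows that idea; it is a generic NFS heuristic rather than anything Delphix-specific, and it assumes the usual root squash behaviour of mapping root to an anonymous user, in which case the file is either owned by a non-root uid or cannot be created at all.

```python
import os
import tempfile
from typing import Optional


def new_file_owner(directory: str) -> Optional[int]:
    """Create a scratch file in `directory`; return its owner uid, or None if denied."""
    try:
        fd, path = tempfile.mkstemp(dir=directory)
    except PermissionError:
        return None
    try:
        return os.fstat(fd).st_uid
    finally:
        os.close(fd)
        os.remove(path)


def looks_root_squashed(effective_uid: int, owner_uid: Optional[int]) -> bool:
    """Root is likely squashed if root's new file is not owned by root, or root was denied."""
    return effective_uid == 0 and owner_uid != 0
```

Run as root on the target host, `looks_root_squashed(os.geteuid(), new_file_owner("/var/lib/docker"))` gives a rough answer; when run as a non-root user the check always returns `False` and says nothing about the mount.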

Conclusion  

The results of my investigation show that it is possible to replicate Docker instances across multiple machines in seconds using the Application Data feature of the Delphix Engine. The advantages of this system include the simplicity of not having to write Dockerfiles or manage a private Docker registry, while also being faster than both pulling layered images and pushing compressed images over the network. The orchestration features provided by the Application Data toolkit minimize the downtime during refresh and enable more complex backup and restore procedures if necessary.

The primary downside of this approach is that replication is only possible to substantially similar machines (OS, Docker version, storage backend, etc.), which reduces some of the versatility of Docker containers. However, most enterprise environments are fairly homogeneous, which means that the positives of simple and fast deployment could outweigh the lost versatility.
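A simple pre-flight check along these lines could compare the relevant properties of the source and target before provisioning. The Python sketch below is an assumption-laden example: the field names follow the JSON keys I would expect from `docker info --format '{{json .}}'` on recent Docker releases, and the list of fields is illustrative rather than a complete definition of "substantially similar".

```python
from typing import Dict, List

# Assumed `docker info` JSON keys: OS, Docker version, storage backend.
COMPARED_FIELDS = ("OperatingSystem", "ServerVersion", "Driver")


def compatibility_mismatches(source: Dict[str, str], target: Dict[str, str]) -> List[str]:
    """Return one message per compared field that differs between the two machines."""
    mismatches = []
    for field in COMPARED_FIELDS:
        src_value = source.get(field)
        tgt_value = target.get(field)
        if src_value != tgt_value:
            mismatches.append(f"{field}: source={src_value!r} target={tgt_value!r}")
    return mismatches
```

An empty list means the compared fields match; anything else is a reason to fall back to a registry-based transfer for that target.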

Users can also use Docker registries along with Delphix replication, which would give them the ability to switch between replication techniques as needed. It is a testament to the agility of the Delphix platform that I was able to get such an integration working in just two days without having to make any code changes to either Docker or Delphix. I hope you found this blog post interesting and that it has given you some new ideas on what can be done with Docker and Delphix working together.

If you're a Delphix customer using the Application Data feature, a quick touch point with Delphix Professional Services can get you up and running with Docker. If you have not yet tried Application Data, your sales rep can give a more comprehensive overview. If you have any other ideas on how to use Docker, the Application Data feature or Delphix in general, please do leave us a comment or get in touch with your sales rep directly.

We would love to hear how you manage Docker backup/replication internally and work with you on integrating Delphix into your specific use cases. The Delphix Application Data feature can be used in a variety of ways to streamline workflows and reduce the friction that is often present in delivering data. This Docker use case is just one easy example of that and in the coming weeks we will be posting more examples of how Delphix can help improve process and increase agility within an organization.