Why Docker Is Not Enough

Testing changes against a production app is hard because cloning the app can be a cumbersome and slow process. Common ways to clone an app are snapshotting and restoring a VM, tar/scp/untar of files, and backing up and restoring databases, and any of these steps may also require you to go through IT. This problem is compounded in data-driven apps, which must be cloned more frequently to prevent the buildup of stale test state. Testing against production may seem unjustifiable, but when time and manpower are tight, it happens. Using Docker and Delphix, we have investigated a simple and fast process to clone production apps that is generally applicable to a variety of applications.

For our investigation, Chris Kast, Dan Tehranian, Rahul Nair, Srini Dandu and I improved the cloning workflow for JIRA, a defect tracking tool.

Overview

A common JIRA installation consists of three pieces: a PostgreSQL database, a JIRA_HOME directory, and the JIRA binaries. The majority of the bug data is stored in the database, while some bug metadata, such as screenshot attachments, is stored in the JIRA_HOME directory along with the initial configuration files. Cloning JIRA requires two broad steps:

  1. Setting up an environment with a JIRA user and the necessary binaries

  2. Replicating both data sources and stitching the configs together

This took hours.

Solution in a Nutshell

With Docker and Delphix, we could Provision a JIRA clone of production in minutes. Difficult tasks such as Upgrade Testing became extremely easy and reliable. Alongside cloning, the technology combination gave us the ability to Refresh, Snapshot, and Rollback each virtual JIRA application.

Docker and Data

Docker is an amazing tool that wraps Linux Containers (LXC) with a developer-friendly API and configuration management. It allows devs to spin up consistent runtime environments with all the dependencies so that apps just work... and it's fast! Because all containers share the same OS kernel, new containers can be created in less than a second.

A simple "docker run jira" brings up a base environment with a clean JIRA test app. But how useful is a blank JIRA instance? You need real data to validate expected behaviour, and the edge cases hiding in production data are typically what cause a failed update or upgrade. In other words, you don't hit bugs on a clean installation! Chris wanted to validate his changes against a test instance, and to do that he needed a production-like environment with real data.

Persistent storage is an active area of development for Docker. Docker data can either be private (living inside a container) or shared (living on the Docker host). Private data lives on Docker's union file system. Docker's storage backend abstraction is based on layers and has several implementations, ranging from vfs (a directory-based implementation where creating a child layer is the equivalent of creating a child directory and deep-copying the parent) to btrfs (a filesystem that uses snapshots to implement layers).

You can find a good overview of the backend implementations here. We had two data sources that we wanted to share between containers and also keep in sync with production. We could have jumped through hoops to create containers responsible for keeping the JIRA_HOME directory and PostgreSQL database in sync, but even so, Docker has no concept of sharing that data with other test containers such that each container gets its own read-write copy. Delphix is made for this.

The Delphix-Docker Workflow

Using Delphix, we linked both the production PostgreSQL database and the JIRA_HOME directory, pulling the data into the Engine. We then provisioned these sources as a Virtual Database (VDB) and Virtual Files (vFiles) to one or more new hosts. Using our Toolkit Hooks, we automated the building of a JIRA container and the config changes required to stitch it all together.

Hours to Minutes

Using the Delphix GUI alone, we took a several-hour cloning process down to just 3 minutes... and 2 of those minutes are spent waiting for the JIRA application to start up! The developer experience consists of using the Delphix GUI to provision the VDB (to any host, since it can be accessed over the network via its JDBC URL) and to provision the JIRA_HOME vFiles to the Docker host.

For the Curious: How We Did It and the Issues Along the Way

Toolkit Internals

Most of the process was automated through the use of a custom Data Platform Toolkit. The resulting process was:

  1. Provision a PostgreSQL Virtual Database (VDB) through the Delphix GUI

  2. Provision Virtual Files (vFiles) of the JIRA_HOME directory to the Docker host through the Delphix GUI

  3. Apply configuration changes to the virtual JIRA_HOME (automated through Toolkit Hooks)

    1. Update the JDBC connection URL, database name, user, and password for the PostgreSQL VDB in the dbconfig.xml file

    2. Remove the .jira_home.lock file

  4. Run the Docker container (automated through Toolkit Hooks), mounting the vFiles into the container as the JIRA_HOME directory

See the Appendix for the Dockerfile and docker run command used.
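
The configuration changes in step 3 could be scripted roughly as follows. This is a minimal sketch rather than the actual Toolkit Hook: the function names are hypothetical, and it assumes that dbconfig.xml keeps its <url>, <username> and <password> elements on separate lines and that the values passed in are already XML-safe.

```shell
#!/bin/sh
# Sketch of the config changes for a freshly provisioned JIRA_HOME.
# Function names are illustrative, not part of Delphix or JIRA.
set -eu

# Escape the characters that are special in a sed replacement
# (backslash, ampersand, and the | delimiter used below).
escape_sed() {
  printf '%s' "$1" | sed 's/[\\&|]/\\&/g'
}

# Usage: configure_jira_home <jira_home> <jdbc_url> <db_user> <db_password>
# The database name is carried inside the JDBC URL.
configure_jira_home() (
  cfg="$1/dbconfig.xml"
  url=$(escape_sed "$2")
  user=$(escape_sed "$3")
  pass=$(escape_sed "$4")

  # Rewrite the connection settings via a temp file, then copy the result
  # back in place so the original file's owner and mode are preserved.
  sed \
    -e "s|<url>.*</url>|<url>${url}</url>|" \
    -e "s|<username>.*</username>|<username>${user}</username>|" \
    -e "s|<password>.*</password>|<password>${pass}</password>|" \
    "$cfg" > "$cfg.tmp"
  cat "$cfg.tmp" > "$cfg"
  rm -f "$cfg.tmp"

  # Remove the lock file that was copied over from the production JIRA_HOME.
  rm -f "$1/.jira_home.lock"
)
```

A hook would call something like configure_jira_home "$vJIRA_HOME" "$VDB_JDBC_URL" "$VDB_USER" "$VDB_PASSWORD" (the last three variable names are placeholders) before starting the container with the docker run command from the Appendix.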

Issue 1: NFS and the Docker Daemon

In the aptly named article "NFS shares and volumes don't mix", we found a description of a problem with the Docker daemon not registering newly created NFS mounts, which leaves an empty bind mount in the container. One solution was to restart the Docker daemon, forcing it to re-parse the existing mounts. However, because we were lucky enough to be on CentOS, which uses systemd (a suite of basic building blocks for a Linux system), there was a MountFlags option available when starting the Docker service. We opted for the MountFlags=private fix. The MountFlags option controls the visibility of Docker mounts with respect to the global mount namespace (some more information here), but in general we found very little information describing this flag.
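
One way to apply this setting is a systemd drop-in file for the Docker service. The sketch below is an assumption about how that could look, not a record of our exact change: it assumes the unit is named docker.service, and the drop-in file name mount-flags.conf and the function name are arbitrary.

```shell
#!/bin/sh
# Sketch: set MountFlags=private for the Docker service via a systemd drop-in.
set -eu

# Usage: write_mountflags_dropin [dropin_dir]
# Defaults to the standard drop-in directory for a unit named docker.service.
write_mountflags_dropin() (
  dir="${1:-/etc/systemd/system/docker.service.d}"
  mkdir -p "$dir"
  printf '[Service]\nMountFlags=private\n' > "$dir/mount-flags.conf"
)

# On the Docker host, as root, the change would then be picked up with:
#   write_mountflags_dropin
#   systemctl daemon-reload
#   systemctl restart docker
```

Keeping the override in a drop-in leaves the packaged unit file untouched, so it should survive a Docker package update.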

Issue 2: NFS and Docker Privileges

We hit permission issues with the bind mount in the container. Using Docker's privileged mode, which exposes many kernel features and device driver access to a container, we were able to get around this on the devicemapper backend (other storage backend implementations may behave differently). Yes, we lost some of the nice security advantages by running in privileged mode, but it's all good for test/dev.

APPENDIX

Dockerfile

FROM durdn/atlassian-base
MAINTAINER Coen Hyde <coen.hyde@gmail.com>
ENV JIRA_VERSION 6.3.15
RUN curl -Lks http://www.atlassian.com/software/jira/downloads/binary/atlassian-jira-${JIRA_VERSION}.tar.gz -o /root/jira.tar.gz
RUN /usr/sbin/useradd --create-home --home-dir /opt/jira --groups atlassian --shell /bin/bash jira
RUN tar zxf /root/jira.tar.gz --strip=1 -C /opt/jira
RUN chown -R jira:jira /opt/jira
# Point JIRA at the directory where the JIRA_HOME vFiles get mounted
RUN echo "jira.home = /opt/atlassian-home" > /opt/jira/atlassian-jira/WEB-INF/classes/jira-application.properties
WORKDIR /opt/jira
EXPOSE 8080
USER jira
# Run JIRA in the foreground so the container stays alive
CMD ["/opt/jira/bin/start-jira.sh", "-fg"]

Docker Command

docker run --privileged=true -p ${PORT}:8080 -v ${vJIRA_HOME}:/opt/atlassian-home jira/jira