Proposed ZFS Feature: Channel Programs

ZFS provides a huge number of powerful features (snapshots, send/recv, filesystem properties, clones, etc.). Sophisticated applications and utilities have been built on top of these low-level ZFS primitives, though it isn't always easy.

While each individual operation is simple and easy to use, building a correct and efficient higher-level operation out of several elementary ZFS operations (destroy, snapshot, etc.) can be difficult or impossible. To illustrate this point, let's take a look at a few examples.

First, let's consider some existing ZFS functionality: zfs destroy -r, or recursive destroy. Recursive destroy destroys a filesystem and all of its child snapshots. The actual implementation of recursive destroy takes three steps: gather all snapshots of a filesystem, destroy those snapshots, and then destroy the filesystem.

Each of these steps calls into the kernel, and while each is atomic on its own, the aggregate recursive destroy can fail at an intermediate step. For instance, consider the case where the gathering and destroying of child snapshots succeeds, but another process creates a snapshot of the target filesystem before the final destroy operation is issued.

In this case, the recursive destroy will fail because a snapshot of that filesystem still exists. Not only will it fail, but it will leave behind an inconsistent state which is neither the original nor the desired state: the old snapshots are gone, yet the filesystem remains. Even using libzfs_core wouldn't permit a correct implementation of recursive destroy.
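To make the race concrete, here is a toy in-memory model of it. This is a sketch, not ZFS code: the `Pool` class, its methods, and the `interloper` callback are all hypothetical stand-ins for the three separate kernel calls and for the other process that sneaks in between them.

```python
class Pool:
    """Toy stand-in for a pool: maps filesystem name -> set of snapshot names."""

    def __init__(self):
        self.filesystems = {}

    def create(self, fs):
        self.filesystems[fs] = set()

    def snapshot(self, fs, name):
        self.filesystems[fs].add(name)

    def list_snapshots(self, fs):
        return sorted(self.filesystems[fs])

    def destroy_snapshots(self, fs, names):
        self.filesystems[fs] -= set(names)

    def destroy_filesystem(self, fs):
        # Mirrors the rule from the text: a filesystem with snapshots
        # cannot be destroyed.
        if self.filesystems[fs]:
            raise RuntimeError(f"{fs} still has snapshots")
        del self.filesystems[fs]


def recursive_destroy(pool, fs, interloper=None):
    """Three separate steps, each atomic alone, with a gap before the last."""
    snaps = pool.list_snapshots(fs)      # step 1: gather snapshots
    pool.destroy_snapshots(fs, snaps)    # step 2: destroy them
    if interloper is not None:
        interloper()                     # another process runs in the gap
    pool.destroy_filesystem(fs)          # step 3: destroy the filesystem


pool = Pool()
pool.create("tank/home")
pool.snapshot("tank/home", "monday")
pool.snapshot("tank/home", "tuesday")

try:
    recursive_destroy(
        pool, "tank/home",
        interloper=lambda: pool.snapshot("tank/home", "late"),
    )
    outcome = "destroyed"
except RuntimeError as err:
    outcome = f"failed: {err}"

print(outcome)
print(pool.filesystems)
```

After the failure, the model holds `tank/home` with only the `late` snapshot: not the original state (two snapshots) and not the desired one (no filesystem at all).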

What about creating new ZFS operations? 'zfs destroy -p' is a planned addition to ZFS which destroys a filesystem and all of its snapshots, same as 'zfs destroy -r'. However, if any of the snapshots to be destroyed has a clone, 'zfs destroy -p' promotes the latest clone first.

This command has all the consistency problems that recursive destroy does, so implementing it correctly will require some additions to the kernel. However, as it is a new feature, we may want to change its semantics while we are prototyping.

For example, while our original design destroys all snapshots and promotes their clones, if any, what if we decide to instead create and promote clones of any snapshots which do not have pre-existing clones? That would require major modifications and additions to any existing operation-specific kernel code we had already written.

Performance of multi-operation ZFS programs can also suffer, as separate operations are placed into different transaction groups (TXGs) whose commits to disk are separated by seconds. For instance, a 'zfs destroy -R' which obliterates the entire snapshot and clone tree beneath a filesystem has to iteratively submit collections of snapshots and clones to the kernel for deletion as they become valid to delete (i.e. as their own subtrees are deleted).
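A rough way to see how the number of round trips grows is to count deletion batches in a toy dependency tree. The sketch below assumes that a dataset can be deleted only once everything beneath it is gone and that each submitted batch lands in its own TXG; the `destroy_rounds` function and the dataset names are made up for illustration.

```python
def destroy_rounds(children, root):
    """Count the batches needed to delete `root` and everything below it,
    when a node can be deleted only after all of its children are gone."""
    # Collect every node in the subtree rooted at `root`.
    remaining = set()
    stack = [root]
    while stack:
        node = stack.pop()
        remaining.add(node)
        stack.extend(children.get(node, []))

    rounds = 0
    while remaining:
        # A node is deletable once none of its children remain.
        deletable = {
            n for n in remaining
            if not any(c in remaining for c in children.get(n, []))
        }
        remaining -= deletable
        rounds += 1
    return rounds


# filesystem -> snapshots -> clones -> snapshots of clones -> ...
tree = {
    "tank/fs": ["tank/fs@a", "tank/fs@b"],
    "tank/fs@a": ["tank/clone1"],
    "tank/clone1": ["tank/clone1@x"],
    "tank/clone1@x": ["tank/clone2"],
}

print(destroy_rounds(tree, "tank/fs"))  # 5, one batch per level of the tree
```

In this model the batch count equals the depth of the tree, so with commits separated by seconds a deep chain of clones translates directly into a long wait.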

For large trees of snapshots and clones, this means a single 'zfs destroy -R' can span many TXGs and lead to a less-than-responsive CLI or a poorly performing application. Fortunately, a proposed OpenZFS project, ZFS Channel Programs (ZCP), takes a novel approach to supporting efficient, correct, and rapid implementation of composite ZFS operations.

At a high level, a ZFS Channel Program is a collection of elementary ZFS operations issued to the ZFS kernel module to be run as a single, atomically visible operation. While the high-level programming interface is still undecided, the object passed to the kernel will be a tree of control statements and ZFS operations.

For instance, a channel program could be created to recursively destroy a filesystem by nesting a destroy_snapshot operation within an iterate_over_snapshots control statement, followed by a destroy_filesystem operation. Because each of these statements would be evaluated from within the kernel, channel programs can guarantee safety from interference with other concurrent ZFS modifications.

Executing from inside the kernel allows us to guarantee atomic visibility of these operations (improved correctness) and allows them to be performed in a single TXG (improved performance).
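As a thought experiment, the recursive destroy program described above might look something like the sketch below. Everything here is an assumption for illustration: the nested-dict layout, the `ToyKernel` class, and the way statements are interpreted are not the proposed interface, which is still undecided. Only the names `iterate_over_snapshots`, `destroy_snapshot`, and `destroy_filesystem` come from the description above. A single lock stands in for "runs inside the kernel", and changes are staged on a copy so that they become visible all at once or not at all.

```python
import threading

# Hypothetical program tree: a control statement with a nested operation,
# followed by a second operation.
RECURSIVE_DESTROY = [
    {"control": "iterate_over_snapshots", "target": "tank/home",
     "body": [{"op": "destroy_snapshot"}]},
    {"op": "destroy_filesystem", "target": "tank/home"},
]


class ToyKernel:
    """Toy model: filesystem name -> set of snapshot names, behind one lock."""

    def __init__(self, filesystems):
        self.filesystems = {fs: set(s) for fs, s in filesystems.items()}
        self._lock = threading.Lock()

    def snapshot(self, fs, name):
        # Other modifications must take the same lock, so they cannot
        # interleave with a running program.
        with self._lock:
            self.filesystems[fs].add(name)

    def run_program(self, program):
        with self._lock:
            # Work on a copy; if any statement fails, nothing is published.
            staged = {fs: set(s) for fs, s in self.filesystems.items()}
            self._run(program, staged, None)
            self.filesystems = staged

    def _run(self, statements, state, current):
        for stmt in statements:
            if stmt.get("control") == "iterate_over_snapshots":
                fs = stmt["target"]
                for snap in sorted(state[fs]):
                    self._run(stmt["body"], state, (fs, snap))
            elif stmt.get("op") == "destroy_snapshot":
                fs, snap = current
                state[fs].discard(snap)
            elif stmt.get("op") == "destroy_filesystem":
                fs = stmt["target"]
                if state[fs]:
                    raise RuntimeError(f"{fs} still has snapshots")
                del state[fs]
            else:
                raise ValueError(f"unknown statement: {stmt}")


kernel = ToyKernel({"tank/home": {"monday", "tuesday"}, "tank/other": set()})
kernel.run_program(RECURSIVE_DESTROY)
print(kernel.filesystems)  # {'tank/other': set()}
```

Because `snapshot` and `run_program` share the lock, no snapshot can appear between the iteration and the final destroy, which is exactly the gap that breaks the userland version.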

A successful implementation of ZCP will:

  1. Support equivalent or improved functionality and performance for all of the existing ZFS commands.

  2. Facilitate the quick addition of new, useful, and more powerful commands by simply writing a new channel program. Previously this would have required modifications to the kernel. Since the ZCP layer guarantees the atomicity of each channel program, we no longer need to write elaborate and error-prone new kernel code for each new IOCTL.

  3. Provide the same safety guarantees and permission protections that ZFS does currently so that ZFS users don't need to worry about corrupting ZFS state with poorly written channel programs. Additional syntax and structure checks of channel programs will help support this.

  4. Provide a fully backward-compatible ZCP kernel interface. Passing channel programs as nvlists should go most of the way towards achieving this goal.
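The syntax and structure checks mentioned in item 3 could, in spirit, look like the following sketch. It assumes a program is represented as a list of dicts, each carrying either an `op` or a `control` key; that layout, the whitelists, and the `validate` function are hypothetical and only meant to show the idea of rejecting a malformed program before it runs.

```python
# Hypothetical whitelists; the names are taken from the example earlier
# in this post.
KNOWN_OPS = {"destroy_snapshot", "destroy_filesystem"}
KNOWN_CONTROLS = {"iterate_over_snapshots"}


def validate(program, depth=0, max_depth=8):
    """Return a list of problems found in a program tree (empty if none)."""
    if depth > max_depth:
        return [f"nesting deeper than {max_depth}"]
    if not isinstance(program, list):
        return ["program must be a list of statements"]

    problems = []
    for i, stmt in enumerate(program):
        if not isinstance(stmt, dict):
            problems.append(f"statement {i}: not a dict")
        elif "op" in stmt and "control" in stmt:
            problems.append(f"statement {i}: both 'op' and 'control' given")
        elif "op" in stmt:
            if stmt["op"] not in KNOWN_OPS:
                problems.append(f"statement {i}: unknown op {stmt['op']!r}")
        elif "control" in stmt:
            if stmt["control"] not in KNOWN_CONTROLS:
                problems.append(
                    f"statement {i}: unknown control {stmt['control']!r}")
            # Control statements carry a nested body that is checked too.
            problems.extend(validate(stmt.get("body", []), depth + 1, max_depth))
        else:
            problems.append(f"statement {i}: needs 'op' or 'control'")
    return problems


good = [
    {"control": "iterate_over_snapshots",
     "body": [{"op": "destroy_snapshot"}]},
    {"op": "destroy_filesystem"},
]
bad = [
    {"op": "format_disk"},
    {"control": "iterate_over_snapshots", "body": [{}]},
]

print(validate(good))  # []
print(validate(bad))
```

A program that fails validation would simply never be executed, which keeps a poorly written channel program from touching ZFS state at all.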

If you think ZFS Channel Programs are as exciting and interesting as I do but want a deeper dive into the technical details, I'd recommend taking a look at the project proposal on open-zfs.org: ZFS Channel Programs Proposal. As this is a proposed feature, we encourage any thoughts, comments, or discussion on the developer@open-zfs.org mailing list.