Resurrecting the ZFS test suite

At the recent illumos meetup we commemorated the 10th anniversary of ZFS. During the event, I spoke about how Delphix has been using the ZFS test suite, and our plans for incorporating it into illumos. The talks were filmed, so you can see the video of the ZFS test suite talk if you weren’t able to attend in person. You can also check out the presentations given by Matt, who talked about libzfs_core, and Chris, who spoke about feature flags and backwards compatibility testing with ztest.

For those who are unfamiliar with it, the ZFS test suite is a series of almost 1000 tests designed to uncover bugs and prevent regressions in ZFS. It was open sourced in late ’09, and hasn’t been updated – until now. Since we began using it at Delphix, it has already uncovered several bugs, including a pair of kernel memory leaks, the inability to use privilege delegation for volumes, a panic, and more. Our ultimate goal is to get zfstest back into fighting shape and integrate it into illumos, where more frequent test runs throughout the community can benefit ZFS the most. There was some interest from the audience in the ZFS test suite, so what follows is a quick start guide to adding your own tests to it.

The first order of business is to grab a copy of the source, which is available on github:

$ git clone git://github.com/delphix/zfstest.git
Cloning into zfstest...
...

The build process for the four packages in this repository is relatively straightforward, and documented in the README found at the root of the workspace. Assuming a fully built workspace, the next task is to create a home for our new tests. Because STF (Solaris Test Framework) builds the test suites written for it in addition to running them, very little is required in the way of setup. For the purpose of this example, let’s create a test that verifies zfs list testpool/testdataset exits with a non-zero return value if the dataset does not exist. Many tests along these lines already exist, but let’s assume this is a new class of tests that will live in a directory of its own:

$ cd usr/src/suites/fs/zfs/tests/functional
$ mkdir new_list_tests

STF will recurse into this new directory automatically when building – there’s no need to change any of the build infrastructure in the directories above new_list_tests. The Makefile for your new test consists almost entirely of variables that will be consumed by STF:

  • STF_ROOT_CONFIGURE
    • A script run when stf_configure recurses through this directory when initially configuring the test suite. This should be left blank.
  • STF_ROOT_SETUP
    • A script run prior to each test to put any required test objects in place.
  • STF_ROOT_CLEANUP
    • A script run after each test to destroy any objects created by setup or the test itself.
  • STF_ROOT_TESTCASES
    • A list of the tests in this directory.
  • STF_ENVFILES
    • A configuration file with environment variables available to the tests in this directory.
  • STF_INCLUDES
    • A shell library with functions available to the tests. (The tests needn’t source this file directly.)
  • STF_DONTBUILDMODES
    • This variable tells STF whether or not it should build the tests in various (e.g. 32 and 64 bit) modes. Since these are shell scripts, this should be set to ‘true’.

The last line of the Makefile will include a stock Makefile provided by STF. All of the variables above that begin with STF_ROOT have analogs in the form of STF_USER; the difference is the uid under which the test process will run. Since this test should run as root, the end result of your Makefile will look something like this:

STF_USER_CONFIGURE=
STF_ROOT_CONFIGURE=
STF_ROOT_SETUP=setup
STF_USER_SETUP=
STF_ROOT_CLEANUP=cleanup
STF_USER_CLEANUP=
STF_ROOT_TESTCASES=list_test_neg
STF_USER_TESTCASES=
STF_ENVFILES=
STF_INCLUDES=
STF_DONTBUILDMODES=true
include $(STF_TOOLS)/Makefiles/Makefile.master
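
Before building, it can be handy to check that every script named in the Makefile actually exists. The following is a minimal sketch, not part of STF: missing_scripts is a hypothetical helper, and the mapping from a name such as setup to a file setup.ksh is an assumption based on the file names used in this post.

```shell
#!/bin/sh
# Hypothetical helper (not part of STF): print the scripts named in an
# STF-style Makefile that have no matching .ksh file in the same directory.
# Assumption: a name such as "setup" corresponds to a file "setup.ksh".
missing_scripts() {
	dir=$1
	# Pull out the values of the *_SETUP, *_CLEANUP and *_TESTCASES variables.
	names=$(sed -n \
	    -e 's/^STF_[A-Z]*_SETUP=//p' \
	    -e 's/^STF_[A-Z]*_CLEANUP=//p' \
	    -e 's/^STF_[A-Z]*_TESTCASES=//p' \
	    "$dir/Makefile")
	# Empty values simply disappear during word splitting.
	for name in $names; do
		[ -f "$dir/$name.ksh" ] || echo "$name.ksh"
	done
}
```

Running missing_scripts new_list_tests from the functional directory would then list any script that still needs to be written.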

The implementation of setup.ksh and cleanup.ksh will be left to the reader, but the setup script should create the pool if it doesn’t exist and destroy the FS if it does. The cleanup script should ensure the system is returned to the same state it was in prior to running setup. Most of the config, setup and cleanup scripts found throughout the suite are very similar, so check out the other test directories for inspiration.
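
To make the setup logic concrete, here is a rough sketch of it in plain shell. It is not taken from the suite: sketch_setup and the DISK variable are hypothetical, the defaults are placeholders, and a real setup.ksh would more likely lean on the helpers in libtest.kshlib and the logging functions instead of calling the commands directly.

```shell
#!/bin/sh
# Sketch only, not code from the suite. ZPOOL, ZFS, TESTPOOL and TESTFS are
# assumed to come from the test environment; the defaults are placeholders.
# DISK is a hypothetical variable naming the device to build the pool on.
ZPOOL=${ZPOOL:-zpool}
ZFS=${ZFS:-zfs}
TESTPOOL=${TESTPOOL:-testpool}
TESTFS=${TESTFS:-testfs}

# Make sure the pool exists and the test filesystem does not.
sketch_setup() {
	# "zpool list <pool>" exits non-zero when the pool is missing.
	if ! $ZPOOL list "$TESTPOOL" >/dev/null 2>&1; then
		$ZPOOL create "$TESTPOOL" "$DISK" || return 1
	fi
	# The negative test needs the filesystem to be absent.
	if $ZFS list "$TESTPOOL/$TESTFS" >/dev/null 2>&1; then
		$ZFS destroy "$TESTPOOL/$TESTFS" || return 1
	fi
	return 0
}
```

A matching cleanup would undo whatever setup changed, for example destroying the pool only if setup was the one that created it.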

With the environment, setup and teardown out of the way, all that remains is the test itself. A stripped down version of list_test_neg.ksh might look like this:

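#!/usr/bin/ksh -p
# Pull in the suite's shared test library.
. $STF_SUITE/include/libtest.kshlib
# State what the test is meant to prove.
log_assert "Listing a non-existent FS fails"
log_note "Attempting to list an FS which does not exist"
# The command must exit non-zero for the test to continue.
log_mustnot $ZFS list $TESTPOOL/$TESTFS
log_pass "Listing a non-existent FS fails"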

The log_assert and log_note shell functions above are informational. log_must and its sibling log_mustnot log the output of the command in question and verify its exit status: log_must expects the command to succeed, while log_mustnot expects it to fail. The last function marks the test as passed in the log. All of these functions can be seen in logapi.kshlib. Simply run stf_build package, install your new package, and run your new test. If stf_configure was previously run, it will need to be run again to tell STF about the new test.
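
If the must/mustnot pattern is new to you, the idea can be sketched in a few lines of plain shell. These are simplified stand-ins, not the code in logapi.kshlib: the sketch_ names and the message format are made up for illustration, and the functions return a status where the real ones would fail the test.

```shell
#!/bin/sh
# Simplified stand-ins for illustration; NOT the logapi.kshlib code.
# The sketch_ prefix marks these names as hypothetical.

# Run a command that is expected to succeed; log it and report the result.
sketch_log_must() {
	if output=$("$@" 2>&1); then
		echo "SUCCESS: $*"
		return 0
	else
		status=$?
		echo "ERROR: $* exited $status"
		if [ -n "$output" ]; then echo "$output"; fi
		return 1
	fi
}

# Run a command that is expected to fail; success here is the error case.
sketch_log_mustnot() {
	if output=$("$@" 2>&1); then
		echo "ERROR: $* unexpectedly succeeded"
		if [ -n "$output" ]; then echo "$output"; fi
		return 1
	else
		echo "SUCCESS: $* failed as expected"
		return 0
	fi
}
```

For example, sketch_log_mustnot ls /no/such/path reports success because ls fails, which is the same shape as the negative test above.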

When we first ran the test suite, there were 200+ failures, a number we’ve since winnowed down to 65. Now that you know how to make your own additions to zfstest, why not pick up a hammer? The tests are small, and fixing them is a great way to make a contribution to the future of ZFS and illumos. For the motivated, there’s a current list of known test failures on the wiki associated with the github repository. If you run into problems, you can always drop a line to me and/or the illumos community ZFS list.