Blog

DelphixBench 1.0

DelphixBench is an internal benchmarking framework developed to measure performance of various workflows in Delphix.

Introduction

DelphixBench is an internal benchmarking framework for measuring the performance of various workflows in Delphix. It is designed to produce repeatable results in a consistent manner, and it focuses on the workflows most relevant to our customers so that the benchmark metrics are representative of real customer experience.

Motivation

Various tools and micro-benchmarks are available to measure different aspects of the Delphix ecosystem. For example, FIO can be used to measure I/O and file-system performance. Other such tools exist today, but they tend not to reflect our customers’ real-world performance, either because they are micro-benchmarks or because they are irrelevant macro-benchmarks. DelphixBench is an attempt to bridge that gap. DelphixBench also addresses the following needs:

  1. A comprehensive framework for measuring the performance of workflows relevant to our customers.
  2. A way to ensure a continued uptrend in workflow performance.
  3. An efficient mechanism to evaluate performance trade-offs.
  4. A tool to showcase our performance.

Benchmark Workload

DelphixBench is built on Swingbench, developed by Dominic Giles. DelphixBench uses the Order Entry schema from Swingbench, with the following three workload types:

  1. OLTP
     • The standard transaction-processing style workload, which includes a mix of Browse, Query, and Update transactions.
     • This workload mimics TPC-C.
  2. Read
     • A custom workload developed to stress the caching and prefetching aspects of the Delphix ecosystem.
     • The majority of transactions in this workload are Query and Browse.
     • This workload is sensitive to datafile read bandwidth and latency.
  3. Write
     • A custom workload developed on top of the Order Entry workload.
     • The majority of transactions are “New Order,” which update the Order tables.
     • This workload generates a large amount of redo/log traffic and is sensitive to redo/log write latency.

These three workloads are run against three different databases, resulting in nine benchmarks. All databases use the Order Entry schema but are 1GB, 10GB, and 60GB in size. The three databases (SOE1G, SOE10G, and SOE60G) are populated a priori, and every benchmark run starts from a pre-defined snapshot of the database.
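The resulting benchmark matrix is a simple cross product of workloads and databases. As a small illustrative sketch (the helper below is not part of DelphixBench; only the workload and database names come from the description above):

```python
from itertools import product

# Three workload types and three Order Entry databases, as described above.
WORKLOADS = ["OLTP", "READ", "WRITE"]
DATABASES = ["SOE1G", "SOE10G", "SOE60G"]

def benchmark_matrix():
    """Enumerate all workload/database combinations (3 x 3 = 9 benchmarks)."""
    return [f"{workload}-{database}"
            for workload, database in product(WORKLOADS, DATABASES)]
```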

Benchmark Operation

The benchmark run is completely automated through the internal “blackbox” framework. Users can either choose one of the existing Delphix appliances to run the benchmark against, or point to their own appliance.

The benchmark operation can either set up Delphix from scratch or use an existing stack. The following steps are carried out for each test case.

  1. The source databases are restored from their snapshots and “linked” to the Delphix server as dSources.
  2. For each of these dSources, 12 Virtual Databases (VDBs) are provisioned onto two target systems. Twelve VDBs are used to generate enough load on the server to ensure consistent results.
  3. Once the VDBs are provisioned, the three types of load are run against them, measuring Transactions Per Second.
  4. Other metrics are measured from different workflows, as described in Benchmark Metrics.
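The per-test-case flow above can be sketched as a small driver loop. All the helper names here are hypothetical placeholders for the blackbox automation, not real DelphixBench APIs; only the step ordering comes from the description above:

```python
# Hypothetical sketch of the per-test-case benchmark flow. None of these
# helpers are real DelphixBench/blackbox APIs; they only illustrate ordering.

VDBS_PER_SOURCE = 12  # enough parallel load to ensure consistent results

def run_test_case(source_db, workload, link, provision, run_workload):
    """Link a source, provision VDBs onto two targets, then drive the load."""
    dsource = link(source_db)                      # step 1: snapshot -> dSource
    vdbs = [provision(dsource, target=i % 2)       # step 2: 12 VDBs, 2 targets
            for i in range(VDBS_PER_SOURCE)]
    return [run_workload(vdb, workload)            # step 3: measure TPS
            for vdb in vdbs]
```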

Benchmark Metrics

Benchmark metrics are defined by measuring typical workflows performed on the Delphix appliance by our customers. Emphasis is placed on those operations which provide the most value to customers.

Linking Performance

The Delphix server allows users to ‘link’ their databases in order to then create/provision virtual copies efficiently with minimal overhead. The time it takes to link a fresh database as a new dSource is used as the metric of linking performance. This time depends closely on the amount of data that must be consumed, which includes the size of the database plus the redo/log data. The metric therefore reports the total time to link the database, normalized by the size of the database plus the log activity since the snapshot. This metric is reported for all three databases.
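The normalized linking metric described above can be sketched as follows. The function name and the seconds-per-GB units are illustrative assumptions, not DelphixBench code:

```python
def linking_metric(link_seconds, db_size_gb, log_size_gb):
    """Linking time normalized by total data consumed (database + redo/log).

    Returns seconds per GB; lower is better. The units are an assumption
    for illustration -- the real benchmark may report something different.
    """
    total_gb = db_size_gb + log_size_gb
    if total_gb <= 0:
        raise ValueError("total data size must be positive")
    return link_seconds / total_gb

# e.g. linking a 10GB database with 2GB of log activity in 600 seconds
# yields 50.0 seconds per GB.
```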

Provision Performance

This benchmark operation provisions 12 VDBs in parallel for each source database; all the VDBs are provisioned asynchronously. The time to finish provisioning all the VDBs is measured and normalized by the number of VDBs. Provision time typically depends on the amount of online log data in the source database at the time of provisioning. Since this is constant across all the databases in the benchmark, absolute time to provision is used as the metric. Absolute time also emphasizes that provision time is independent of database size and workload type.

VDB Performance

Once the VDBs are provisioned, three different workloads (OLTP, READ, and WRITE) are run against them, each using a separate set of VDBs. The performance metric evaluated here is the average Transactions Per Second (TPS), as measured by Swingbench. TPS is a popular metric for OLTP-style workloads, and each run produces a separate TPS rating.
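As a sketch, an average-TPS figure can be derived from the transaction counts of a run. The aggregation shown (summing per-VDB counts over the run duration) is an illustrative assumption; Swingbench reports its own TPS figure:

```python
def average_tps(transaction_counts, duration_seconds):
    """Average Transactions Per Second for one run.

    transaction_counts: transactions completed by each VDB during the run.
    Summing across VDBs and dividing by run duration is an assumption for
    illustration, not necessarily how Swingbench computes its rating.
    """
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return sum(transaction_counts) / duration_seconds
```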

Replication Performance

The replication service enables continuous replication of user data from one Delphix server (the source) onto another (the target). The target Delphix server can serve as a backup in case of catastrophic failure. The benchmark operation measures both full and incremental replication: at the end of the first benchmark run, the Delphix server holding the source database and its 12 VDBs is replicated onto the target server. After every subsequent test, more data sources are added and the server is replicated again; these subsequent replications are all incremental. The benchmark operation measures and reports the results of all replications.

Conclusion

DelphixBench is designed to measure the performance of the operations our customers typically perform on the Delphix server. We used a popular Oracle benchmark as the basis for the workload to ensure representative results, and the metrics are defined to be representative of the performance our customers care about. As newer features are added, we will incorporate their workflows into the benchmark.