Platform

A High Performance Architecture for Virtual Databases

Delphix completely changes the way companies do business by removing critical bottlenecks in development productivity.

Delphix completely changes the way companies do business by removing critical bottlenecks in development productivity. Gone are the days when a simple database clone operation from production to development takes hours or even days. With Delphix database virtualization, cloning of Oracle databases becomes a fast and practically free operation. Additionally, infrastructure costs can be drastically reduced due to decreased disk utilization in your cloned environments; in fact, the more environments you deploy, the more cost savings you will realize. The old adage is that your options are Fast, Good, and Cheap; pick two of three. But as a recent performance study between Delphix and IBM shows, all three options are within reach.

  • Fast database deployments due to Delphix database virtualization technology

  • High performance results through shared block caching in Delphix memory

  • Reduced costs through compression, block mapping, and other Delphix core capabilities

Delphix and IBM partnered to design a high performance virtual database implementation that could be used for reporting, development, quality assurance (QA), and user acceptance testing (UAT) at reasonable costs. In the end, the research shows that the virtual database environments provisioned with Delphix can achieve strong performance levels and scalability suitable for any organization. The average organization provisions multiple copies of a single production database which can include multiple copies of development, QA, UAT, stress testing, maintenance QA, and others. Each of these copies requires time to build and provision and the end result is several identical instances with their own copies of largely identical data blocks across several shared memory regions. Delphix achieves a great reduction is disk usage by only saving identical blocks to disk once. But more importantly, Delphix also only caches unique data blocks in memory. This means that read performance across multiple systems that are provisioned from a single source can show dramatic improvements and scalability.

Purpose of the Tests

Delphix is typically used to create non-production environments from production sources. With a Delphix powered infrastructure you can:

  • Enable high performance virtualized databases for reporting, development, and QA which improves productivity and decreases bottlenecks in deployment schedules

  • Dramatically reduce the cost of non-production environments

  • Reduce the impact of cloning and/or reporting on the production database system

  • Reduce the occurrence of application errors due to inadequate testing on out of date or partially populated development and QA systems

By design, Delphix has a small performance overhead because the I/O is supplied not over a dedicated fiber channel connection to storage but over NFS; however, by properly architecting the Delphix infrastructure, environments can be designed that actually run faster than physical deployments. On top of that, the environments come at a lower cost and as you will see from the research, performance actually gets better as the number of users increases.

Test Goals

In the tests run by IBM and Delphix, physical and virtual databases were tested under several scenarios to identify the effective performance benefits and how best to architect the Delphix server to maximize virtual database performance. The goal of the test was to both determine optimal configuration settings for the Delphix infrastructure and to show optimal I/O throughput on virtualized databases compared to physical database environments.

Test Environment

In order to properly test physical and Delphix virtual database environments, tests were performed concurrently at IBM Labs in Research Triangle Park, NC and Delphix Labs in Menlo Park, CA. Both environments used the same configuration and server hardware. Generally a standard physical Oracle deployment will involve a database host that connects directly to a SAN or other storage device, typically over fiber. As such the physical database test performed by IBM used a direct connection to SAN. For the Delphix tests however, the SAN is connected directly to the Delphix host via fiber and the Oracle host is connected to Delphix via NFS. This allows Delphix to act as a layer between the Oracle database server and the storage device.

high performance architecture 1

While this extra I/O layer may seem counterproductive, Delphix actually acts as a caching tier (in addition to its many other virtues). The presence of a properly configured Delphix server augments the storage subsystem and improves I/O performance on read requests from the Oracle host. The bigger and 'warmer' the cache, the greater the performance gain. And if SSD is used for your dedicated disk area, you can get more money out of your investment thanks to the single-block sharing built into Delphix.

Storage Configuration

For the purposes of this test, both the physical and virtual database testing environments used the same SAN. The SAN configuration consisted of a 5-disk stripe of 10K RPM spindles. Two 200GB LUNs were cut from the stripe set.

IBM Hardware

One of the main goals in this joint test with IBM was to find an optimal configuration for Delphix on hardware that could provide flexibility and power at an attainable price. To this end we chose the IBM x3690 X5 with Intel Xeon E7 chips; a system that is reasonably priced, powerful, and supports up to 2TB memory. Delphix does not necessarily require a large amount of memory, but the more there is available the better Delphix and your virtualized databases will scale with extremely fast response times. The test x3690 servers were configured with 256GB RAM. VMware ESX 5.1 was installed on the systems to serve as hypervisor for the virtual environments. The Delphix VM itself (did I mention Delphix can run on as a virtual guest?) was configured with 192GB RAM. Additionally, two Linux guests were created and configured with 20GB each to act as a source and target system respectively.

high performance architecture 2

In this configuration, the source database is the initial provider of data to Delphix. Following the initial instantiation, change data is incorporated into Delphix to keep the environment up to date and for point-in-time deployment purposes. The target system connects to Delphix via NFS mounts and can have a virtualized database provisioned to it at any time.

Load Configuration with Swingbench

Swingbench (a free database loading utility found at http://www.dominicgiles.com/swingbench.html) was used to load and measure transactions per minute (TPM) on the database host. 60GB data sets were used to populate the source Oracle database that filled a 180GB datafile inside the DB. Tests were run using standard OLTP Think Times. User load was varied for more realistic testing between 1 and 60 concurrent users. Each test ran for a 60 second window.

Database Configuration

The Swingbench database served as a test bed for the physical database servers and as the source database for Delphix virtual database provisioning. Delphix is capable of linking to multiple source databases. In this test, it connected and linked to a Swingbench source database. Once the initial link was made a virtual database was provisioned (extremely quickly) on the target database host. The Oracle instance on both physical and virtual systems were set up with a 100MB buffer cache, a fact that is sure to make some administrators cringe with fear. But the buffer cache was intentionally set to a small size to emphasize the impact of database reads. A large cache would not show improvement to database reads, but simply that a cache was in use. In order to show the true power Delphix brings to your I/O configuration, a small cache on the target database instance shows the work that the benchmark has performed at the read level. As such, I/O latency becomes the key factor in benchmark performance. Caching at the virtualization layer will produce good performance, but uncached data could result in very poor I/O results.

Network Configuration

In the virtual database environment provisioned by Delphix, the network is just as important as storage due to the way the target Oracle host uses NFS to the Delphix server's attached storage. The performance of NFS and TCP has a direct impact on the I/O latency of virtualized databases. To reduce the I/O latency and increase I/O bandwidth in the virtual database test, the Oracle Linux guest used for the target is on the same ESX server as Delphix. By keeping the Delphix tier and target system on the same hardware and VM infrastructure NFS and TCP communication can be done without additional physical network latency by running communications through a NIC. This architecture is known as Pod Architecture and it eliminates typical issues with networks such as suboptimal NICs, routers, or congestion. In the physical environment the network configuration was not important as processes and communications were all performed locally on the database host and storage on the SAN was accessed via fiber.

High Level Results

All tests were run twice - once with a completely cold cache (0 data in cache) and once with a warm cache (the full dataset has been touched and a working set of data in cache has been established). With a load of 10 users, the throughput measured in TPM was as follows:

  • Physical Database

    • Cold Cache - 1800 TPM

    • Warm Cache - 1800 TPM

  • Virtual Database

    • Cold Cache - 2100 TPM

    • Warm Cache - 4000 TPM

The implications of these tests are absolutely astounding. Not only was Delphix capable of provisioning a target database host in a matter of minutes, but also the virtual database outperformed the physical counterpart with both a cold and warm cache. With nothing in cache, the virtualized database performed very well, and additional caching increased performance dramatically while on the physical database it remained stagnant.

Detailed Results OLTP Tests

Performance improving by virtue of an increasingly warm cache is great, but there was much more the test revealed. During the course of the tests, it was found that an increased number of concurrent users (generally seen as a detriment to performance) improved performance even further. This level of scalability goes beyond standard environments where things like high levels of concurrency can bog down the database environment. Instead, higher concurrency improved throughput for scaling that comes with an exponential performance improvement. With more than five users on a cold cache and testing 1-60 users on a warm cache, the virtual databases outperform the physical counterpart. Cold Cache TPM Results by # of users:

high performance architecture 3

Number of users

Warm Cache TPM Results by number users:

high performance architecture 4

Number of users

Increased users (concurrency) brought greater performance benefits to the virtual database environment. Instead of degraded performance  as the number of concurrent users rose the tests conclusively showed dramatic improvements as more users were added to the testing process. As configured with a warm cache and 60 concurrent users, Delphix showed a TPM six times greater than the physical databases.

OLTP Load on Multiple Virtual Databases

In order to test the impact of multiple virtual databases sharing a single cache, two virtual databases were provisioned from the same production source. Tests were run similar to the previous testing exercise against:

  • A single physical database

  • Two concurrent physical databases using the same storage

  • A single virtual database

  • Two virtual databases using the same storage

In this test, we measured both the throughput in TPM and the average I/O latency. TPM vs. Single Block I/O Latency by number of users:

high performance architecture 5

Number of users

As you can see from these tests, as the number of a) database instances and b) concurrent users increased, latency degraded exponentially on the physical environment. As concurrency at the I/O layer rises, latency becomes a huge bottleneck in efficient block fetches resulting in flat or decreasing TPM. On the other hand, the Delphix environment flourished under increased load. Thanks to shared block caching, more users and more virtual databases meant dramatically increased TPM because of the lack of latency due to shared blocks. Just as in the previous test, the scalability of Delphix is entirely unexpected based on traditional Oracle scaling; as systems and users rise the environment as a whole not only scales, but also scales exponentially.Performance measured in seconds of a full table scan on the ORDERS table:

high performance architecture 6

Seconds

It is worth clarifying here that the impacts seen with caching on Delphix for a single database can of course be attained in any environment by caching more on the database host with an increased buffer cache. However, if there are multiple copies of the database running (remember the need for Dev, QA, Stress, UAT, etc.) there will be no benefit on a physical environment due to the inability of Oracle to share cache data. In a Delphix environment this problem disappears due to its dual function as a provisioning tool and I/O caching layer. Blocks in cache on one virtual database target system will be shared in the Delphix shared cache between all virtual databases. To show the impact of this functionality, throughput tests were run against two different virtual database targets. First against Virtual DB 1 with a cold cache, then Virtual DB 1 with a warm cache, following by Virtual DB 2 with a  cold cache and Virtual DB 2 with a warm cache. Performance measured in seconds of a full table scan  on customers, orders and order_items

high performance architecture 7

Seconds

Three queries were used for this test, all performing full table scans against different tables in each virtual database. Each query was run as previously described (Virtual DB 1 cold, Virtual DB 1 warm, Virtual DB 2 cold, Virtual DB 2 warm). The purpose of this test was to simulate standard Decision Support System (DSS) type queries across multiple virtual databases to show the effects of block caching in Delphix. We see that by warming the cache on Virtual DB 1 the query time dropped dramatically (which is to be expected). However, it is also clear that running the query on Virtual DB 2 with a cold cache retained the same caching benefit. Because Virtual DB 1 warmed the cache, Virtual DB 2 was able to utilize the fast I/O of cached data without any pre-warming. This behavior is the core of the exceptional results in the previous TPM tests when a higher user load and target database count was introduced.

Maximizing Resources and Performance

We have all been required at some point to create multiple databases on a single host, and each time it is difficult to decide exactly how to set each SGA to make the best use of RAM on the server. Make a single database instance's SGA too large and you take away critical assets from the other databases on the host. Make all of the instance's SGAs too large and you can seriously hurt the performance of all databases mounted by instances on that server. RAM is an expensive resource to use; not necessarily in terms of cost, but in terms of importance to the system and limited availability. By sharing the I/O cache between virtual database clones of the same source via Delphix, the RAM usage on the server is optimized in a way that's simply not possible on a physical database configuration. Delphix removes the RAM wall that exists in a physical database environment. Multiple instances are able to share data blocks for improved read times and increased concurrency capabilities. While it is possible to share caching on a SAN in a physical database configuration, remember that each database copy will use different cached blocks on the SAN. Additionally, SAN memory is far more expensive than memory on standard commodity x86 hardware. For example, the cost of 1GB RAM on an x86 server is around $30. On an EMC VMAX the same 1GB RAM will cost over $900.*  And that SAN caching will not carry with it all the additional provisioning benefits that Delphix brings to the table. Even though the tests were constructed to show maximum performance improvements due to a well-architected Delphix infrastructure, the impact of nearly any Delphix deployment will be dramatic. Slower hardware, smaller cache, or other factors can contribute to a less optimal architecture but the principle benefits remain. The actual minimum requirement for the Delphix host is 16GB RAM, but the average size among customers is 64GB. This is obviously not a small size but it is dramatically smaller than the rest of the test. In real use cases of Delphix at our customers' sites, we have found that on average 60% of all database I/O block requests to  Delphix are satisfied  by Delphix cache. This means that 60% of customer queries against their development, QA, reporting, and other environments provisioned via Delphix never have to touch the SAN at all. This relieves bottlenecks at the SAN level and improves latency for the I/O that actually does need to occur on disk.   *  http://www.emc.com/collateral/emcwsca/master-price-list.pdf Price obtained on pages 897-898: Storage engine for VMAX 40k with 256 GB RAM is ~$393,000 Storage engine for VMAX 40k with 48 GB RAM is ~$200,000, thus 256GB - 48GB = 208GB  and $393,000 - $200,000 = $193,000, So the cost of RAM here is $193,000 / 208GB = $927/GB.

Summary

In nearly every test, Delphix outperformed the traditional physical database, a shift that flies in the face of every 'performance scaling' fact we have held as true until this point. By breaking outside of the normal optimization configuration (physical database with cache connected to SAN with cache), we are introducing a multipurpose layer which provides incredible shared caching benefits beyond any that can be found on a solitary database host. Additionally, the more we threw at the Delphix targets the better it got. More databases, more concurrent users, all came back with better than linear scaling; dramatic gains were common as more work was required of the Delphix server. By implementing Delphix, the IBM x3690 we used for testing was capable of so much more than it could normally handle with the added benefit of cheap RAM as the cache layer and incredibly fast provisioning to boot. The architecture as a whole was significantly cheaper than a robust SAN caching configuration on purpose-built hardware while performing and scaling with dramatic improvements.

Other reading

The Delphix documentation is online at http://docs.delphix.com

Original article is at: http://dboptimizer.com/2013/04/03/a-high-performance-architecture-for-virtual-databases/