High Performance OLTP Applications in Cloud: Doing more with less
Last year, for Oracle Open World 2013, I published a study demonstrating that organizations can achieve a 10x improvement in price/performance for their Online Transaction Processing (OLTP) applications with Delphix. I ran 26 Virtual Databases (VDBs), each at 35,000 transactions per minute, for an aggregate load of 910,000 transactions per minute from a single Delphix Engine. I stopped at 26 VDBs only because I ran out of CPU horsepower (servers) in my lab to run more.
A few months ago, we released an AWS port for Delphix. With a fully functional lab in AWS, I now had an infinite supply of compute resources at my disposal--for a price, of course.
The fact that I can add a large amount of compute, for a short duration of time, without paying a large upfront cost, was irresistible. I wanted to re-try the experiment from last year, to see if I could bump the score up a little bit.
From an earlier study in AWS, I know that, for running high performance DB applications, it is critical to have an adequate number of IOPS provisioned for the storage. So if I were to re-run my study and drive a million transactions per minute through a Delphix Engine (DE) in AWS, it was going to cost a lot of money in storage and Provisioned IOPS (almost 70,000 of them).
Doing more with less
Delphix is storage agnostic, but we need the underlying storage to have the capacity to sustain the aggregate load of all the Databases expected to be virtualized. We constantly invest in making Delphix more efficient in its usage of resources to handle instances when storage is less than satisfactory. Delphix caching technology is one such feature that allows Delphix to do more with less.
In last year's experiment, I used an all-flash array as the backend storage for Delphix. I wanted to illustrate Delphix's ability to exploit the full potential of underlying storage and sustain high IOPS. For the current experiment, I want to showcase our caching, to sustain high IOPS from VDBs even when the backend storage can only sustain a fraction of those IOPS. But naive caching is not enough. The aggregate working set of a number of large databases would be prohibitively expensive to cache in memory.
Another key technology Delphix employs to make the cache efficient is In-Memory Virtualization: the cache stores a single copy of a block even if multiple VDBs are reading it. In an earlier blog post, I showed how In-Memory Virtualization helps save costs on the total IOPS required to support load from DB applications. There are two key benefits of this feature:
- When servicing load from the Delphix cache, I would need to purchase only a fraction of the IOPS needed for my Applications.
- Running IO out of the Delphix cache costs less Delphix CPU compared to running the same load out of the backend storage.
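To make the single-copy idea concrete, here is a minimal sketch of a block cache shared by several VDBs. It is an illustration only, not Delphix's implementation: the class names `SharedBlockCache` and `VirtualDatabase`, the LRU eviction policy, and the block counts are all assumptions I made up for the example.

```python
from collections import OrderedDict


class SharedBlockCache:
    """Hypothetical LRU cache keyed by physical block, shared by all VDBs."""

    def __init__(self, capacity_blocks, storage):
        self.capacity = capacity_blocks
        self.storage = storage          # stand-in for the backend storage
        self.blocks = OrderedDict()     # physical block -> data, in LRU order
        self.hits = 0
        self.misses = 0

    def read(self, physical_block):
        if physical_block in self.blocks:
            # Served from memory: no backend IO is issued.
            self.blocks.move_to_end(physical_block)
            self.hits += 1
            return self.blocks[physical_block]
        # Cache miss: one backend read, then keep a single copy in memory.
        self.misses += 1
        data = self.storage[physical_block]
        self.blocks[physical_block] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return data


class VirtualDatabase:
    """Hypothetical VDB: maps its logical blocks onto shared physical blocks."""

    def __init__(self, name, block_map, cache):
        self.name = name
        self.block_map = block_map
        self.cache = cache

    def read(self, logical_block):
        return self.cache.read(self.block_map[logical_block])


def demo(num_vdbs=26, hot_blocks=10):
    """Have every VDB read the same hot blocks; return (hits, misses)."""
    storage = {n: f"block-{n}" for n in range(100)}
    cache = SharedBlockCache(capacity_blocks=64, storage=storage)
    shared_map = {n: n for n in range(100)}  # unmodified clones share every block
    vdbs = [VirtualDatabase(f"vdb{i}", shared_map, cache) for i in range(num_vdbs)]
    for vdb in vdbs:
        for block in range(hot_blocks):
            vdb.read(block)
    return cache.hits, cache.misses
```

In this toy setup, 26 VDBs reading the same 10 blocks generate 260 reads, but only the first 10 reach the backend; the other 250 are served from the single cached copy. A cache keyed per VDB would instead hold 26 copies of each block and miss 260 times.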
For this experiment I used an EC2 instance with 32 CPUs and restricted the Delphix cache to 64GB. I observed that 64GB is sufficient to handle 80% of the working set. I also wanted to keep a reasonable ratio of Delphix cache to DB size, so the results can translate to larger applications. This tradeoff allows me to sustain 80% of reads from the cache, letting the backend storage handle the rest.
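The arithmetic behind this tradeoff can be sketched as follows. The function name `backend_read_iops` and the 10,000 reads per second figure are hypothetical, chosen only to show the calculation; they are not measurements from this experiment.

```python
def backend_read_iops(total_read_iops, cache_hit_ratio):
    """Estimate the read IOPS that still reach backend storage.

    total_read_iops: reads per second issued by all VDBs (hypothetical input).
    cache_hit_ratio: fraction of reads served from cache, between 0 and 1.
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    # Only cache misses turn into backend reads.
    return round(total_read_iops * (1.0 - cache_hit_ratio))


# Assumed workload: 10,000 reads per second with an 80% cache hit ratio.
remaining = backend_read_iops(10_000, 0.8)
```

With these assumed inputs, `remaining` is 2,000: the backend only needs to be provisioned for the 20% of reads that miss the cache. Writes are not covered by this estimate.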
Having migrated all my on-premise lab environments onto AWS, I had a 1TB OLTP DB ready to go. All I needed was to create 26 copies of that DB to drive load onto Delphix. If you thought cloning a 1TB Oracle database was difficult, try cloning 26 of them synced to the same point in time. Frankly, I would not know how to even begin to do that. Without Delphix, even in AWS, for each new DB, this task involves several steps:
- Clone a new EC2 Instance
- Create Elastic Block Store (EBS) volumes for the new instance, attach them to the instance, and pre-warm them
- Clone the DB onto the new instance
- Apply logs on the new DB to sync to a particular point in time
Within a few hours of using Delphix on AWS, I was able to stand up 26 VDBs, all provisioned to a single point in time. The task was as simple as cloning an EC2 instance and provisioning a VDB onto it. It takes mere minutes per new VDB. With Delphix, I do not need to create more storage or perform any storage management, nor do I need any Oracle magic to apply logs and check the data is synced to the same point, etc. The video included with this blog illustrates the speed with which I was able to stand up all the environments and have them ready to run load.
Scaling Transaction Rate
Once I started driving load through all 26 VDBs and tuned it, things got really interesting. As expected, CPU utilization on Delphix was lower on this run than last year: 80% of the load was serviced from the Delphix cache. Since I had spare capacity on the Delphix Engine, I added 4 more VDBs. Even with 30 VDBs running OLTP load, around 30,000 transactions per minute, there was almost 50% spare capacity on network and CPU on the Delphix Engine.
The bottleneck was the CPU on the Oracle nodes, a resource I can easily scale up in AWS. Having surpassed the performance from last year's experiment, I looked at the costs of this setup. Last year we demonstrated that, for high performance OLTP applications that require a large number of IOPS, traditional spinning disk storage becomes prohibitively expensive. Delphix reduces the total storage footprint of the workload through thin provisioning of the VDBs and makes it feasible for an all-flash array to service the workload. We demonstrated that this combination of Delphix and flash storage produces a 10x improvement in price/performance.
For the cloud installment of this study, I wanted to further reduce the cost and possibly increase the load. Taking advantage of the Delphix cache and In-Memory virtualization technologies, I was able to further reduce the need for IOPS by 7x. This resulted in a 15x improvement in price/performance compared to the traditional setup and 5x improvement compared to an on-premise setup using an all-flash array. Note that, even with this load, I had almost 50% spare capacity on the Delphix Engine.
With Delphix running in AWS, I got an opportunity to re-try a massive and complex experiment from last year. I was able to demonstrate that organizations that rely on OLTP applications can achieve performance similar to their on-premise environments using Delphix in the cloud.
But the benefits do not stop there. We also saw that, using Delphix caching and In-Memory Virtualization technologies, organizations can see a 15x improvement in price/performance for their OLTP installations in the cloud. The key benefits come from reduced reliance on expensive SLAs from AWS, in this instance Provisioned IOPS.