SSD Performance at AWS S3 Costs


AWS S3 offers eleven nines (99.999999999%) of durability, is effectively unlimited in capacity, delivers excellent throughput, is very inexpensive, and is well suited to cloud backup. AWS EBS flash storage, on the other hand, delivers high IOPS but is 8x more expensive than AWS S3 IAS (the S3 Standard-Infrequent Access storage class). In a perfect world, you would want the economics of AWS S3 with the IOPS of AWS EBS SSD.

This post describes an architecture that lowers the Recovery Point Objective (RPO), Recovery Time Objective (RTO), and Total Cost of Ownership (TCO) while delivering high performance on cloud object storage. While this article uses AWS as an example, the architecture also applies to Azure, GCP, and IBM clouds.

Low RPO with Cloud Backup to AWS S3

Many cloud backup solutions force a full backup, especially for databases, when storing backups in cloud object storage such as AWS S3. Full backups take longer and so run less often, which increases the RPO. You can achieve a low RPO if backups are performed in an incremental-forever manner across a wide variety of data sources, including mission-critical databases such as SAP HANA, Oracle, SQL Server, MySQL, PostgreSQL, and Db2. These incremental-forever backups can run anywhere from every hour to every 24 hours and need to be stored in scalable, inexpensive storage such as AWS S3 IAS. For databases, archive or transaction logs also need to be backed up to AWS S3 IAS between the incremental backups, which brings the RPO down to as low as 15 minutes.
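To make the incremental-forever idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular product works: an in-memory dictionary stands in for the S3 bucket, blocks are identified by their SHA-256 hash, and the names `incremental_backup` and `restore` are hypothetical. Real solutions typically rely on changed-block tracking rather than rehashing everything.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size, for illustration only


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut the source data into fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def incremental_backup(data: bytes, store: dict[str, bytes]) -> tuple[list[str], int]:
    """Upload only blocks the store does not hold yet.

    `store` is a stand-in for an object storage bucket (an assumption).
    Returns the manifest for this point in time and the number of blocks
    that actually had to be uploaded.
    """
    manifest = []
    uploaded = 0
    for block in split_blocks(data):
        key = hashlib.sha256(block).hexdigest()
        if key not in store:
            store[key] = block
            uploaded += 1
        manifest.append(key)
    return manifest, uploaded


def restore(manifest: list[str], store: dict[str, bytes]) -> bytes:
    """Rebuild a full point-in-time image from its manifest."""
    return b"".join(store[key] for key in manifest)


store: dict[str, bytes] = {}
first, uploaded_first = incremental_backup(b"AAAABBBBCCCC", store)
second, uploaded_second = incremental_backup(b"AAAAXXXXCCCC", store)
print(uploaded_first, uploaded_second)
```

After the first backup, only the one changed block is uploaded, yet every manifest still restores to a complete image, so no periodic full backup is needed.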

Low RTO with Instant Mount from AWS S3

Some cloud backup solutions rehydrate the backup data from AWS S3 to an AWS EBS volume after each backup, which drives up 24×7 costs because AWS EBS SSD flash is 8x more expensive. However, there is a better way to achieve low RTO in the cloud. For backups stored in cloud object storage such as AWS S3 / S3 IAS, you get low RTO with the following critical capabilities:

  • The first is the ability to mount a point-in-time backup image in AWS S3 as a virtual block device on an AWS EC2 instance in just minutes, irrespective of the data’s size, thus lowering the RTO.
  • The second is the ability to store the backups in native VM or DB format so that when the data is mounted from AWS S3 to AWS EC2, it is instantly accessible and no further conversion is needed.
  • The third is deep integration with all enterprise databases so that the recovery process is fully automated, without any scripts or manual intervention.
  • The fourth is automatic conversion of the system volumes of on-premises VMs or physical servers, or even Azure or GCP VMs, into AWS EC2 instances.

With this approach, when an application reads a file from the virtual block device, the system determines which AWS S3 object holds the requested data and the offset to read from within it. Writes can be directed to an AWS EBS volume, which ensures that the backup image in AWS S3 is never altered and remains immutable.
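The read and write paths described above can be sketched as follows. This is a simplified illustration, assuming the image is stored as fixed-size objects; the class `MountedImage`, the function `locate`, and the tiny `OBJECT_SIZE` are hypothetical, a dictionary stands in for the S3 objects, and a second dictionary stands in for the EBS volume that absorbs writes.

```python
OBJECT_SIZE = 8  # bytes per backing object; tiny, for illustration only


def locate(offset: int, object_size: int = OBJECT_SIZE) -> tuple[int, int]:
    """Map a byte offset on the virtual device to (object index, offset in object)."""
    return divmod(offset, object_size)


class MountedImage:
    """Read-only backup image with a separate overlay that absorbs writes."""

    def __init__(self, objects: dict[int, bytes], object_size: int = OBJECT_SIZE):
        self._objects = objects              # stand-in for S3 objects, never modified
        self._overlay: dict[int, int] = {}   # byte offset -> value, stand-in for EBS
        self._object_size = object_size

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray()
        for pos in range(offset, offset + length):
            if pos in self._overlay:
                # The most recent write wins over the backup image.
                out.append(self._overlay[pos])
            else:
                index, inner = locate(pos, self._object_size)
                out.append(self._objects[index][inner])
        return bytes(out)

    def write(self, offset: int, data: bytes) -> None:
        # Writes land in the overlay only; the backup image stays untouched.
        for i, value in enumerate(data):
            self._overlay[offset + i] = value


backup = {0: b"ABCDEFGH", 1: b"IJKLMNOP"}
image = MountedImage(backup)
print(image.read(6, 4))   # spans two objects
image.write(7, b"xy")
print(image.read(6, 4))   # now reflects the overlay
print(backup[0])          # the original object is unchanged
```

A byte-granular overlay keeps the sketch short; a real implementation would track writes per block.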

Low TCO with High Performance using AWS S3 and AWS EBS SSD

The architecture described above uses inexpensive AWS S3 for both backup and DR, thus lowering the TCO.

But what about the situation where, after recovery with a mount from AWS S3, a database performs a lot of small 8 KB IO transactions? If each such IO operation has to fetch the data from AWS S3, the latency would be high, and the performance of both the database and the application would suffer.

To avoid this problem, the data can be cached in EBS SSD flash storage. When a block is read and is not in the SSD cache, the system reads the object from AWS S3 and caches it on the SSD for future and adjacent reads. Most applications read the same or adjacent blocks repeatedly, which raises the cache hit ratio. Here is a more detailed explanation of how to mount from object storage with high performance.
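A read-through cache of this kind can be sketched in a few lines of Python. The sketch makes several assumptions that the article does not state: eviction is least-recently-used (LRU), each object holds four blocks, and the names `ReadThroughCache`, `read_block`, and `fetch_object` are hypothetical. The `fetch` callable stands in for a read from S3 and the ordered dictionary for the SSD.

```python
from collections import OrderedDict

BLOCKS_PER_OBJECT = 4  # assumed layout: each object holds four blocks


class ReadThroughCache:
    """LRU cache of whole objects; LRU is an assumed eviction policy."""

    def __init__(self, fetch, capacity: int):
        self._fetch = fetch            # stand-in for a slow object storage read
        self._capacity = capacity      # how many objects the cache can hold
        self._cache = OrderedDict()    # stand-in for the SSD
        self.hits = 0
        self.misses = 0

    def read(self, object_index: int):
        if object_index in self._cache:
            self._cache.move_to_end(object_index)  # mark as most recently used
            self.hits += 1
            return self._cache[object_index]
        self.misses += 1
        value = self._fetch(object_index)
        self._cache[object_index] = value
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)        # evict least recently used
        return value


def read_block(cache: ReadThroughCache, block_no: int):
    """Read one block; caching the whole object also serves adjacent blocks."""
    obj = cache.read(block_no // BLOCKS_PER_OBJECT)
    return obj[block_no % BLOCKS_PER_OBJECT]


def fetch_object(index: int) -> list[str]:
    return [f"obj{index}-blk{i}" for i in range(BLOCKS_PER_OBJECT)]


cache = ReadThroughCache(fetch_object, capacity=1)
for block_no in (0, 1, 2, 0, 5):
    read_block(cache, block_no)
print(cache.hits, cache.misses)
```

In this run only blocks 0 and 5 go to the backing store. The reads of blocks 1 and 2 and the repeat read of block 0 are served from the cache because they live in an object that was already fetched.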


With such an approach, an AWS EBS SSD cache sized at only 20% of the data could deliver 80% of the performance, even though 100% of the data resides in AWS S3.
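As a rough check on the economics, the arithmetic can be worked through with the article's own figures: EBS SSD at 8x the per-GB price of S3 IAS, and an SSD cache sized at 20% of the data. The function `relative_cost` is a hypothetical helper, and the model is deliberately simplified. It ignores request, data transfer, and compute charges and counts storage capacity only.

```python
def relative_cost(cache_fraction: float, ssd_multiple: float = 8.0) -> float:
    """Storage cost in units where keeping all data in object storage costs 1.0.

    cache_fraction: SSD cache size as a fraction of the data (0.0 to 1.0).
    ssd_multiple:   SSD price per GB relative to object storage (8x in the article).
    """
    if not 0.0 <= cache_fraction <= 1.0:
        raise ValueError("cache_fraction must be between 0 and 1")
    # All data stays in object storage; the cache is an extra on top.
    return 1.0 + cache_fraction * ssd_multiple


hybrid = relative_cost(0.20)
all_ssd = 8.0  # all data on SSD, backup copy in object storage not counted
savings = 1 - hybrid / all_ssd
print(f"hybrid: {hybrid:.1f} units, all-SSD: {all_ssd:.1f} units, savings: {savings:.1%}")
```

Under these assumptions the hybrid layout costs about 2.6 units against 8 for an all-SSD copy, roughly two-thirds less.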

AWS Elastic File System (EFS) lifecycle management uses a similar concept: it combines a Standard storage class with an Infrequent Access storage class to reduce costs.

Actifio Copy Data Management for the Cloud

Actifio provides all of the critical capabilities above to deliver low RPO, RTO, and TCO with excellent performance, using cloud object storage for low cost and on-demand SSD caching for high performance. It also offers 1-click DR orchestration in AWS, Azure, GCP, and IBM clouds.

Another article that explains how to accelerate slow object storage to near SSD speed can be found here.

ESG conducted a benchmark validation of this architecture for SAP HANA here and SQL Server here.

Click here for a deep dive conversation with a subject matter expert at Actifio or see a demo.
