AWS Snapshots: The Good, Bad, and Ugly

AWS EBS Snapshots

The Good

Amazon EBS Snapshots is a native AWS capability for taking point-in-time snapshots of EBS volumes.

A typical on-premises storage snapshot is stored on the same storage array as the source data. If that array goes down, you risk losing the source data AND the snapshots. EBS snapshots avoid this data loss scenario by storing snapshots of EBS volumes in Amazon S3, a separate storage tier.

The first EBS snapshot performs a full copy from EBS to S3. All later snapshots are efficient incremental-forever snapshots that copy only the blocks changed since the previous snapshot from EBS to S3.
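To make the incremental-forever idea concrete, here is a minimal sketch in Python. It is a toy in-memory model, not how EBS is implemented: the function names `changed_blocks` and `restore` are hypothetical, and the volume is assumed to keep a fixed number of blocks.

```python
from typing import Dict, List, Optional


def changed_blocks(volume: List[bytes],
                   last_state: Optional[List[bytes]]) -> Dict[int, bytes]:
    """Return the blocks a snapshot has to copy.

    The first snapshot (last_state is None) copies every block; later
    snapshots copy only blocks that differ from the previous state.
    Assumes the volume keeps the same number of blocks (toy model).
    """
    if last_state is None:
        return dict(enumerate(volume))
    return {i: b for i, b in enumerate(volume) if b != last_state[i]}


def restore(snapshots: List[Dict[int, bytes]]) -> List[bytes]:
    """Rebuild a point-in-time image by layering snapshots, oldest first."""
    merged: Dict[int, bytes] = {}
    for snap in snapshots:
        merged.update(snap)
    return [merged[i] for i in sorted(merged)]


# Toy volume with four blocks.
volume = [b"a", b"b", b"c", b"d"]
snap1 = changed_blocks(volume, None)        # full copy: all four blocks
state_at_snap1 = list(volume)

volume[2] = b"X"                            # one block changes
snap2 = changed_blocks(volume, state_at_snap1)  # only block 2 is copied

restored = restore([snap1, snap2])          # equals the current volume
```

The second snapshot holds a single block, yet layering it on top of the first reproduces the full volume image, which is why the incremental copies stay small.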

You can mount a point-in-time EBS snapshot on an AWS EC2 instance “instantly,” but note that the data is copied from S3 to EBS in the background. How fast a set of blocks can be accessed depends on whether those blocks have already been copied from S3 to EBS. Because data movement is involved, a large database on such a mounted EBS snapshot cannot come online until all of its data has been copied from S3 to EBS.

For DR purposes, you can also configure EBS snapshots to be stored in a remote region.

Storing incremental-forever EBS snapshots in cheaper storage such as S3 helps reduce cloud storage costs. However, this approach has limitations, especially for enterprise databases such as Oracle, MS SQL, MySQL, PostgreSQL, and SAP HANA running on AWS EC2 instances.

The Bad

All DBAs want application-consistent backups. Unfortunately, EBS snapshots don’t offer application consistency for all databases on all OS platforms.

Thus, DBAs prefer to write application-consistent backup dumps to EBS storage and then configure EBS snapshots of those dumps to S3 (see Figure 1).

Figure 1

Even on the limited platforms where EBS snapshots are application-consistent, DBAs prefer backup dumps because, unlike EBS snapshots, dumps give finer control over whether to recover the entire database instance or an individual database. However, such an architecture introduces several challenges.

First, it’s easy to run into situations where an EBS snapshot is triggered before the DB dump to disk has finished. This creates a false sense of security: the EBS snapshot reports success, but it captured an incomplete dump and you won’t be able to recover from it.
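One way to reduce this risk is to gate the snapshot on an explicit "dump finished" signal. The sketch below assumes a hypothetical convention in which the dump job writes a marker file named `DUMP_COMPLETE` as its last step; the marker name, `dump_is_complete`, and `snapshot_if_ready` are illustrative and not part of any AWS tool. The actual snapshot call is injected as a callable, which in a real setup might wrap something like boto3's EC2 `create_snapshot`.

```python
from pathlib import Path
from typing import Callable, Optional

# Hypothetical convention: the dump job creates this file only after the
# dump has been fully written and flushed to the EBS volume.
MARKER_NAME = "DUMP_COMPLETE"


def dump_is_complete(dump_dir: Path) -> bool:
    """True only if the dump job has left its completion marker."""
    return (dump_dir / MARKER_NAME).is_file()


def snapshot_if_ready(dump_dir: Path,
                      create_snapshot: Callable[[], str]) -> Optional[str]:
    """Call create_snapshot() only when the dump has finished.

    Returns the snapshot identifier from the callable, or None if the
    snapshot was skipped because the dump is still in progress.
    """
    if not dump_is_complete(dump_dir):
        return None
    return create_snapshot()
```

A scheduler could call `snapshot_if_ready` on each run and alert when it keeps returning `None`, rather than reporting success for a snapshot of a half-written dump.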

The second challenge is the cloud infrastructure costs. Let me highlight this with an example.

Assume you need to protect 10 TB of source database data running on AWS EC2. EBS gp2 SSD storage costs $0.10/GB/mo and EBS snapshot storage costs $0.05/GB/mo. To minimize storage costs, as shown in Figure 2 below, assume just 1-week retention in EBS storage and 30-day retention of EBS snapshots in S3 storage.

Figure 2

This architecture introduces the following cloud cost components:

  1. AWS EBS Storage: to store database backup dumps
  2. AWS Snapshot: to store EBS snapshots in S3 in the local AWS region
  3. AWS Snapshot: to store EBS snapshots in S3 in the remote AWS region
  4. Data Transfer Charges: to replicate snapshots from the local to the remote AWS region

For 10 TB of source data, Figure 3 shows the various costs. Total monthly recurring costs = $5,111 per month. Imagine the costs for a large 100+ TB environment in AWS!
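The four cost components listed above can be added up with a small calculator. This is a sketch under stated assumptions: the two storage prices come from the example in this article, while the per-component footprints and the data transfer price used in the demo are illustrative placeholders, not the values behind the $5,111 figure. The names `Footprint` and `monthly_cost` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict

EBS_GP2_PER_GB_MONTH = 0.10   # price quoted in the article
SNAPSHOT_PER_GB_MONTH = 0.05  # price quoted in the article


@dataclass
class Footprint:
    """Storage held (GB) and data moved (GB per month) per component."""
    ebs_dump_gb: float
    local_snapshot_gb: float
    remote_snapshot_gb: float
    transferred_gb_per_month: float


def monthly_cost(fp: Footprint, transfer_per_gb: float) -> Dict[str, float]:
    """Monthly cost per component plus the total.

    transfer_per_gb is an input because the article does not state the
    inter-region transfer price; check current AWS pricing.
    """
    costs = {
        "ebs_dumps": fp.ebs_dump_gb * EBS_GP2_PER_GB_MONTH,
        "local_snapshots": fp.local_snapshot_gb * SNAPSHOT_PER_GB_MONTH,
        "remote_snapshots": fp.remote_snapshot_gb * SNAPSHOT_PER_GB_MONTH,
        "data_transfer": fp.transferred_gb_per_month * transfer_per_gb,
    }
    costs["total"] = sum(costs.values())
    return costs


# Placeholder footprints and transfer price, for illustration only.
example = Footprint(ebs_dump_gb=10_000, local_snapshot_gb=20_000,
                    remote_snapshot_gb=20_000, transferred_gb_per_month=5_000)
breakdown = monthly_cost(example, transfer_per_gb=0.02)
```

Plugging in your own retention and change-rate numbers shows quickly which component dominates the bill and how the total scales with the size of the environment.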

Figure 3

The Ugly

The biggest challenge with EBS snapshots is that the recovery time (RTO) is very long, because recovery involves copying the data twice in a multi-step process (see Figure 4):

Figure 4

  1. After mounting an EBS snapshot, depending on how large the backup is, it could take many hours or even days to copy the data from S3 to EBS.
  2. The data now in EBS storage is still in backup dump format, so there is additional wait time while these dumps are restored into the recovery database.
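The two steps above add up, which a back-of-the-envelope estimate makes visible. The sketch below assumes the steps run strictly one after the other and at constant throughput; the function name `estimate_rto_hours` and the throughput numbers in the demo are hypothetical and should be replaced with rates measured in your own environment.

```python
def estimate_rto_hours(backup_gb: float,
                       s3_to_ebs_gb_per_hour: float,
                       db_restore_gb_per_hour: float) -> float:
    """Rough recovery time for the two-copy restore path.

    Step 1: copy the backup dump from S3 to EBS.
    Step 2: restore the dump from EBS into the recovery database.
    Assumes the steps are sequential and throughput is constant.
    """
    hydrate_hours = backup_gb / s3_to_ebs_gb_per_hour
    restore_hours = backup_gb / db_restore_gb_per_hour
    return hydrate_hours + restore_hours


# Illustrative rates only: a 10 TB (10,240 GB) dump, 512 GB/hour from
# S3 to EBS, and 1,024 GB/hour when restoring into the database.
rto = estimate_rto_hours(10_240, 512, 1_024)  # 20 h + 10 h = 30.0 hours
```

Because both terms grow linearly with backup size, a database ten times larger takes roughly ten times longer to recover on this path.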

 

Lastly, imagine an environment with dozens or hundreds of database instances and hundreds of snapshots to manage. With no “application-centric” view and only manual processes, this becomes unmanageable over time.

An Ideal Solution?

Wouldn’t you agree that an ideal solution would do the following?

  • Deliver application-consistent, incremental-forever backups
  • Deliver instant recovery in minutes even for a 50+ TB database
  • Reduce cloud infrastructure costs by more than 70%
  • Reuse backups to provision thin-database clones instantly for Dev/QA/UAT testing
  • Deliver all of this with complete automation.

Download the white paper to learn how a copy data virtualization solution in AWS delivers low RTO, low RPO, and 74% cloud infrastructure cost savings.