Cloud providers such as AWS, Azure, Google, and IBM offer infrastructure and platforms that enterprises can use to increase speed and reduce costs. However, after talking with many enterprises, I have concluded that when it comes to protecting cloud-resident database workloads, they have carried their traditional practices into the cloud as-is, and this has increased their monthly cloud infrastructure costs instead of reducing them.
In the next few minutes, I will show a detailed cost breakdown of the cloud infrastructure components involved. But first, let me start with the common practices of database protection.
In most enterprises, database administrators (DBAs) dump database backups to disk. They configure a backup product to sweep those dumps to tape or a dedup appliance.
DBAs like this process because it gives them control. They can configure when to back up and how often to back up and, more importantly, they control their own destiny with self-service recoveries instead of relying on a backup admin to recover data from tapes or dedup appliances.
Inheriting Traditional Practices in AWS
When these enterprises move their production workloads to the cloud, say AWS, the DBAs continue with the same process. The default design (Figure 1) in the public cloud is to back up databases to AWS EBS storage and then take EBS snapshots of those backups, which are stored in AWS S3.
The other approach would be to take EBS snapshots of the EBS disks that hold the production database. But DBAs don't like this solution because it produces a crash-consistent backup, not an application-consistent one. Moreover, there is no automatic log truncation after a backup, which is a nightmare they don't want to deal with.
Challenges with Database Dumps + EBS Snapshots
Using database dumps in combination with EBS snapshots creates three significant problems:
- Long Recovery Time (RTO): Recovering databases from database dumps leads to long restore times, as illustrated in Figure 2. The RTO grows in direct proportion to the amount of data restored from backups in EBS, and it is higher still when recovering from EBS snapshots stored in S3.
- Long Data Provisioning Time for DevOps: Dev, QA, UAT, Security, and Analytics teams need copies of production application data. Creating physical copies in these various test environments consumes a lot of EBS storage and time.
- Very High Cloud Infrastructure Costs: AWS EBS storage costs, EBS snapshot costs in the local and remote AWS regions, and data transfer costs add up to very high cloud infrastructure costs. Analyzing these costs is the focus of the remainder of this article.
The High Costs
Assume you need data protection for one or more databases running in AWS EC2 with a total size of 10 TB, and a daily change rate of 5%.
EBS General Purpose SSD storage is expensive: at $0.10 per GB-month, it costs twice as much as EBS snapshot storage at $0.05 per GB-month.
To minimize the EBS storage costs, assume just 1-week retention in EBS storage and 30-day retention of EBS snapshots in S3 storage, as shown in Figure 3.
I have seen many enterprises prefer full database backup dumps to disk over a combination of full and incremental backups, because recovering from a full backup plus incrementals takes longer than recovering from a single point-in-time full backup image. But storing multiple full backups in EBS storage significantly raises the overall costs of EBS storage and EBS snapshot storage.
So let's assume that, to minimize cloud storage costs, they compromise and decide that for their 7-day retention they will store one full and six incremental backups in EBS storage. After each backup, they will trigger an EBS snapshot of the EBS storage containing the backup dumps.
Note that an EBS snapshot has a limitation: it is stored in S3 in the same AWS region as the EBS storage it was taken from. For DR purposes, enterprises want a copy in a remote AWS region. So, as shown in Figure 3 above, they would also have to configure a copy of the EBS snapshot to a remote AWS region.
Thus, this architecture based on database dumps plus AWS EBS snapshots introduces the following cloud cost components:
- AWS EBS Storage: to hold database backup dumps
- AWS Snapshot (local region): to store a snapshot in S3
- AWS Snapshot (remote region): to store a snapshot in S3 in the remote AWS region
- Data Transfer Charges: to replicate snapshots from the local to the remote AWS region
For our example of a 10 TB environment with the retention requirements above, the following chart shows the individual costs, which total $5,111 per month.
The following is a detailed analysis of each cost.
EBS Storage Costs:
Storage for 1 Full backup = 1 x 10 TB = 10 TB
Storage for 1 Full backup with 1.6:1 compression = 10 TB/1.6 = 6.25 TB
Storage for 6 Incremental backups = 6 x 5% change x 10 TB = 3 TB
Storage for 6 Incremental backups with 1.6:1 compression = 3 TB/1.6 = 1.875 TB
Total EBS Storage = 8.125 TB
Assuming General Purpose SSD storage at $0.10 per GB-month, total EBS costs = 8.125 TB x 1024 x $0.10 = $832 per month
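The EBS storage arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the article's own calculation: the function names are illustrative, and the inputs (10 TB, 5% daily change, six incrementals, 1.6:1 compression, $0.10 per GB-month) are the assumptions stated in this example, not current AWS list prices.

```python
GB_PER_TB = 1024  # the article converts TB to GB with a factor of 1024


def ebs_backup_storage_tb(db_size_tb, daily_change_rate, incrementals, compression_ratio):
    """Compressed size (TB) of one full backup plus N daily incremental backups."""
    full_tb = db_size_tb
    incremental_tb = incrementals * daily_change_rate * db_size_tb
    return (full_tb + incremental_tb) / compression_ratio


def monthly_cost_usd(size_tb, price_per_gb_month):
    """Monthly cost of keeping size_tb stored at the given per-GB-month price."""
    return size_tb * GB_PER_TB * price_per_gb_month


# Assumed inputs from the example: 10 TB, 5% daily change, 6 incrementals, 1.6:1 compression.
storage_tb = ebs_backup_storage_tb(10, 0.05, 6, 1.6)
ebs_cost = monthly_cost_usd(storage_tb, 0.10)
print(f"{storage_tb:.3f} TB -> ${ebs_cost:,.0f} per month")
```

Changing the compression ratio or the number of incrementals shows how sensitive the EBS bill is to the backup policy.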
EBS Snapshot Costs:
EBS snapshots are incremental: each snapshot copies only the blocks that have changed on the EBS disk since the previous snapshot.
Changes in EBS Storage per week (calculated above) = 8.125 TB
Thus, proportional changes in 30 days = 8.125 TB x (30/7) = 34.82 TB
Thus, EBS Snapshot Storage in 30 days = 34.82 TB
Assuming EBS snapshot costs of $0.05 per GB-month, EBS Snapshot costs = 34.82 TB x 1024 x $0.05 = $1,783 per month
EBS Snapshot Costs (Remote Region):
Copying the EBS snapshots from the local to the remote region incurs the same EBS Snapshot storage costs in the remote region as well.
Thus, EBS Snapshot costs at remote region = $1,783 per month
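The snapshot figures can be checked the same way. The sketch below assumes, as the article does, that the 8.125 TB written per week scales proportionally to 30 days and that the remote region is billed at the same $0.05 per GB-month; the function names are illustrative.

```python
GB_PER_TB = 1024  # same TB-to-GB factor the article uses


def snapshot_storage_tb(weekly_change_tb, retention_days):
    """Snapshot storage (TB) accumulated over the retention window."""
    return weekly_change_tb * retention_days / 7


def snapshot_cost_usd(storage_tb, price_per_gb_month, regions=1):
    """Monthly snapshot cost, optionally counted for several regions."""
    return storage_tb * GB_PER_TB * price_per_gb_month * regions


snap_tb = snapshot_storage_tb(8.125, 30)
local_cost = snapshot_cost_usd(snap_tb, 0.05)
both_regions_cost = snapshot_cost_usd(snap_tb, 0.05, regions=2)

print(f"Snapshot storage: {snap_tb:.2f} TB")
print(f"Local region:     ${local_cost:,.0f} per month")
print(f"Local + remote:   ${both_regions_cost:,.0f} per month")
```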
Cross Region Replication Costs:
Replicating snapshots from one region to another incurs data transfer charges.
Total data changed in 30 days = total data replicated in 30 days = 34.82 TB
Assuming data transfer charges of $0.02 per GB, Data Transfer costs = 34.82 TB x 1024 x $0.02 = $713 per month
Thus, the total cloud infrastructure cost to protect just 10 TB of databases = $832 + $1,783 + $1,783 + $713 = $5,111 per month. Imagine the costs for a large 100+ TB environment in AWS. Not only are these costs very high, but the architecture also delivers a degraded RTO of multiple hours.
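To see how the four components combine, and what a larger environment might cost, the whole model fits in one small function. Treat this as a rough sketch under the article's stated assumptions (5% daily change, one full plus six incrementals, 1.6:1 compression, 30-day snapshot retention, and the quoted prices). The linear scaling to 100 TB is an additional assumption of this sketch, not a figure from the analysis above.

```python
GB_PER_TB = 1024  # same TB-to-GB factor the article uses


def monthly_costs(db_size_tb, change_rate=0.05, incrementals=6, compression=1.6,
                  snapshot_days=30, ebs_price=0.10, snapshot_price=0.05,
                  transfer_price=0.02):
    """Return the monthly cost (USD) of each component plus the total."""
    # One full backup plus the incrementals, after compression.
    ebs_tb = (db_size_tb + incrementals * change_rate * db_size_tb) / compression
    # Weekly EBS changes scaled proportionally to the snapshot retention window.
    snap_tb = ebs_tb * snapshot_days / 7
    costs = {
        "ebs_storage": ebs_tb * GB_PER_TB * ebs_price,
        "snapshot_local": snap_tb * GB_PER_TB * snapshot_price,
        "snapshot_remote": snap_tb * GB_PER_TB * snapshot_price,
        "data_transfer": snap_tb * GB_PER_TB * transfer_price,
    }
    costs["total"] = sum(costs.values())
    return costs


costs_10tb = monthly_costs(10)
for name, usd in costs_10tb.items():
    print(f"{name:16s} ${usd:>8,.0f}")

# Hypothetical 100 TB environment, assuming every input scales linearly.
costs_100tb = monthly_costs(100)
print(f"100 TB total: ${costs_100tb['total']:,.0f} per month")
```

Because every term is proportional to the database size, the model predicts roughly ten times the bill for ten times the data; real environments may differ if change rates or compression ratios vary by database.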
The Solution – Copy Data Management (CDM)
Is there a solution that is friendly to both DBAs and backup admins, reduces RTO to minutes and RPO to 1 hour, gives DBAs the control to back up and recover from any point in time, stores backups for anywhere from days to decades, and, most importantly, reduces cloud infrastructure costs by more than 70%?
Fortunately, a Copy Data Management (CDM) solution in the cloud can deliver all these capabilities, and your cloud costs would look like the chart below.
Just $1,313 per month instead of $5,111, a whopping 74% savings for a 10 TB environment. Imagine the savings with hundreds of TB.
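As a quick check, the percentage saving can be recomputed from the two monthly totals quoted above. The helper name is illustrative.

```python
def savings_percent(baseline_usd, reduced_usd):
    """Percentage saved when moving from the baseline cost to the reduced cost."""
    return (baseline_usd - reduced_usd) / baseline_usd * 100


saving = savings_percent(5111, 1313)
print(f"Monthly saving: ${5111 - 1313:,} ({saving:.1f}%)")
# prints: Monthly saving: $3,798 (74.3%)
```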
Download the white paper to learn how the Copy Data Management solution delivers low RTO, low RPO, and 74% cloud infrastructure cost savings.