AWS Best Practices – Managing Cloud Resident Data in AWS


Six months ago, Tom Dickson’s CIO asked him to migrate their enterprise applications to AWS. After a lot of hard work and much trial and error, Tom’s team completed the migration. Tom was burned out and needed a break, so he took off to Hawaii for a week’s vacation.

He was enjoying the sun, the palm trees, the umbrella drinks… until the phone rang on day four. The CIO was on the line with bad news.

The DevOps team had called the CIO to complain that they were not getting copies of their Oracle database for testing in a timely manner.

To make matters worse, somebody mistakenly dropped a table in an MS SQL database, which brought down a critical application. The DR team was relying on EBS snapshots stored in S3 for recovery, but restoration times were far longer than expected for a 9 TB MS SQL database.

The security team was also complaining to the CIO because both production and DR data copies were stored in the same AWS account. Using a single account created a significant security risk, so they demanded that DR data be moved to an account separate from production.

As if all of that were not enough, Jeff, an enterprise architect, sent an email indicating that data management costs were much higher than expected. He provided this summary:

  • Costs attributed to Test Data Management for DevOps
    • EBS storage to hold copies of production data for DevOps: five copies of their 20 TB MS SQL and Oracle databases
  • Costs attributed to Disaster Recovery
    • EBS snapshot costs
    • S3 storage to store snapshots
    • Replication to a remote AWS region
  • Costs attributed to backups and long term data retention
    • Beefy EC2 instances to run media servers; deduplication required very large CPU and memory allocations.
    • EBS storage to store backups for 4 weeks
    • S3 storage to store backups for monthly and yearly images
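
The DevOps line item alone is substantial. Here is a back-of-envelope sketch of what five 20 TB copies cost on EBS versus a cheaper object tier such as S3 Infrequent Access; the per-GB prices are illustrative assumptions, not current AWS list prices:

```python
# Back-of-envelope comparison: DevOps copies on EBS vs S3 IA.
# Per-GB-month prices below are illustrative assumptions, not AWS quotes.
EBS_PER_GB_MONTH = 0.10      # assumed General Purpose SSD price
S3_IA_PER_GB_MONTH = 0.0125  # assumed S3 Infrequent Access price

TB = 1024                    # GB per TB
copies = 5
db_size_tb = 20

total_gb = copies * db_size_tb * TB          # 102,400 GB
ebs_cost = total_gb * EBS_PER_GB_MONTH
s3_ia_cost = total_gb * S3_IA_PER_GB_MONTH

print(f"EBS:   ${ebs_cost:,.0f}/month")      # EBS:   $10,240/month
print(f"S3 IA: ${s3_ia_cost:,.0f}/month")    # S3 IA: $1,280/month
print(f"Ratio: {ebs_cost / s3_ia_cost:.0f}x")  # Ratio: 8x
```

Even with assumed prices, the roughly 8x gap shows why keeping every test copy on EBS gets expensive quickly.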

Even with all these expenses, the team was still stuck with a 24-hour RTO, and the DevOps team couldn’t access copies of data fast enough. It was a mess, and Tom had a serious problem. He had rushed the migration to AWS without devoting sufficient time to thinking through data management in the cloud.

On his flight back from Hawaii, Tom drafted a list of data management requirements in AWS and scheduled a meeting with his team to review it.

In the meeting, Tom and his team came up with the following requirements.

  1. The solution must enable self-service database provisioning for DevOps. It must also integrate with existing production environments such as Oracle, MS SQL, Oracle EBS, and SAP HANA. The solution should reduce storage costs in AWS: EBS storage is 5x to 10x more expensive than S3 IA (Infrequent Access) storage, so different storage tiers should be used for different kinds of testing.
    1. For performance testing, use EBS.
    2. For functionality testing, use S3 IA.
  2. The solution must support the ability to store production and backups in two different AWS accounts. This ensures that even if someone manages to hack into the production account, they cannot get immediate access to the backups.
    1. Unfortunately, EBS snapshots do not provide an effective multi-account strategy: EBS snapshots store their data in S3, and moving that data between accounts requires a full copy, which incurs significant storage and bandwidth costs.
    2. Thus they needed a better option.
  3. The solution must allow the team to back up their production applications in EC2, replicate them to a remote AWS region, and perform a complete recovery at the remote location, including the OS, application stack, and data.
  4. The solution must allow the compliance team to instantly access data from older backups. The solution should leverage object storage to enable long-term retention while minimizing storage cost.
  5. The solution must deliver on all these capabilities and be simple to manage in a central way.
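
To see why native snapshots fall short on the multi-account requirement, here is a rough AWS CLI sketch of moving an EBS snapshot into a separate backup account and region; the account numbers, snapshot IDs, and regions are placeholders. Note that the final `copy-snapshot` step is a full data copy, billed as brand-new storage:

```shell
# Sketch only: account IDs, snapshot IDs, and regions are placeholders.

# 1. In the production account: share the snapshot with the backup account.
aws ec2 modify-snapshot-attribute \
    --snapshot-id snap-0123456789abcdef0 \
    --attribute createVolumePermission \
    --operation-type add \
    --user-ids 111122223333

# 2. In the backup account: copy the shared snapshot into its own region.
#    This performs a full copy of the data -- the backup account pays for
#    the storage again, plus cross-region transfer.
aws ec2 copy-snapshot \
    --source-region us-east-1 \
    --source-snapshot-id snap-0123456789abcdef0 \
    --region us-west-2 \
    --description "Cross-account DR copy"

# Caveat: encrypted snapshots additionally require sharing the KMS key
# and re-encrypting the copy with a key the backup account can use.
```

Every protected volume repeats this full-copy cycle, which is where the storage and bandwidth costs the team complained about come from.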

And so began the search for a solution that could satisfy all of the above best practices and requirements…

Are you planning to migrate your enterprise applications to any public cloud? Do you already have enterprise applications hosted in public cloud?

If yes, you also need to consider a solution that can satisfy all of the above requirements and best practices.

During an upcoming webinar, Jay Livens will review how Actifio’s data virtualization platform can help you address these best practices and requirements in the public cloud.

Register Now