
Database cloning is the process of creating a point-in-time copy of a production database. It is usually done for one of the following purposes:
- Testing an application during development cycles against the current database structure and content
- Feeding data extraction and data warehouse tools
- Recovering data that was mistakenly deleted
Organizations use a variety of tools to automate database cloning for DevOps, SecOps, or UAT. Some of them are Jenkins, Puppet, ServiceNow, and Chef.
Database cloning in AWS is time-consuming. To create clones of your databases, you have to take EBS snapshots of the EBS volumes on which the databases run. The EBS snapshots are stored in S3.
When you restore a snapshot to an EBS volume, the entire data set is rehydrated from S3 to EBS. When you restore the snapshot for 5 different EC2 instances, you end up with 5 full copies on EBS volumes.
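The snapshot-and-restore workflow described above can be sketched roughly as follows. This is a minimal illustration, not a production script: it assumes `ec2` behaves like a `boto3.client("ec2")` object, and the function name, the `targets` parameter, and the default device name are hypothetical choices made for this example.

```python
def clone_volume_to_instances(ec2, volume_id, targets, device="/dev/sdf"):
    """Snapshot one EBS volume and restore a full copy for each target instance.

    ec2      -- assumed to behave like boto3.client("ec2")
    targets  -- list of (instance_id, availability_zone) pairs (hypothetical shape)
    device   -- device name used when attaching (assumption for this sketch)
    """
    # Take a point-in-time snapshot of the volume holding the database.
    snapshot = ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"clone source for {volume_id}",
    )
    snapshot_id = snapshot["SnapshotId"]
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])

    new_volume_ids = []
    for instance_id, zone in targets:
        # Each restore creates a separate, full-size EBS volume.
        # The volume must live in the same Availability Zone as its instance.
        volume = ec2.create_volume(SnapshotId=snapshot_id, AvailabilityZone=zone)
        new_volume_id = volume["VolumeId"]
        ec2.get_waiter("volume_available").wait(VolumeIds=[new_volume_id])
        ec2.attach_volume(Device=device, InstanceId=instance_id, VolumeId=new_volume_id)
        new_volume_ids.append(new_volume_id)

    return snapshot_id, new_volume_ids
```

With a real client, a call might look like `clone_volume_to_instances(boto3.client("ec2"), "vol-...", [("i-...", "us-east-1a")])`. Every entry in `targets` produces one more physical volume, and that is where the storage cost comes from.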
So if you have a 10 TB database and you create 8 copies for DevOps, SecOps, and UAT, these copies occupy a total of 80 TB in EBS. That’s a lot of storage, and it grows with every additional copy.
Creating physical clones also takes more time.
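The storage arithmetic for physical clones is simple multiplication, and a few lines of code make it easy to see how it scales. The function name below is made up for this illustration; the 10 TB and 8 copies are the figures from the example above.

```python
def physical_clone_storage_tb(database_tb, copies):
    """Total EBS storage used when every clone is a full physical copy."""
    return database_tb * copies


# The example from the text: a 10 TB database cloned 8 times.
print(physical_clone_storage_tb(10, 8))  # 80

# How the footprint grows as more teams ask for their own copy.
for copies in (1, 4, 8, 16):
    print(copies, "copies:", physical_clone_storage_tb(10, copies), "TB")
```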
This means that developers and testers have to rely on backup admins to give them EBS snapshots for their test/dev work. Without a self-service option for the test/dev teams, the whole DB cloning process takes even longer.
Testers also want to refresh their test copies with changes from production. Some organizations refresh daily, some weekly. Every time a refresh is needed, teams have to go through this entire process again, which makes it not only time-consuming but also labor-intensive.
So… what is an ideal solution that would help you avoid all of the above challenges? Here are some critical capabilities:
- “Reuse” the database backups to provision dozens of “virtual” database clones that consume no extra storage.
- Your testers can provision these clones instantly and in a self-service manner.
- Your admin can specify data masking scripts so that sensitive data is masked automatically and the virtual DB clones always contain masked data.
- You can provision these virtual clones from the database backups stored in EBS or S3 storage.
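To make the data masking capability more concrete, here is a minimal sketch of what a masking step could look like. It is purely illustrative and does not describe how any particular product does masking: the function names, the `user_` prefix, the salt value, and the row format (a list of dictionaries) are all assumptions made for this example.

```python
import hashlib


def mask_value(value, salt="example-salt"):
    """Replace a sensitive value with a deterministic pseudonym.

    The salt and the "user_" prefix are placeholders for this sketch.
    """
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return f"user_{digest[:12]}"


def mask_rows(rows, sensitive_columns):
    """Return copies of the rows with the sensitive columns masked."""
    masked = []
    for row in rows:
        masked.append({
            column: mask_value(str(value)) if column in sensitive_columns else value
            for column, value in row.items()
        })
    return masked


rows = [{"id": 1, "email": "alice@example.com"}]
print(mask_rows(rows, {"email"}))
```

Because the masking is deterministic, the same input always maps to the same pseudonym, so values that matched each other before masking still match afterwards.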
To learn more about these capabilities and how they work, download this white paper.