As a battle-hardened DBA there’s one thing that has been consistent in my career: Everyone wants their data and they want it now. Back in the early days of my career, creating a copy of a database was no small task (if you could even pull it off). Storage was very expensive and we would reduce the size of the data sets to squeeze what we could into what we had. The process to refresh was cumbersome and involved many different people across many different specializations. It usually took days, even a week or more in some cases.
Today, for test data, we can restore our traditional backups as full copies to our lower environments, or we can use array snapshots, VM snapshots or cloud snapshots. However, these methods can still require a number of steps and do not always give us exactly what we need. That comes down to their nature: they are not really designed for test data management, they are designed for disaster recovery. It’s like putting a square peg in a round hole; if you do enough sanding you can make it work, but it sure is an inefficient way to get the job done!
Enter Actifio, the pioneers of copy data management. Actifio is like a DVR for your data. Once you discover what you want to capture, you set up an SLA (Service Level Agreement) and a Profile that describe how often you want to capture your source data and where you want to store it. You can use block storage arrays of your choice and you can also use object storage. Object storage is inexpensive and expansive. All of the major cloud providers have object storage targets (AWS S3, Azure Blob, Google Nearline/Coldline, etc.) and you can also deploy object storage on-premises (Dell EMC ECS, IBM Cleversafe, etc.). Object storage is provisioned in Actifio and referred to as OnVault. You can send copies of things like databases directly to OnVault and you can define multiple OnVault targets. This is where things get interesting.
Let’s assume you are capturing a 30 TB production SQL Server database. Your Actifio deployment might look like this:
- Primary Datacenter Actifio CDX Physical Appliance – uses Fibre Channel to talk to SQL Server Host
- DR Datacenter Actifio CDX Physical Appliance – talks to Primary Datacenter over dedicated 10 Gig line
- Actifio Sky Virtual Appliance deployed in Azure East
- Actifio Sky Virtual Appliance deployed in Azure West
- Actifio Global Manager (AGM) Virtual Machine deployed in Primary Datacenter’s vSphere
Your SLA, as seen through the Actifio Global Manager (AGM) GUI, might look like this:
- Snapshot the SQL database 1 time per day & keep 3 days’ worth – Also written to Dedup Pool and Dedup Pool in DR
- Every time a Snapshot is taken, mirror it to the DR Datacenter – Same retention as primary snapshot
- Replicate (via StreamSnap) the SQL database to the Sky Appliance in Azure East – Same retention as primary snapshot
- Copy Snapshot to OnVault storage in Azure East – Once a week & keep for 8 weeks
- Copy Snapshot to OnVault storage in Azure West – Once a week & keep for 8 weeks
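To make the retention arithmetic in this SLA concrete, here is a minimal sketch that models each rule as a small record and counts how many points in time each location would hold once the schedule has been running for a while. The `Policy` class, its field names and the target labels are my own illustrative assumptions, not Actifio's schema or API, and the count is a simple retention-divided-by-interval estimate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """One capture rule in the SLA (hypothetical model, not Actifio's schema)."""
    action: str       # e.g. "snapshot", "mirror", "streamsnap", "onvault"
    target: str       # illustrative label for where the copy lands
    every_days: int   # capture interval in days
    retain_days: int  # how long each copy is kept, in days

    def retained_copies(self) -> int:
        """Estimate of copies held at steady state: retention / interval."""
        return self.retain_days // self.every_days


# The five rules from the example SLA above, expressed as data.
SLA = [
    Policy("snapshot", "primary-dc", every_days=1, retain_days=3),
    Policy("mirror", "dr-dc", every_days=1, retain_days=3),
    Policy("streamsnap", "azure-east-sky", every_days=1, retain_days=3),
    Policy("onvault", "azure-east-onvault", every_days=7, retain_days=8 * 7),
    Policy("onvault", "azure-west-onvault", every_days=7, retain_days=8 * 7),
]


def points_in_time_by_target(policies):
    """Map each target to the number of points in time it would hold."""
    return {p.target: p.retained_copies() for p in policies}


if __name__ == "__main__":
    for target, copies in points_in_time_by_target(SLA).items():
        print(f"{target}: {copies} points in time")
```

Under these assumptions each of the three daily locations holds 3 points in time and each OnVault target holds 8, which is the kind of grid the access screen described next is meant to show.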
AGM’s access screen lets you visualize your SLA, along with the points in time and the locations from which you can access your data. In the example above, your application’s access screen in AGM would show a grid of circles arranged in columns.
Each circle is a point in time where your data is available for mounting, restoring or cloning. The columns represent places where the data is available. The shaded area for the most recent snapshot on the left tells you that logs are available for rolling forward or backward to a specific point in time. Each of these points is available for an instant mount.
In a very short period of time you, or your customer, can click a point in time and instant mount the data to the target of your choice. The data is presented from Actifio managed storage and therefore the blocks do not have to be copied. These mounts are in native format and can be read from and written to just like any other database in your environment. To your DBA it will look exactly as if the database were physically there. It’s like having a shelf of tape backups where you just pick a point in time, click, and minutes later you can use the data.
If you have developers in disparate locations you can use multiple OnVault targets or Sky appliances closer to their physical region and have their mounts presented from the cloud data center (or your data center) closest to them. Not only do they have their own copy of a FULL database that they can read and write as they need to, but it will be faster and more efficient because of its proximity to the recipient and the advantages gained from our caching engine. No additional storage is needed for these mounts other than for any changes a developer makes to them. Copies can be refreshed on a schedule or on demand and sensitive data marked or masked as needed. With the ability to stand up both physical and virtual appliances, the use of mirroring and the direct-to-OnVault capability that Actifio provides, your test data is ALWAYS available where it is needed and WHEN it is needed.
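The idea of serving each developer from the nearest copy can be sketched as a simple lookup. Everything named here is a made-up illustration: the location labels, the region names and the `pick_mount_source` helper are assumptions for the sake of the example, not something Actifio exposes under these names.

```python
# Hypothetical mapping of copy locations to the region each one serves.
SOURCES = {
    "azure-east-onvault": "us-east",
    "azure-west-onvault": "us-west",
    "primary-dc": "on-prem",
}


def pick_mount_source(developer_region, sources=SOURCES, fallback="primary-dc"):
    """Return the copy location in the developer's region, else the fallback."""
    for name, region in sources.items():
        if region == developer_region:
            return name
    # No copy in that region: present the mount from the primary data center.
    return fallback


if __name__ == "__main__":
    print(pick_mount_source("us-west"))
    print(pick_mount_source("eu-central"))
```

A real deployment would more likely choose on measured latency or policy rather than a static table; the static table just keeps the sketch short.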
Actifio is built from the ground up with a fully published API. This means that integration with DevOps tools such as Jenkins/Ansible/Puppet/Chef/etc. is easy to implement. Imagine your developers using this tool to quickly run their tests and deploy an update to your customer-facing application. The tests are more meaningful because they are being executed against a FULL copy of your databases. Of course, we have a full role-based access control (RBAC) engine built in or you can use your own solution for security.
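As a rough illustration of what a pipeline step might do before running tests, the sketch below builds (but does not send) an HTTP request asking for a fresh mount. The base URL, the `/mounts` path, the payload fields and the bearer-token header are all hypothetical placeholders of my own; consult the published API documentation for the real endpoints and authentication scheme.

```python
import json
import urllib.request


def build_mount_request(base_url, token, app_name, target_host):
    """Build a POST request for a mount (hypothetical endpoint and payload)."""
    payload = {
        "application": app_name,     # assumed field name
        "target_host": target_host,  # assumed field name
        "point_in_time": "latest",   # assumed field name and value
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/mounts",  # hypothetical path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_mount_request(
        "https://agm.example.com/api", "TOKEN", "sales-db", "test-sql-01"
    )
    # A CI job (Jenkins, Ansible, etc.) would send this with
    # urllib.request.urlopen(req) and then run its tests against the mount.
    print(req.get_method(), req.full_url)
```

Keeping the request construction separate from sending it makes the step easy to unit test inside the pipeline without touching a live appliance.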