This is part 3 in our multi-part series on Microsoft SQL server, backups, recoveries, copy data management. In parts number one and two we covered SQL backups and SQL restores, in detail and talked about a lot of considerations, different approaches, pros and cons, and performance impacts. If you haven’t watched those I highly recommend you do as we’re building on that based on that knowledge for this particular part of the series.
This part of the series is all about using the SQL copy data, using those copies you made with your backup application or snapshots (whichever method you chose) for purposes beyond restores. When it comes to beyond restore there is really three different things that people tend to do:
- use copies for a disaster recovery
- use copies for test data management
- use copies for some other production like purpose such as reporting or analytics.
Now when it comes to using your copy data for any of these, performance really matters and performance is probably the number one consideration that’s out there. And, if you think about it right now, whatever method you’re using, you’ve got some backups of your data. If you wanted to, you can simply do traditional restores where your backup software is going to restore your data all the way back to your SQL server in order to enable any of these particular use cases. You can do restores for disaster recovery, restores to enable test data management, restores to enable reporting and analytics off of a copy of the data.
You already know the impact of that approach – that really leaves mounts. And when it comes to mounts there’s a couple of different considerations:
- Is your copy data on your production storage array, and if it is– are you doing mounts from that production storage array. And when you do, you have a couple of really strong benefits.
- you have production like performance, right? It’s your production storage, production class of disk presenting to your SQL server; you’ve got that production performance.
- But at the same you’ve got a much higher level of risk because if you think about it most people don’t run their test databases on their production storage array side by side with their production data. And the reason for that is very straightforward. If something goes wrong with your test environment you don’t ever want it to impact your production environment. And by goes wrong, sometimes that is a performance impact, sometimes that is a capacity issue, sometimes it’s just a matter of change control, you don’t want to be doing snapshot type activities maybe re-provisioning on your production storage array during production hours but yet for tests and development you really need to sometimes do those activities during production hours. So, great performance somewhat higher risk can be mitigated depending on the storage array and the users, but that’s always there.
- And then finally you have the fact that using your storage array tends to be a very storage centric approach. If you’re using your storage array for snapshots as we’ve talked about in the past that also probably means that your storage layout is corresponding to your disk layout; it probably means that you have a storage administrator involved in all of those types of activities not necessarily a very usable or self-service type approach.
- On the other hand, if you have your data in your backup infrastructure and your backup infrastructure supports doing mounts, instead of actually copying the data to your target server it’s simply presenting the data to your target server. Now you have a completely different set of considerations:
- From a performance standpoint, the duration is very, very fast just like it is with the snapshots. Your performance of the actual iOS that is now very variable; some of the considerations we’ve talked about in the past. What is the tier of disk that have your data stored on? Is it duplicated or not? What is the backup application? Is it storing it in its own native format, and then emulating out to here therefore doing a translation layer as those iOPS take place.
- Is it is doing that what are your performance requirements? If you’re doing functional testing in your tests and dev environment well that’s probably not got a very high level of performance required. On the other hand, if you’re talking about a QA environment or staging environment those often times have to mirror production performance. So that now becomes very relevant. You also want to consider concurrency of the environment. Can your backup infrastructure support dozens or hundreds of concurrent mounts of backup data into a variety of different environments for you to use. All considerations to be looked at, questions to ask when you’re considering what you want to do here.
The other side of this is usability. Everything I’ve said so far for performance by the way equally applies to disaster recovery test and development analytics and reporting. From a usability standpoint, now you’re looking at oftentimes multiple products, right? For disaster recovery, you’re looking at a DR orchestration tool. For reporting and analytics maybe it’s something as simple as a scheduling tool that’s going to schedule doing hopefully not a traditional restore, but maybe scheduling, doing your mount operations from wherever you’re doing them from.
For test and development, things get a lot more interesting. Oftentimes you want self-service so that your users can create their own databases. Maybe it’s actually from one source feeding into all three of these and then refreshing on different schedules; this one needs to be refreshed to a different point in time when this user particularly wants it to.
Having the ability to do self-service really can be a game changer from an efficiency standpoint. Having your test and dev users say I need something and be able to get it without waiting for a backup administrator and a storage administrator and a server administrator that can be very impactful to your development lifecycle. You also want to make sure from a SQL test and development standpoint that those scheduled refreshes can take place that you can say, “I’ve got a database here, I don’t always want to provision a new one.” Sometimes I just want to refresh it.
Sometimes, I want to back it up; sometimes I need to have backups of my test environments. And not only backups of my test environments, but also I may need to branch off of them to create multiple test environments and each one of these might need their own backups. And with each of these backups you might say, “Hey, as a developer I’m about to run a test that’s going to be fairly destructive to the data in my copy. So I’m going to take a snapshot of it first.” I’m going to take a backup insert a bookmark whatever terminology you like; to use I’m going to run my test; it might corrupt my data or it might just make a bunch of changes that I want to rollback, but I don’t always want to refresh it back from the source or back from the original source.
So usability in self-service isn’t just can I get the database. Can I handle all of the things with it that I want to do with it? Can I create these grandchildren and branch and then pick a point in time and roll it back to that point in time all myself? Right, so those are the usability types of things that come up.
And then finally you’ve got multi-tenancy. In order to be self-service and efficient especially and primarily in the test development space you also need security. Can you handle things like data masking? Can you handle making sure that this development group can’t see the databases that are being used by this development group which can’t see the databases from this development group.
That’s a summary of your considerations when it comes to using your SQL copy data. Are you using it from your backup infrastructure? Are you using it from your primary storage?
Thanks for reading!
Testing on SQL data sets? Actifio can help improve the process. Access the solution brief to learn how