While all data is valuable, for most users database data is critical. Why? Databases power the most critical enterprise applications, so extended outages can materially impact a business's top line and reputation. It is no surprise, then, that databases are one of the first places IT invests when it comes to data backup, and fortunately there are many solutions to help. Historically, database protection has revolved around creating full database backup copies, but is that the right strategy?
In the past, users were forced to choose between incremental backups (and/or transaction log backups) and full backups. The former strategy has the smallest application impact, the lowest bandwidth utilization and the smallest storage footprint. However, an incremental strategy also lengthens recovery times, since a full restore requires access to the previous full backup and all subsequent incremental backups. Database access cannot be restored until every one of those recovery jobs has completed, which can take hours or even days. Such lengthy recovery times are unacceptable for many businesses, so they choose to depend only on full backups, which significantly shorten recovery times.
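To make the restore dependency concrete, here is a minimal Python sketch of how a restore chain grows under an incremental strategy. The `Backup` class, the `restore_chain` function and the weekly schedule are all hypothetical names and values chosen for illustration; they do not describe any particular product.

```python
from dataclasses import dataclass

@dataclass
class Backup:
    day: int    # day on which the backup was taken
    kind: str   # "full" or "incremental"

def restore_chain(backups, target_day):
    """Return the backups that must be restored, in order, to reach target_day."""
    eligible = sorted((b for b in backups if b.day <= target_day),
                      key=lambda b: b.day)
    # Locate the most recent full backup at or before the target day.
    # (Raises ValueError if no full backup exists, since nothing can be restored.)
    last_full = max(i for i, b in enumerate(eligible) if b.kind == "full")
    # The restore needs that full backup plus every incremental taken after it.
    return eligible[last_full:]

# Hypothetical schedule: one full on day 0, incrementals on days 1 through 6.
schedule = [Backup(0, "full")] + [Backup(d, "incremental") for d in range(1, 7)]
chain = restore_chain(schedule, target_day=6)
print(len(chain), "restore jobs:", [f"{b.kind}@{b.day}" for b in chain])

# With a full backup every day, the chain is always a single job.
daily_fulls = [Backup(d, "full") for d in range(7)]
print(len(restore_chain(daily_fulls, target_day=6)), "restore job")
```

The chain length, and with it the recovery time, grows with every incremental taken since the last full backup, which is the trade-off described above.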
A full backup strategy gains restore performance at the expense of application impact, bandwidth and storage efficiency. As the name suggests, a full backup requires a complete copy of the database, so if your database is 10TB in size, each full backup will move a full 10TB of data to your backup target. The benefit of this approach is that recovery requires restoring only one backup job, whereas the incremental strategy depends on many. The downside is that even if only 50GB changes in a 10TB database, you end up backing up 10TB, i.e. 200x more storage IO and processing, which impacts application performance. Given the criticality of production databases, it should not be surprising that most companies rely on full database backups because they prioritize recovery over everything else.
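The 200x figure can be checked with a few lines of arithmetic. The sketch below uses the 10TB and 50GB numbers from the text and assumes 1TB = 1000GB. The daily schedule and the constant change rate are assumptions added purely for illustration.

```python
DB_SIZE_GB = 10_000     # 10TB database, taking 1TB = 1000GB
DAILY_CHANGE_GB = 50    # data that actually changed (assumed constant per day)

def data_moved_gb(days, strategy):
    """Total data sent to the backup target over a number of daily backups."""
    if strategy == "full":
        return DB_SIZE_GB * days
    if strategy == "incremental":
        # One initial full backup, then only the changed data on later days.
        return DB_SIZE_GB + DAILY_CHANGE_GB * (days - 1)
    raise ValueError(f"unknown strategy: {strategy}")

# Per-backup overhead of a full backup relative to the data that changed.
ratio = DB_SIZE_GB / DAILY_CHANGE_GB
print(f"full backup moves {ratio:.0f}x the changed data")

week_full = data_moved_gb(7, "full")
week_incr = data_moved_gb(7, "incremental")
print(f"one week of fulls: {week_full} GB, one week of incrementals: {week_incr} GB")
```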
The Very Large Database Problem
When thinking about recovery, the two most important metrics are RTO, or Recovery Time Objective (i.e. how long a recovery will take), and RPO, or Recovery Point Objective (i.e. how frequently you create recovery points). Naturally, companies want to minimize RTO and improve RPO. The full and incremental strategies provide conflicting benefits: full backups minimize RTO thanks to their simpler recovery process, while incrementals can improve RPO because their reduced application impact and smaller size allow them to run more often. What should a customer do if they want to optimize both?
An ideal solution would increase RPO flexibility by allowing for very efficient incremental forever backups and, at the same time, enhance RTO by minimizing recovery times, i.e. by eliminating the need to restore from tens or hundreds of incrementals. Surprisingly, there are few solutions that offer this combination. Many products claim “incremental forever,” but most only offer those efficiencies for file-based backups and so do not address the database challenge.
So why does file-based incremental forever backup not work for databases? Because a database typically consists of a small number of very large files. When database rows are updated or deleted, or new rows are created, those existing large files change. A file-based incremental forever solution will therefore end up backing up those large files in their entirety again, which defeats the purpose. The ideal solution would instead figure out which blocks changed within those large database files and protect just those changed blocks. This is called block-level incremental backup.
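As a rough illustration of the idea, the Python sketch below finds the changed blocks in a file by comparing per-block hashes. Everything here is an assumption made for the example: the function names, the tiny 4-byte block size and the use of SHA-256 comparison. Production tools often rely on change tracking provided by the database or storage layer rather than rescanning the whole file, and that is not modeled here.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; unrealistically small so the demo is easy to read

def block_hashes(data: bytes, block_size: int = BLOCK_SIZE):
    """Hash each fixed-size block of the data."""
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]

def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE):
    """Return {block_index: block_bytes} for blocks that differ or were appended.

    Simplification: a file that shrinks is not handled.
    """
    old_hashes = block_hashes(old, block_size)
    changes = {}
    for idx, digest in enumerate(block_hashes(new, block_size)):
        if idx >= len(old_hashes) or old_hashes[idx] != digest:
            start = idx * block_size
            changes[idx] = new[start:start + block_size]
    return changes

old_file = b"AAAABBBBCCCCDDDD"        # four blocks
new_file = b"AAAABxBBCCCCDDDDEEEE"    # block 1 modified, block 4 appended
delta = changed_blocks(old_file, new_file)
print(sorted(delta))  # indices of the only blocks that need backing up
```

A file-based incremental would have copied all 20 bytes of `new_file` because the file changed; the block-level approach copies only the two blocks in `delta`.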
Block-level incremental backups are very efficient, but what about recoveries? How would you recover a multi-TB database from a backup that contains all the changed blocks from many incremental backups? A solution that needs to recover a multi-TB database in minutes cannot afford to “copy” data from backup storage media to the recovery server. It needs the technology to present a disk from the backup server to a recovery server and recover the database instantly. Even with this technology, post-recovery database performance will differ significantly depending on whether the data was stored in the native database format or heavily deduplicated.
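One way to picture recovery from block-level incrementals is as a merged block map: for each block, the newest copy wins. The Python sketch below is a conceptual illustration under that assumption, with hypothetical names and data. It is not a description of how any vendor implements instant recovery. A real system would presumably keep references to blocks on backup storage and present the result as a disk, rather than copying bytes in memory as done here.

```python
def synthesize(base_blocks, incrementals):
    """Build a point-in-time block map from a base image plus incrementals.

    base_blocks:  {block_index: bytes} for the initial full image
    incrementals: list of {block_index: bytes}, ordered oldest first
    """
    view = dict(base_blocks)
    for delta in incrementals:
        view.update(delta)  # blocks from newer backups replace older ones
    return view

def read_image(view):
    """Assemble the blocks in index order, as a recovery server would read them."""
    return b"".join(view[i] for i in sorted(view))

base = {0: b"AAAA", 1: b"BBBB", 2: b"CCCC"}   # initial full image
monday = {1: b"BxBB"}                         # block 1 changed
tuesday = {2: b"CyCC", 3: b"DDDD"}            # block 2 changed, block 3 added

image = read_image(synthesize(base, [monday, tuesday]))
print(image)

# Any earlier recovery point is available by merging fewer incrementals.
monday_image = read_image(synthesize(base, [monday]))
print(monday_image)
```

Because the merge is a lookup per block rather than a sequence of restore jobs, the number of incrementals need not translate into a longer chain of copy operations at recovery time.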
When thinking about data protection for databases, the question of full versus incremental backups is a critical one, especially since databases are only getting larger and more critical. When evaluating solutions, you should consider the following:
- What are the vendor’s best-practice recommendations for database backups?
- What RPOs can be expected from the offering?
- Do their customers perform full backups? Incrementals? Block-level incrementals? At what frequency for each?
- If client-side deduplication is being used, what is its impact on the production databases?
- If deduplication is being offloaded to a backup server or a storage layer, what is the RTO impact when restoring multi-TB databases?
- How does the vendor enable database recovery?
- What is the fastest RTO that I can expect?
- Do I have to fully copy the database back to alternate storage before accessing the information?
- If so, how long will it take to recover a 10TB database (or insert your own large number here)? What if it were 50+ TB?
- Do they offer any ability to instantly recover databases?
- Can I use array-based snapshots to assist in recovery?
- How long will my recovery take if my backup is stored on a long-term storage medium like cloud object storage?
- If I decide to use AWS, Azure or Google, will I get the same SLAs for RTO, RPO and TCO in the cloud as on-premises?
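For the questions about copy-based recovery time, a back-of-the-envelope estimate is simply database size divided by sustained restore throughput. The Python sketch below does that calculation. The throughput figures are made-up placeholders rather than measurements of any product or medium, and the model ignores log replay, network contention and deduplication rehydration, all of which would lengthen a real restore.

```python
def copy_restore_hours(db_size_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to copy a database back at a sustained throughput."""
    size_mb = db_size_tb * 1_000_000   # taking 1TB = 1,000,000MB
    return size_mb / throughput_mb_s / 3600

# Hypothetical sustained restore rates in MB/s; substitute your own numbers.
for size_tb in (10, 50):
    for rate in (200, 1000):
        hours = copy_restore_hours(size_tb, rate)
        print(f"{size_tb}TB at {rate}MB/s: {hours:.1f} hours")
```

Even at the faster assumed rate, a copy-based restore of a multi-TB database is measured in hours, which is why the questions about instant recovery matter.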
As you ponder these questions, I encourage you to check out Actifio and our backup and recovery solutions.
Data Management Using Flash Array Snapshots