News Article
Disaster Recovery at the touch of a button: desire or reality?
November 15, 2016
Numerous companies schedule regular backups with a retention of 30 to 60 days. They use deduplication to reduce disk space and bandwidth requirements. They have a backup success rate of 98 percent and are able to perform instant recoveries, i.e. immediate restores. For most internal backup experts, the disaster recovery runbook lives in a spreadsheet, which holds information such as the order in which VMs must be restored, network configuration, firewall rules, DNS redirection requests, and so on.
Typical pattern shows weaknesses
In disaster recovery tests, which are performed more or less frequently, companies can often restore all VMs as desired. In most cases, however, considerable coordination is required between the DR, backup, storage, VMware, and network teams. It often takes more than 12 hours for the test to start. During this far too long period, extensive cross-functional project management is needed to get the job done. As a result, the actual recovery time objective (RTO) is often missed.
According to observations by Actifio, a clear pattern emerges in such tests, and it is visible in many companies:
- Instant recovery is not really scalable. There is the illusion that a VM can be switched back on quickly. But in reality, this must be done in batches and using Storage vMotion.
- A lot of time is spent on manual sequential steps. The spreadsheet determines which VMs must be restored in which order. For example, the database VM must be restored first, then a set of application server VMs, and then a set of web server VMs.
- Other manual steps, such as assigning IP addresses, VLAN port groups, and DNS servers to each VM, not only take time but are also error-prone. Typically, intensive troubleshooting follows when some VMs are placed in the wrong network.
- Performing the DR tests requires a considerable number of man-hours from many different teams. All this is stressful and error-prone, an experience that everyone wants to forget and hopes never to repeat.
To remedy these shortcomings, the following would be required:
- Stress-free, automated, reliable disaster recovery at the touch of a button or a “1-click” disaster recovery solution.
- A solution that goes a step further and independently performs scheduled, unattended, fully automatic restore tests once a month or once a quarter.
- A solution that provides a compliance report based on the repeated monthly or quarterly DR tests. This strengthens confidence that the production system can actually be restored within a guaranteed timeframe, such as four hours.
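As a rough illustration of what such a compliance report could aggregate, the following Python sketch checks a series of DR test runs against an RTO. The names `DRTestResult` and `compliance_summary` are hypothetical and chosen for this example only, and the sample values are invented; this is not the report format of any particular product.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List


@dataclass
class DRTestResult:
    """Outcome of one scheduled DR test (hypothetical structure)."""
    run_date: date
    duration: timedelta       # time from start of the test to full recovery
    all_vms_restored: bool    # True only if every VM in the plan came up


def compliance_summary(results: List[DRTestResult], rto: timedelta) -> Dict[str, object]:
    """Summarize how many test runs completed successfully within the RTO."""
    within_rto = [r for r in results if r.all_vms_restored and r.duration <= rto]
    return {
        "tests_run": len(results),
        "tests_within_rto": len(within_rto),
        # Compliant only if at least one test ran and every test met the RTO.
        "compliant": bool(results) and len(within_rto) == len(results),
        "longest_duration": max((r.duration for r in results), default=None),
    }


# Invented sample data: three monthly tests measured against a four-hour RTO.
sample_results = [
    DRTestResult(date(2016, 8, 1), timedelta(hours=3, minutes=10), True),
    DRTestResult(date(2016, 9, 1), timedelta(hours=2, minutes=45), True),
    DRTestResult(date(2016, 10, 1), timedelta(hours=4, minutes=30), True),
]
summary = compliance_summary(sample_results, rto=timedelta(hours=4))
print(summary)
```

In this sample the third run exceeds four hours, so the summary marks the series as non-compliant even though all VMs were restored.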
What disaster recovery capabilities are required
To provide these results, you must select a solution that has the following functions:
- The ability to create DR plans on a Web interface rather than in a spreadsheet.
- Support for multiple logical application groups in each DR plan. A logical application group is simply a logical collection of VMs. Each application group can be restored independently, and preferably simultaneously, to keep the RTO low.
- In each logical application group, the order in which VMs are to be restored must be specified. For each VM, the vCPU, vMemory, and network information such as IP addresses, DNS servers, and VLAN port groups must be specified.
- It should also provide the flexibility to run pre- and post-scripts before and after each VM or application group. This makes it possible to set external firewall rules, stop or start services within the VMs, or apply any other customization. These scripts can also be used to perform data integrity checks in automated scheduled DR tests.
- Once these DR plans are defined and stored, the administrator logs on to the Web interface at the time of the test, or in an actual emergency, and selects one or more application groups to restore. The administrator then specifies whether it is a real DR case or a DR test and presses a button. From this point on, the entire disaster recovery should be automated. The orchestration handles the recovery of VMs, the allocation of resources, the assignment of networks, and the calling of pre- and post-scripts.
- Reports document the restored VMs, the user who initiated restoration, the time and the cause (if any).
- Role-based access control allows administrators to create DR plans and allows application owners to perform DR tests in a self-service process.
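The requirements above can be sketched as a small data model plus an orchestration loop. The Python below is a minimal sketch under stated assumptions: `VM`, `ApplicationGroup`, `DRPlan`, `restore_group`, and `run_plan` are hypothetical names invented for this illustration, and the `restore_vm` callable stands in for whatever actually restores a VM. It does not represent the API of Actifio or any other product.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set

Hook = Callable[[str], None]  # a pre- or post-script; receives the VM or group name


@dataclass
class VM:
    name: str
    vcpus: int
    memory_gb: int
    ip_address: str
    dns_servers: List[str]
    port_group: str
    pre_script: Optional[Hook] = None
    post_script: Optional[Hook] = None


@dataclass
class ApplicationGroup:
    name: str
    vms: List[VM]  # list order is the restore order
    pre_script: Optional[Hook] = None
    post_script: Optional[Hook] = None


@dataclass
class DRPlan:
    name: str
    groups: List[ApplicationGroup] = field(default_factory=list)


def restore_group(group: ApplicationGroup,
                  restore_vm: Callable[[VM, bool], None],
                  is_test: bool) -> List[str]:
    """Restore the VMs of one group in order, calling scripts around each step."""
    restored = []
    if group.pre_script:
        group.pre_script(group.name)
    for vm in group.vms:
        if vm.pre_script:
            vm.pre_script(vm.name)
        restore_vm(vm, is_test)
        if vm.post_script:
            vm.post_script(vm.name)
        restored.append(vm.name)
    if group.post_script:
        group.post_script(group.name)
    return restored


def run_plan(plan: DRPlan, group_names: Set[str],
             restore_vm: Callable[[VM, bool], None],
             is_test: bool = True) -> Dict[str, List[str]]:
    """Restore the selected groups in parallel; return restored VM names per group."""
    selected = [g for g in plan.groups if g.name in group_names]
    with ThreadPoolExecutor(max_workers=max(1, len(selected))) as pool:
        futures = {g.name: pool.submit(restore_group, g, restore_vm, is_test)
                   for g in selected}
        return {name: future.result() for name, future in futures.items()}


# Example with a stand-in restore function that only records what it was asked to do.
restore_log: List[str] = []


def fake_restore(vm: VM, is_test: bool) -> None:
    restore_log.append(f"{vm.name} -> {vm.ip_address} on {vm.port_group} (test={is_test})")


plan = DRPlan("site-failover", [
    ApplicationGroup("webshop", [
        VM("db01", 8, 64, "10.0.1.10", ["10.0.0.2"], "vlan-db"),
        VM("app01", 4, 16, "10.0.2.10", ["10.0.0.2"], "vlan-app"),
        VM("web01", 2, 8, "10.0.3.10", ["10.0.0.2"], "vlan-web"),
    ]),
])
report = run_plan(plan, {"webshop"}, fake_restore, is_test=True)
print(report)
```

Within a group the VMs are restored strictly in list order (database, then application server, then web server), while separate groups run concurrently, which mirrors the goal of keeping the RTO low.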
Platforms for copy data virtualization deliver the entire functionality described, simply and scalably, for environments with 100, 300, 500, 1,000 or more VMs. They make “1-click” orchestrated disaster recovery achievable.
Such a platform for virtualization of data copies …
- Provides backups with flexible retention for days, weeks, months, years, or decades
- Enables scalable instant recovery
- Provides “1-click” orchestration with all requirements
- Supports a flexible RPO from 1 hour to 24 hours
- Delivers the entire functionality on any storage, so that the solution is completely storage-independent.
The days when five to ten people were involved in disaster recovery or DR testing are over. Scheduled, automated, unattended DR tests can be performed every month. Carried out efficiently and automatically with the help of a platform for the virtualization of data copies, these tests lay the groundwork for a restart fast enough that the business is only minimally affected.