How to Back Up VMware VMs Directly to Object Storage, Without On-Premises Storage

I don’t have to be psychic to know what you’re thinking. After reading the title of this blog you’re probably wondering, “Why would you send your backups to object storage, which might be located in the cloud or at a remote site, without keeping a local copy for local recoveries?” That’s a great question, and I’ll answer it after a little background.

Of all the hypervisors, there is no question that VMware is the most mature and feature-rich. This is especially evident when discussing VM backups. Among the many features VMware provides to aid in backups, the ability to track and share the list of blocks that have changed between two points in time (known as CBT, or Changed Block Tracking) is probably the most impactful. This feature has enabled nearly every backup vendor to implement incremental-forever backups of VMware VMs.

While most VMware backups leverage the same set of underlying APIs from VMware, different vendors have very different ways of moving and storing the data, and even more variations on how to use the backup data for recovery. Here at Actifio, we’ve found that our design principle of storing data in its native format is a natural fit with the CBT feature provided by VMware. This combination enables far more than just incremental-forever backups: it also lets us store every point-in-time backup as a virtual full. Having every backup image look like a full means we can perform near-instant mount recoveries for dozens or hundreds of VMs simultaneously, without emulation layers that introduce overhead and high storage latency.

But wait, this blog is supposed to be about sending backups to object storage. If you’re familiar with Actifio, or if you’ve read my blog post on high performance mounts from object storage (How to Mount from Object Storage with High Performance), then you already know that we’ve extended our incremental-forever backup storage methodology to object storage. Data in object storage still looks just like the original source data, in its original native format. For VMware, this means we have disk images (that come from VMDK files), each of which is broken up into small blocks, with each block stored as an object in object storage.
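To make the "small blocks stored as objects" idea concrete, here is a minimal sketch. It is not Actifio's implementation: the key layout, the tiny 4-byte block size, and the use of a Python dict as a stand-in for an object store are all assumptions chosen for illustration. It shows why keeping data in native format helps with mounts: a read of any byte range only needs the objects that cover that range.

```python
from typing import Dict, List

BLOCK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


def store_disk_image(image: bytes, store: Dict[str, bytes], prefix: str) -> List[str]:
    """Split a raw disk image into fixed-size blocks and store one object per block.

    `store` is a dict standing in for an object store; `prefix` is a
    hypothetical key prefix. Returns the object keys in block order.
    """
    keys = []
    for index, offset in enumerate(range(0, len(image), BLOCK_SIZE)):
        key = f"{prefix}/block-{index:08d}"
        store[key] = image[offset:offset + BLOCK_SIZE]
        keys.append(key)
    return keys


def read_range(store: Dict[str, bytes], keys: List[str], offset: int, length: int) -> bytes:
    """Serve a read by fetching only the objects that cover the requested range."""
    if length <= 0:
        return b""
    first = offset // BLOCK_SIZE
    last = (offset + length - 1) // BLOCK_SIZE
    data = b"".join(store[k] for k in keys[first:last + 1])
    start = offset - first * BLOCK_SIZE
    return data[start:start + length]


if __name__ == "__main__":
    object_store: Dict[str, bytes] = {}
    block_keys = store_disk_image(b"ABCDEFGHIJKL", object_store, "vm1/disk0")
    print(block_keys)
    print(read_range(object_store, block_keys, 3, 5))
```

Because the blocks are plain slices of the disk image, no rehydration or format conversion is needed before a read can be served.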

When we take backups to local disk and then copy that data to object storage, we use our own CBT capabilities to extend the incremental-forever backups into object storage. This results in highly scalable backups with offsite storage for any retention duration you wish. Because we maintain the native format of the data, recoveries are also easy, with near-instant mount being available directly from object storage. This provides the best possible RTO, and is surprisingly performant (explained in much greater detail in the blog post referenced above). The recovery performance is so good, with such low RTO, that it makes us ask the question, “Why store data on local disk at all?” And that’s why we’ve introduced the ability to skip that step.

As a bonus, our approach to incremental forever means expiration of data from object storage is fast and easy, without the need for long-running garbage collection processes or scheduled cleanup jobs. When a backup is expired, whether it is the oldest, newest, or somewhere in the middle, the process is fast and the data required exclusively by that backup is reclaimed immediately.
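One way to picture why expiration needs no garbage collection pass is a simple set difference over the per-backup maps. The sketch below is an illustration under assumptions, not a description of Actifio's internals: the `catalog` structure (backup name to block-index-to-object-key map), the object keys, and the dict standing in for an object store are all hypothetical.

```python
from typing import Dict, Set


def expire_backup(
    backup_id: str,
    catalog: Dict[str, Dict[int, str]],
    store: Dict[str, bytes],
) -> Set[str]:
    """Expire one backup and delete the objects that only it referenced.

    `catalog` maps a backup name to its block map (block index -> object key).
    Returns the set of object keys that were reclaimed.
    """
    expired_map = catalog.pop(backup_id)
    # Every object still referenced by any remaining backup must be kept.
    still_needed = {key for block_map in catalog.values() for key in block_map.values()}
    reclaimable = set(expired_map.values()) - still_needed
    for key in reclaimable:
        del store[key]
    return reclaimable


if __name__ == "__main__":
    store = {k: b"data" for k in ("a0", "a1", "b1", "c0")}
    catalog = {
        "mon": {0: "a0", 1: "a1"},  # first backup: every block is new
        "tue": {0: "a0", 1: "b1"},  # block 1 changed
        "wed": {0: "c0", 1: "b1"},  # block 0 changed
    }
    print(expire_backup("tue", catalog, store))  # middle backup shares everything
    print(expire_backup("mon", catalog, store))  # oldest backup frees its own blocks
    print(sorted(store))
```

In this toy catalog, expiring the middle backup frees nothing because its objects are shared with its neighbors, while expiring the oldest one frees exactly the two objects nothing else references.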

So how can we get the data from VMware to the object store without landing on local disk first? That’s actually the easy part. VMware has APIs to let us retrieve any block of data, in any size, from a snapshot of the source VM. Actifio is already taking a disk image, breaking it into blocks, and storing the blocks in object storage. To skip the local storage, we can simply request the exact block of data that will correspond to an object we want to create, and retrieve it via API. There’s no need to store it on local disk anywhere; we retrieve it from VMware and send it directly to the object store. We do this for each changed block in the source VM, adjusting for the block size we want. Once all changed blocks have been written as objects, we store a metadata map of all the objects needed for that point in time, and this metadata map becomes our “backup” image. The map references all of the newly created objects and, for blocks in the VM that have not changed, the objects created by earlier backups.
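The flow described above can be sketched in a few lines. This is a simplified model under assumptions, not Actifio's code: `read_block` is a hypothetical callable standing in for a read from the VM snapshot through the VMware APIs, `changed_blocks` stands in for the list CBT reports, the key naming is invented, and a dict plays the role of the object store.

```python
from typing import Callable, Dict, Iterable, Optional

BLOCK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


def backup_direct(
    read_block: Callable[[int], bytes],
    changed_blocks: Iterable[int],
    previous_map: Optional[Dict[int, str]],
    store: Dict[str, bytes],
    backup_id: int,
) -> Dict[int, str]:
    """Send changed blocks straight to the object store and return the new map.

    Unchanged blocks keep pointing at objects written by earlier backups,
    so every map describes a complete (virtual full) image.
    """
    new_map = dict(previous_map or {})
    for index in changed_blocks:
        key = f"backup-{backup_id:04d}/block-{index:08d}"
        store[key] = read_block(index)  # source -> object store, no local staging
        new_map[index] = key
    return new_map


def restore(block_map: Dict[int, str], store: Dict[str, bytes]) -> bytes:
    """Rebuild the full disk image for one point in time from its map."""
    return b"".join(store[block_map[i]] for i in sorted(block_map))


if __name__ == "__main__":
    store: Dict[str, bytes] = {}
    monday = b"AAAABBBBCCCC"
    tuesday = b"AAAAXXXXCCCC"  # only block 1 changed

    def reader(disk: bytes) -> Callable[[int], bytes]:
        return lambda i: disk[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]

    map1 = backup_direct(reader(monday), [0, 1, 2], None, store, 1)
    map2 = backup_direct(reader(tuesday), [1], map1, store, 2)
    print(restore(map1, store), restore(map2, store), len(store))
```

The second backup writes a single new object, yet both maps restore a complete image, which is the "virtual full" property in miniature.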

Who should use this feature? Most customers who have object storage as part of their backup design will want to use it for at least some of their VMs. The latency between the Actifio appliance(s) and the object store will determine which VMs should have their backups sent directly to the object store, and which should also be stored on local disk for some retention period. When latency is high, VMs that have very high I/O requirements might be better served by recovery from local disk. But for most VMs, or for all of them if the latency to object storage is very low, this option provides a tremendous reduction in data movement, greater scalability per Actifio appliance, and significant cost savings.

And for those of you looking at Actifio primarily for DR, this is a great option to have Actifio coexist with a backup solution, sending data to the cloud for cloud-based DR while also minimizing onsite infrastructure and costs.

