The following is an excerpt from the Data Driven 2020 session, Managing Data by the Droplet.
Good afternoon, and thank you for joining us here at Actifio’s Data Driven. My name’s Joe Hassell, and I’m here with Jeremy Nees. I’m the Regional Director for Actifio in Asia Pacific, and Jeremy is the Chief Product and Technology Officer for one of New Zealand’s premier technology partners with Actifio, The Instillery. Jeremy, thank you for joining us.
Thanks for having us along, and a pleasure to be joining you all at Data Driven this year.
We’re here today to talk about managing data by the droplet. It’s been a real big change in the industry from the homogenous spread of VMware, virtual machines, and the ease of managing that data to the disparate data environments of multi-cloud, hybrid cloud, all of the various PaaS and multi-cloud services that are available today. And I really just want to get your thoughts on how you see enterprises changing, and what the market is doing to adapt and respond to that in terms of managing data at scale and across the landscape.
I think firstly, I’d start by saying I think we’ve been incredibly lucky with the amount of conformity we’ve ended up with in the data center over the last couple of decades. So, we ended up with a large majority of workloads virtualized, sitting on our own VMware platforms, and with actually pretty good supporting backup technology. VMware, VADP-first snapshots. Of course, it took a little bit of time to mature and we all understand some of the probably original pains of VMware Consolidated Backup, but it got pretty good. And we could go to a VM found… We could probably snapshot 90 to 95% of our workloads and be pretty happy that we hit this whole bucket of data that we were collecting that would meet our business requirements, meet our SLAs.
It had some nice extra features, like if it understood the file system, you could restore at a more granular level, by file. It was very efficient using changed block tracking, and we could dedupe really nicely in the back end. So all these things made for a pretty idealistic data protection environment. And then, of course, you had a few workloads that were special sorts. The ones that were either critical in terms of business data or had applications like really transactional databases on them that you wanted a connector or an agent to protect. And those might be your last 5% of your backups. So 95% of the job was done by plugging in your backup appliance, pointing it to VMware, putting in your credentials, setting a policy, and go, job done. So I think that’s changed, though.
When you’ve got all of your workloads in one primary data center, it makes it very easy to DR to an offsite, and to know that your backups are off-sited and air-gapped, and all that good stuff. How’s that changing, too?
Firstly, you had something to DR to. You could pick stuff up and you could put it in another VMware environment or something similar if you could do some smart stuff around converting the machine type on the way. When we started to talk about today’s multi-cloud environment, it’s actually not so much the fact that we’re multi-cloud and that we use AWS and Azure, it’s more the fact that we probably use a whole combination of technologies. We use probably some on-premise stuff still, for workloads that may be difficult to move for licensing purposes or because they’re not supported in public cloud. We’re using some public cloud. We’re using some SaaS, so we’ve likely got Office365 in the business. We’ve quite possibly got a SaaS-based CRM that’s got really valuable business data in it. And then we’re probably starting to use microservices as well. So platform services on clouds, containers, a whole raft of different services.
And now we’ve got to think about how are we going to go and pick up those droplets of data, and how are we going to protect them? Because the reality is our business requirements probably haven’t fundamentally changed: we need to protect our data from a number of events that could be catastrophic or impacting to the business. And we need to do that within a set of fairly well-defined criteria around how quickly we need to recover it and what we need to do with it. So those things have complicated life. You can’t pick up an Office365 backup and just restore that to another Office365, because there isn’t one. You don’t have some of those same options. This whole notion of protecting data by droplets just recognizes that we’ll have small portions of data spread in lots of places, and we really need a good platform approach to how we manage that, in particular manage the protection of that data going forward.
When you talk about a platform approach, I think about the idea of having a structure of business objectives of how we want to manage our data as the driving policy engine, if you will, for managing those data droplets. How do you see that interfacing with all of those different components? Because obviously the way in which you’re going to manage your Office365 data is quite different to the way you’re going to manage your databases or whatever it is.
From my perspective, I think that’s a fundamental architecture shift in what a data protection platform looks like. And it’s one that’s been built in the cloud era for the cloud era on the understanding that you will have data as droplets, as little segments on particular applications, particular clouds, particular on-premise workloads. And it recognizes that from the get-go. And what I really mean by that is the inverse is what we’ve been used to, actually. We built backup technology, not data protection and data virtualization platforms, and we’ve done that by going, “I need to protect a Windows server, and so I’m going to build something that understands Windows,” and that’s defined my platform. Then I’ve gone, “Well, VMware’s been invented, so I’m going to understand VMware, and I might want to understand some applications as well, and some pieces like that,” but it’s been very much around tightly coupled integrations into a relatively narrow set of technologies.
And what we’re talking about is a fundamental shift away from that. In fact, you’re looking for loosely coupled integrations into lots of technologies, and actually lots more technologies that we don’t even know are coming yet. So, some of that original functionality is still absolutely relevant. We still want to be able to protect database servers. We want to be able to protect VMs. But what’s probably more important now is that the platform I’ve got understands that in the future, a lot of its heavy lifting is going to be making an API call to a service that’s not tightly integrated, that doesn’t have access to the storage subsystem, or the operating system, or directly to the database layer. It’s going to make an API call. It’s going to ask something else to perform an action, so it’s more like an orchestration approach. Give me a copy of this data, back this up, or drop it to a bucket.
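Editor's note: the orchestration approach described above, where one policy drives many loosely coupled integrations that simply ask each service for a copy of its data, could be sketched roughly as follows. This is only an illustration, not Actifio's or The Instillery's implementation. Every name in it (Connector, FakeSaaSConnector, Policy, protect, the bucket name) is hypothetical, and the connector only records the request where a real one would call the service's API.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple


class Connector(Protocol):
    """Hypothetical interface: ask a service to copy its own data somewhere."""

    def request_copy(self, source: str, target_bucket: str) -> str: ...


@dataclass
class FakeSaaSConnector:
    """Stand-in for a service reachable only through its API (an assumption)."""

    service: str

    def request_copy(self, source: str, target_bucket: str) -> str:
        # A real connector would make an API call here; this one just
        # returns a description of the request it would have made.
        return f"{self.service}:{source} -> {target_bucket}"


@dataclass
class Policy:
    """Business-level rule shared by every droplet, whatever its platform."""

    name: str
    target_bucket: str


def protect(policy: Policy, droplets: List[Tuple[Connector, str]]) -> List[str]:
    """Apply one policy to many droplets by delegating to each connector."""
    return [
        connector.request_copy(source, policy.target_bucket)
        for connector, source in droplets
    ]


policy = Policy(name="daily", target_bucket="backup-bucket")
droplets = [
    (FakeSaaSConnector("crm"), "accounts"),
    (FakeSaaSConnector("mail"), "mailboxes"),
]
receipts = protect(policy, droplets)
```

The point of the sketch is that the platform never touches the storage subsystem or operating system behind each service. Adding a new kind of droplet means adding another connector, while the policy stays the same.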