When thinking about cloud infrastructure, I often worry about data security, complexity, and costs. Our on-premises and data center infrastructure has far more predictable usage costs, and the topology is much easier to manage. Still, the cloud does offer us certain advantages.
For example, cloud infrastructure enables offsite disaster recovery, giving us geographically distributed and protected backups. Does your company have a disaster recovery strategy and plan in place? Has it been tested? If you’re backing your servers up to the same location they run in and not using a cloud platform, a single site-level incident can take out both the servers and their backups, so you can’t use disaster recovery software effectively. (For more information, see our Guide to Disaster Recovery.)
Why Have Disaster Recovery?
I look at disaster recovery (DR) as a tool to achieve one key objective: business continuity. When I evaluate DR plans and solutions, I’m always thinking in the context of business continuity, always asking “how will this let us maintain business operations in the event of a disaster or outage?”
Perform a quick mental evaluation of your DR strategy right now. What would happen if your primary production workloads went down? How long would it take to restore functionality? Minutes? Hours? What if the physical location were compromised by a natural disaster like a flood or a fire? A potential nightmare scenario is some kind of hack or breach, in which case you can no longer trust any of your infrastructure and need to start over from scratch.
Organizations often don’t consider these types of scenarios—nor how to remediate them—until it’s far too late. In the context of business continuity, having an effective DR plan in place can mean the difference between survival and failure.
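To make the "minutes or hours?" question concrete, here is a minimal back-of-the-envelope sketch. The workload names, backup sizes, restore throughput, setup times, and the four-hour target are all assumptions for illustration; substitute figures measured in your own environment.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    backup_size_gb: float            # data that must be restored (assumed)
    restore_rate_gb_per_hour: float  # sustained restore throughput (assumed)
    setup_hours: float               # fixed time to provision and validate (assumed)


def estimated_restore_hours(w: Workload) -> float:
    """Fixed setup time plus transfer time at the assumed throughput."""
    return w.setup_hours + w.backup_size_gb / w.restore_rate_gb_per_hour


def misses_target(w: Workload, target_hours: float) -> bool:
    """True when the estimate exceeds the recovery target."""
    return estimated_restore_hours(w) > target_hours


# Hypothetical workloads and a hypothetical recovery target.
workloads = [
    Workload("billing-db", backup_size_gb=2000, restore_rate_gb_per_hour=400, setup_hours=1.0),
    Workload("web-frontend", backup_size_gb=50, restore_rate_gb_per_hour=400, setup_hours=0.5),
]
TARGET_HOURS = 4.0

for w in workloads:
    hours = estimated_restore_hours(w)
    status = "misses" if misses_target(w, TARGET_HOURS) else "meets"
    print(f"{w.name}: ~{hours:.1f} h, {status} the {TARGET_HOURS:.0f} h target")
```

An estimate like this is only a starting point. An actual restore test is the only way to know the real number.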
How to Get Started
Identifying DR as being critical to the survival of your business is step one. The next step is to plan and implement an actual DR strategy.
I like to start by identifying key risks to the business. A good high-level list includes:
- Hack or compromise
- Weather or “act of God”
- Transit provider outage (Internet backbone)
- General data loss or corruption
This list is by no means comprehensive or especially detailed, but that’s not the point. The goal is to start thinking about some example scenarios and to start planning a strategy and operational procedure to deal with them.
Once the high-level risks have been identified, the next step is to catalog the assets and data that are critical to your business continuity. Anything running a production workload, or that contains customer data, is an absolute necessity in a DR plan. What often get overlooked are the secondary and tertiary dependencies. Need access to tools or documentation for restarting your core application correctly? Those should also be included.
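One way to catch secondary and tertiary dependencies is to record what each asset depends on and then walk that graph from the critical workloads. The sketch below assumes a hand-maintained inventory; every asset name in it is hypothetical.

```python
from collections import deque

# Hypothetical inventory: each asset maps to the things it needs in order
# to run or to be restarted correctly.
DEPENDS_ON = {
    "orders-app": ["orders-db", "auth-service", "restart-runbook"],
    "orders-db": ["storage-array"],
    "auth-service": ["ldap"],
    "restart-runbook": ["wiki"],
    "wiki": [],
    "ldap": [],
    "storage-array": [],
    "internal-chat": [],
}


def dr_scope(critical, depends_on):
    """Return the critical assets plus everything reachable through dependencies."""
    scope = set()
    queue = deque(critical)
    while queue:
        asset = queue.popleft()
        if asset in scope:
            continue  # already visited; also guards against dependency cycles
        scope.add(asset)
        queue.extend(depends_on.get(asset, []))
    return scope


scope = dr_scope(["orders-app"], DEPENDS_ON)
print(sorted(scope))
```

In this example the wiki ends up in scope only because the restart runbook lives there, which is exactly the kind of dependency that is easy to miss.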
Disaster Recovery Software and Tools
I won’t be able to tell you how to pick the perfect disaster recovery solution for backing up and recovering servers — there’s just too much nuance and unique technical detail in each environment. What I can do is help provide a baseline and some basic criteria for choosing a tool.
Let’s start by looking at some of the well-known solutions available today.
For organizations that are fully deployed on VMware infrastructure, vSphere software provides built-in functionality to enable complete file-based backups of VMware virtual machines. Unfortunately, there is no granularity in terms of choosing specific applications or files; it’s an all-or-nothing operation. The other limitation is that vSphere and VMware will need to be present on the secondary infrastructure, even if you’re restoring to the cloud.
Veeam is a third-party data protection platform that provides users with a variety of options for backing up and restoring critical workloads. Unlike tools such as vSphere, Veeam offers much more freedom in the form of platform and workload agnosticism. Customers could, for instance, back up an Amazon EC2 instance and restore it to a Hyper-V virtual machine, or back up a VMware vSphere VM and restore it to Microsoft Azure.
Another data protection software platform is Veritas. Similar to Veeam, Veritas offers customers a more platform-agnostic approach to backup and restore operations, although historically it has focused more heavily on Microsoft-centric infrastructure and software. It has since launched additional solutions and features that enable customers to back up and restore data from on-premises virtual servers to cloud workloads.
Acronis bills itself as “All-in-one Cyber Protection,” offering data protection and cybersecurity features on its platform. Like Veeam and Veritas, it also offers platform-agnostic backup and restore options. However, its documentation treats a cross-platform restore as a “migration,” so the recovery time objective (RTO) may not meet the operational SLA for a critical DR scenario.
SolarWinds Virtualization Manager
SolarWinds is a name that should be familiar to most IT and system administrators. The company is well known for its suite of monitoring and management products, and its Virtualization Manager gives users the capability to monitor and manage their Hyper-V, VMware, and Nutanix environments. It also provides the capability to manage VM snapshots for VMware and Hyper-V for backup and DR usage.
The Vembu Backup and Disaster Recovery (BDR) suite is another tool with a broader scope of applications, providing backup capabilities for not only VMware and Hyper-V, but also Microsoft Windows, AWS instances, Office 365, Google Workspace, and more. This flexibility provides an “all-in-one” solution, giving administrators and infrastructure engineers the ability to rely on the same tool for on-premises and cloud systems.
Nakivo Backup & Replication provides customers with specific products, like VMware and Hyper-V backup, but it also advertises holistic solutions for disaster recovery, such as “Site Recovery Orchestration” and “AWS Disaster Recovery.” Similar to other providers like Vembu, it offers SaaS workspace and workstation backup as well.
Iperius Backup focuses primarily on VMware and Hyper-V hypervisors, although it does provide a variety of potential backup destinations, including cloud storage such as Google Drive or S3. While this might not enable restoring to a cloud workload, it does at least provide critical geographic distribution in the event of a disaster recovery scenario.
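Whichever tool you choose, geographic distribution is something you can check mechanically. The following is a minimal sketch, assuming you can export a list of servers with their primary site and backup destinations; the record layout, server names, and site names are hypothetical and not taken from any of the products above.

```python
from dataclasses import dataclass, field


@dataclass
class BackupPolicy:
    server: str
    primary_site: str
    backup_sites: list = field(default_factory=list)


def lacks_offsite_copy(policy: BackupPolicy) -> bool:
    """True when no backup copy lives outside the primary site.

    A server with no backups at all is also flagged, because all() over an
    empty list is True.
    """
    return all(site == policy.primary_site for site in policy.backup_sites)


# Hypothetical export of backup policies.
policies = [
    BackupPolicy("app-01", "dc-east", ["dc-east"]),
    BackupPolicy("db-01", "dc-east", ["dc-east", "cloud-bucket-west"]),
    BackupPolicy("legacy-01", "dc-east", []),
]

at_risk = [p.server for p in policies if lacks_offsite_copy(p)]
print("No offsite copy:", at_risk)
```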
How to Implement Virtual Server Backups
In this section, I’ll lay out some options for the actual implementation of virtual server backups. The hypothetical infrastructure in this case will be a deployment of 500 virtual servers. I’ll also list some basic cost data, although this comes with the sizable caveat that it does not include other potential costs like supporting infrastructure, network ingress/egress, and other fees.
For a simple, relatively cheap solution, vSphere isn’t a bad choice. An organization can purchase a VMware vSphere Standard license for $1,268.00, giving them the ability to install the ESXi hypervisor on up to 2,000 hosts, which is more than enough to run 500 VMs and failover infrastructure.
Going all-in on a homogeneous solution means administration and management are simple; system administrators can use a point-and-click interface to perform a full backup and restore, as well as snapshots of VMs. Unfortunately, this homogeneity limits flexibility in choosing other workload destinations.
For a medium-priced solution that offers more flexibility in workload management, Veeam is a favorite choice for backing up virtual workloads. While their pricing is generally quote-based, their pricing calculator indicates a price of about $40,000 for managing 500 VMs. As noted in the earlier section, users have several choices for their backup and restore destinations, enabling cross-platform usage. Veeam also offers several restore options, including the scripted removal of sensitive data and snapshots.
Organizations that have large deployments and are fully invested in something like SolarWinds could use the SolarWinds Virtualization Manager. While exact pricing details require environment-specific quotes, the license for Virtualization Manager (VMAN) starts at $1,749, and large SolarWinds deployments can easily eclipse $100,000 USD.
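To put these figures side by side, the sketch below divides each quoted price by the 500 VMs in the hypothetical deployment. It uses only the list prices mentioned in this section and carries the same caveat: supporting infrastructure, network ingress/egress, and other fees are excluded, and the SolarWinds figure is a starting price rather than a quote for 500 VMs, so treat it as a lower bound.

```python
VM_COUNT = 500  # size of the hypothetical deployment in this section

# License prices as quoted in this section, in USD.
quoted_prices = {
    "VMware vSphere Standard": 1268.00,
    "Veeam (pricing calculator estimate)": 40000.00,
    "SolarWinds VMAN (starting price, lower bound)": 1749.00,
}

# Naive per-VM figure: quoted price spread evenly across all VMs.
per_vm = {name: price / VM_COUNT for name, price in quoted_prices.items()}

for name, cost in sorted(per_vm.items(), key=lambda item: item[1]):
    print(f"{name}: ${cost:,.2f} per VM")
```

The per-VM number is useful for a first comparison only. Real quotes depend on host counts, sockets, and feature tiers.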
Traditional Infrastructure Can Still Utilize the Cloud
Just because an organization might still depend on traditional, on-premises infrastructure and virtualization technology doesn’t mean they can’t take advantage of the cloud for DR. The geographically disparate, highly available nature of most cloud platforms makes them ideal as a backup and restore destination.
The first steps are to identify and categorize the risks. Next, identify what needs to be backed up and what is critical to maintaining business continuity. I would advise any organization to be particularly careful and detailed here; it’s amazing what turns out to be a critical dependency when production goes down.
For organizations that just aren’t sure where to get started, or don’t have the resources to develop a comprehensive disaster recovery strategy, a partner organization with expertise and a proven platform for disaster recovery is going to be your best bet.
Faddom’s application dependency mapping platform creates a complete map of hybrid IT infrastructures, both on premises and in the cloud, in as little as one hour. Start a free trial today!