Data Protection Considerations when Migrating to OpenShift Virtualization
Introduction
As the adoption of containers, and specifically Kubernetes, grows, the migration of applications and computing functionality to container platforms is increasing rapidly. Red Hat OpenShift is one of the fastest-growing Kubernetes distributions. A key reason organizations are adopting OpenShift Virtualization (OSv) is to integrate Virtual Machines (VMs) with containerized workloads.
At World Wide Technology, we're often asked if migrating from vSphere ESXi to OpenShift Virtualization makes technical and business sense. This article will help answer that question by exploring Data Protection considerations for OpenShift Virtualization.
Stateful applications
Just as with the earliest deployments of virtual machines, early Kubernetes deployments were largely limited to nonessential workloads. Kubernetes was originally designed for stateless applications, meaning that backup and recovery (data protection) was unnecessary. Being stateless meant failed or inadvertently deleted containers could simply be spun up again as needed.
With the platform's maturity, we now see very large, complex, stateful applications being deployed on Kubernetes. This requires data protection vendors to respond with methods to safely protect these deployments against failures and malicious activity.
Methods to back up and restore Kubernetes clusters vary widely. Some vendors use the open-source Velero project, while others have developed deeper integration that allows nearly seamless migration between Kubernetes distributions.
OpenShift Virtualization
As stateful applications have been adopted on containers, the ability to encapsulate virtual machines inside containers has become commonplace.
To that end, Red Hat OpenShift Virtualization is one of the fastest-growing virtualization platforms in the market today. The growth is likely fueled by organizations' desire to have a common platform for containers and virtualized workloads. Additionally, the November 2023 acquisition of VMware by Broadcom has prompted a significant portion of VMware's customers to look at alternative virtualization platforms.
To promote migration from other virtualization platforms, Red Hat provides its Migration Toolkit for Virtualization to aid customers moving from VMware and several other hypervisors.
KubeVirt
KubeVirt is an open-source project that makes it possible to run, deploy, and manage virtual machines (VMs) on Kubernetes. According to Red Hat, "KubeVirt is sponsored by the Cloud Native Computing Foundation (CNCF), and it is the open source foundation for Red Hat® OpenShift® Virtualization." The primary factor driving the adoption of KubeVirt is the need of organizations to manage both VMs and containers efficiently in a single deployment.
KubeVirt was originally developed by Red Hat in the 2016-2017 timeframe and eventually became a CNCF project (see https://KubeVirt.io). Unlike open-source KubeVirt, users of OpenShift Virtualization enjoy the benefits of stable releases and product support to assist in troubleshooting.
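To make the idea of a VM as a Kubernetes object more concrete, here is a minimal sketch in Python that builds a KubeVirt VirtualMachine manifest as a plain dictionary. The VM name, namespace, PersistentVolumeClaim name, label and memory size are hypothetical values chosen for illustration and do not come from any real deployment. The field layout follows the kubevirt.io/v1 API as we understand it, so check it against the KubeVirt documentation for your release before relying on it.

```python
import json


def vm_manifest(name, namespace, pvc_name, memory="2Gi"):
    """Return a minimal KubeVirt VirtualMachine manifest as a plain dict.

    All argument values are illustrative assumptions, not defaults
    taken from Red Hat or KubeVirt documentation.
    """
    return {
        "apiVersion": "kubevirt.io/v1",
        "kind": "VirtualMachine",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Keep the VM running; KubeVirt restarts it if it stops.
            "runStrategy": "Always",
            "template": {
                # Hypothetical label, useful later for selecting this VM.
                "metadata": {"labels": {"kubevirt.io/vm": name}},
                "spec": {
                    "domain": {
                        "devices": {
                            "disks": [
                                {"name": "rootdisk", "disk": {"bus": "virtio"}}
                            ]
                        },
                        "resources": {"requests": {"memory": memory}},
                    },
                    # The VM's state lives in this PersistentVolumeClaim,
                    # which is what a backup ultimately has to capture.
                    "volumes": [
                        {
                            "name": "rootdisk",
                            "persistentVolumeClaim": {"claimName": pvc_name},
                        }
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = vm_manifest("demo-vm", "demo-vms", "demo-vm-rootdisk")
    print(json.dumps(manifest, indent=2))
```

The point for data protection is that the VM is described by a Kubernetes resource plus one or more persistent volumes, so protecting it means capturing both the object definition and the volume data.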
OpenShift Virtualization Backups
As the adoption of OpenShift Virtualization becomes mainstream in data centers, essential functions such as backup and recovery of the virtual machines must be examined. This blog will take a brief look at the various methods available to protect VMs on OpenShift Virtualization. For a more in-depth examination of VM protection, whether for basic data protection, disaster recovery or cyber resilience, you can engage the WWT Data Protection and Cyber Resilience team.
OpenShift API for Data Protection
Red Hat provides the OpenShift API for Data Protection (OADP) to enable minimal backup and restore functionality at the OpenShift container level. OADP lets users create backup and restore Custom Resources (CRs) through which resources such as VMs are backed up to, and restored from, S3-compatible object storage.
OADP provides hooks in the backup CR that enable pre- and post-backup commands to be run on the VMs. This allows for crash-consistent backups of VMs. For more detail, you can refer to this document from Red Hat.
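As a rough illustration of the backup CR and its hooks, the sketch below builds Velero-style Backup and Restore manifests in Python. OADP is built on Velero, so the velero.io/v1 field names are used here. Everything else is an assumption made for the example: the openshift-adp namespace, the resource names, the pod label, the "compute" container name, the retention period, and the virt-freezer freeze and unfreeze commands (modeled on KubeVirt's guest freeze helper). Verify each of these against Red Hat's OADP documentation before use.

```python
import json

OADP_NAMESPACE = "openshift-adp"  # assumed install namespace for OADP


def freezer_command(action, vm_name, vm_namespace):
    """Build the assumed guest freeze/unfreeze command for a VM."""
    return ["/usr/bin/virt-freezer", "--" + action,
            "--name", vm_name, "--namespace", vm_namespace]


def exec_hook(command, timeout="60s"):
    """Wrap a command as a Velero exec hook run inside the VM's pod."""
    return {"exec": {"container": "compute",  # assumed container name
                     "command": command,
                     "onError": "Fail",
                     "timeout": timeout}}


def backup_manifest(name, vm_name, vm_namespace, storage_location="default"):
    """Return a Backup CR with pre- and post-backup hooks for one VM."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Backup",
        "metadata": {"name": name, "namespace": OADP_NAMESPACE},
        "spec": {
            "includedNamespaces": [vm_namespace],
            "storageLocation": storage_location,  # S3-compatible target
            "ttl": "720h0m0s",  # hypothetical 30-day retention
            "hooks": {"resources": [{
                "name": "quiesce-" + vm_name,
                "includedNamespaces": [vm_namespace],
                # Hypothetical label identifying the VM's pod.
                "labelSelector": {"matchLabels": {"kubevirt.io/vm": vm_name}},
                "pre": [exec_hook(freezer_command("freeze", vm_name, vm_namespace))],
                "post": [exec_hook(freezer_command("unfreeze", vm_name, vm_namespace))],
            }]},
        },
    }


def restore_manifest(name, backup_name):
    """Return a Restore CR that restores everything in the named backup."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Restore",
        "metadata": {"name": name, "namespace": OADP_NAMESPACE},
        "spec": {"backupName": backup_name},
    }


if __name__ == "__main__":
    backup = backup_manifest("demo-vm-backup", "demo-vm", "demo-vms")
    restore = restore_manifest("demo-vm-restore", "demo-vm-backup")
    print(json.dumps(backup, indent=2))
    print(json.dumps(restore, indent=2))
```

The printed JSON could be saved to a file and applied with the cluster's usual tooling. Building the manifests in code rather than by hand makes it easier to generate one backup definition per VM or per namespace.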
Original equipment manufacturer (OEM) backup
The top data protection Original Equipment Manufacturers (OEMs), such as Cohesity/Veritas, Commvault, Dell, Rubrik and Veeam, have deep and extensive support for protecting VMs running in vSphere ESXi, Hyper-V and AHV, to name a few.
Evaluating the features and functionality of these OEMs frequently becomes a large compare-and-contrast checklist that changes constantly as the OEMs advance their capabilities for protecting VMs on the above platforms.
Unfortunately, very few of the capabilities we have come to expect from a virtualization data protection product are available for VMs running in OpenShift Virtualization. The reality is that most OEMs protect OSv VMs by backing up the underlying containers. While this method provides good protection for the VMs, it lacks enhanced features and functionality that backup administrators enjoy with other VM deployments (such as file-level recovery). Furthermore, capabilities to enable disaster recovery and cyber resiliency are nearly nonexistent, requiring extensive architecture work to achieve functionality that is routinely available with other VM platforms.
It is also worth acknowledging that the need to protect OSv VMs now requires backup administrators to have fundamental knowledge of containers and Kubernetes. While this may not be an obvious concern, there is a steep learning curve here that must be considered and factored into the deployment timeline for data protection tools.
The good news is that OEMs have recognized the need to protect OSv VMs and are investing resources in developing robust backup and recovery solutions. All the OEMs listed above have partnerships with Red Hat to help solve backup and recovery for OSv VMs.
Agent-based backup
With the enablement technology built into modern hypervisors and the explosive sprawl of virtual machines, backups are typically performed on virtual machine images (Image Based Backup). The entire VM is backed up as just a few files, which enables very fast backups and restores. The protected VM images themselves can be examined using a variety of methods to perform recovery of databases, applications or even single files and folders (more on that below) without the need to restore the entire VM.
Before virtualization became standard, every server ran on its own physical machine and was protected using a backup agent. The backup agent is a host-based application responsible for interacting with the server and sending files, databases and configuration data to the backup software.
Today, backup agents have mostly been relegated to large physical servers or databases where backup administrators require more granular control over the backup and recovery process than what is available from image backups.
Backup agents are also used where there isn't any other reliable method to protect the server. This is particularly true for protecting OpenShift VMs today. Since most of the mainstream backup and recovery OEMs don't have mature products for OpenShift Virtualization, the most reliable method for backup is still using backup agents.
Single file restore
Whenever our customers evaluate data protection vendors, we have deep discussions on how disaster recovery and cyber resiliency are accomplished at scale in the event of an outage or ransomware attack. That functionality is critical and needs to be thoroughly evaluated when selecting a data protection vendor.
However, from a backup administrator's perspective, the single most common day-to-day activity is still the single-file or folder-level restore. While a single-file restore is and should be routine, it has proven to be complicated and difficult for OpenShift VMs protected by most data protection vendors. Simply put, most OEMs don't provide that functionality today. Most have it on their roadmaps for late in calendar year 2025.
Focusing on this deficiency highlights the fact that selecting a data protection platform can have a dramatic effect on routine restore times and even employee satisfaction for the backup administrators responsible for its care and feeding. Emphasizing protection from a disaster or cyber event needs to be balanced with the day-to-day activities most end users need.
Conclusion
Red Hat's OpenShift Virtualization platform is a viable alternative to existing virtualization platforms such as vSphere ESXi, Hyper-V and Acropolis Hypervisor (AHV). However, the lack of mature data protection tools for backup and recovery, disaster recovery, cyber resilience and even single-file restore should be carefully evaluated before it is used as a platform for mission-critical workloads.
As adoption of OSv increases, the major data protection OEMs will respond with tools and capabilities for protecting OSv VMs. Expect to see improved capabilities by the end of calendar year 2025.