Hybrid environments (cloud, virtual and physical assets) are now common in businesses. All assets, regardless of location, should be protected. A backup solution for your cloud assets is essential to mitigating the threats posed by bad actors. A quality backup solution should offer:

  • Automation
  • Scalability
  • Intuitive management
  • Comprehensive workload support (cover as many assets as possible)
  • Ransomware protection

Leveraging our partnership with Veritas, we tested two of their cloud backup and recovery solutions: Alta Data Protection powered by Cloud Scale Technology and Alta Recovery Vault. Our testing showed that both solutions provide flexibility, performance and security.

Alta Data Protection (ADP) powered by Cloud Scale Technology Overview

Cloud Scale Technology is Veritas' answer to protecting cloud assets at scale. Veritas has made Cloud Scale available on two of the most popular cloud providers, Amazon Web Services (AWS) and Azure. We used AWS for our testing, so this article covers the AWS configuration. Cloud Scale Technology changes the way NetBackup is deployed by using AWS' Elastic Kubernetes Service (EKS) as its platform. By adopting containerization and Kubernetes orchestration, ADP runs as microservices on Elastic Compute Cloud (EC2) instances, with each microservice performing a specific function within the backup and restore processes.

By using Kubernetes as the platform technology, ADP powered by Cloud Scale Technology provides benefits such as:

  • Scalability
  • Automation
  • Self-healing
  • Performance
  • Cost savings

Cloud Scale Technology automatically deploys more nodes as demand increases and removes them as demand recedes. This elasticity reduces costs because it removes the need to build traditional servers large enough to handle rarely reached peak usage. Additionally, being able to spin up instances when needed helps mitigate the performance bottleneck that is traditionally experienced with physical or virtual systems. Rather than requiring you to monitor backup/restore times closely, Cloud Scale automatically provides the resources needed to maintain a high level of performance when demand fluctuates.

Alta Data Protection uses Kubernetes' built-in self-healing functionality to detect node and pod failures automatically and to try to repair or restart the failed pod or node. If a failure impacts a backup/restore job, ADP switches to another healthy pod to perform the work. This minimizes the downtime that would otherwise be experienced when a system resource fails.

Alta Data Protection Deployment

AWS Networking and Permissions

* Networking and IAM roles are the most detailed steps in the Cloud Scale deployment. 

For the deployment, we created an AWS account to host the VPC that would contain all of the resources for the Cloud Scale deployment. Before we kicked off the deployment, we validated the following prerequisites:

  • EKS Admin-level access
  • Identity and Access Management (IAM) Role with required permissions for the installation host
  • Route 53 DNS setup

AWS Networking

To start the deployment, we needed to create three internal subnets in the VPC to allow communication between EC2 instances.  Required subnets: 

  1. /22 for EKS Cluster
  2. /26 for Load-Balancer
  3. /28 for EKS installation requirement located in a different Availability Zone (AZ) from the /22

When creating the EKS-install subnet, make sure it's in a different availability zone than the EKS-cluster subnet (/22) as required by AWS. 
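The three subnets can also be created from the AWS CLI. The following is a minimal sketch, not the exact commands we ran: the VPC ID, CIDR blocks, and Availability Zones are hypothetical placeholders, and only the prefix lengths and the different-AZ requirement come from the steps above.

```shell
# Hypothetical values: replace with your own VPC ID, CIDR blocks, and AZs.
VPC_ID="vpc-0123456789abcdef0"

# create_subnet CIDR AZ: create one subnet in the VPC.
create_subnet() {
  aws ec2 create-subnet \
    --vpc-id "$VPC_ID" \
    --cidr-block "$1" \
    --availability-zone "$2"
}

# Create the three subnets described above.
create_all_subnets() {
  create_subnet 172.31.4.0/22  us-east-1a &&   # EKS cluster
  create_subnet 172.31.8.0/26  us-east-1a &&   # load balancer
  create_subnet 172.31.8.64/28 us-east-1b      # EKS install, different AZ from the /22
}
```

Running create_all_subnets issues the three calls in order and stops at the first failure.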

IP Addresses may vary from environment to environment

Next, we created two private hosted zones within Route 53 and associated them with our VPC. One zone was for forward lookups and the other was for reverse lookups. These zones provide DNS resolution within the VPC. In the forward lookup zone, we added A records, with addresses in the load balancer subnet, for the following systems:

  • Primary Server
  • Media Server Deduplication Pool (MSDP)
  • Media Server
  • Snapshot Manager

A customer's environment can have multiple MSDP and media servers. However, for the tests we performed, we opted for one of each. Amazon reserves the first four IP addresses and the last IP address in each subnet by default. We kept this in mind when we chose the IP addresses for the A records. Also, any A record created should use an address within the range of the load balancer subnet (/26). In our example, we selected 172.31.8.6 as the IP for the primary server.

We repeated the same steps for the other systems.
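To double-check an address before creating an A record, the reserved-address rule can be expressed as a small shell helper. This is an illustrative sketch: the function names are our own, and the subnet 172.31.8.0/26 is an assumed load balancer range that happens to contain the 172.31.8.6 address used above.

```shell
# ip_to_int A.B.C.D: print a dotted-quad IPv4 address as an integer.
# The body runs in a subshell so the IFS change does not leak to the caller.
ip_to_int() (
  IFS=.
  set -f          # disable globbing while splitting
  set -- $1       # split the address on dots into $1..$4
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# is_assignable IP NETWORK PREFIX: succeed if IP lies inside NETWORK/PREFIX
# and is not one of the five addresses AWS reserves (first four and last).
is_assignable() (
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "$2")
  size=$(( 1 << (32 - $3) ))
  offset=$(( ip - net ))
  [ "$offset" -ge 4 ] && [ "$offset" -le $(( size - 2 )) ]
)
```

For example, is_assignable 172.31.8.6 172.31.8.0 26 succeeds, while the same check for 172.31.8.3 or 172.31.8.63 fails because those addresses are reserved.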

In the reverse lookup zone, we created the PTR records. These map IP addresses back to hostnames so that reverse DNS lookups resolve correctly.

We repeated the same steps for the rest of the systems.
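A PTR record name is the IP address with its octets reversed, followed by in-addr.arpa. As a small sketch (the helper name is our own), the name for any of the A record addresses can be derived like this:

```shell
# ptr_name A.B.C.D: print the reverse-lookup (PTR) record name for an IPv4 address.
# The body runs in a subshell so the IFS change does not leak to the caller.
ptr_name() (
  IFS=.
  set -f          # disable globbing while splitting
  set -- $1       # split the address on dots into $1..$4
  echo "$4.$3.$2.$1.in-addr.arpa"
)
```

For the primary server address used earlier, ptr_name 172.31.8.6 prints 6.8.31.172.in-addr.arpa.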

IAM Roles

Next, we needed to create an IAM role to deploy and manage the cluster. For testing, we named the role Veritas_TIS_EKS_to_EC2 and attached it to the installation host. Once the role was created, we assigned the following AWS managed policies:

  • AmazonEKSWorkerNodePolicy
  • AmazonEC2ContainerRegistryFullAccess
  • AmazonEKS_CNI_Policy
  • AmazonEKSServicePolicy
  • AmazonEKSClusterPolicy
  • AmazonEKSVPCResourceController
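If you prefer the AWS CLI to the console, the managed policies above can be attached in a loop. This is a sketch under the assumption that each policy lives under the standard arn:aws:iam::aws:policy/ prefix; the role name is the one from our test.

```shell
ROLE_NAME="Veritas_TIS_EKS_to_EC2"   # role name used in our test

# Attach each AWS managed policy from the list above to the role.
attach_managed_policies() {
  for policy in \
    AmazonEKSWorkerNodePolicy \
    AmazonEC2ContainerRegistryFullAccess \
    AmazonEKS_CNI_Policy \
    AmazonEKSServicePolicy \
    AmazonEKSClusterPolicy \
    AmazonEKSVPCResourceController
  do
    aws iam attach-role-policy \
      --role-name "$ROLE_NAME" \
      --policy-arn "arn:aws:iam::aws:policy/$policy" || return 1
  done
}
```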

 

Additionally, we had to create customer-managed permission policies for additional permissions. We engaged Veritas specialists to create the JSONs for import as policies (AmazonEKS_EFS_CSI_Driver_Policy, MSDP-C_WORM, NetBackup_Cloud_Scale_Additional_EC2_EFS_EKS_IAM_Permissions, and SnapshotManager). We recommend checking the Veritas Support Portal to get the required JSONs for the policies. They are required to perform necessary functions within Cloud Scale. Once assigned, the managed and custom policies all appear on the IAM role.

The next step was to create an instance profile and add the IAM role to it. We created this IAM role for the EC2 service so that it could be attached to the installation host. When you create an IAM role for the EC2 service in the AWS console, the console automatically creates the instance profile. If you use the AWS command line, you will need to create the instance profile yourself.
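For the command-line route, the sequence looks roughly like the sketch below. The profile name and instance ID are hypothetical placeholders; only the role name comes from our test.

```shell
ROLE_NAME="Veritas_TIS_EKS_to_EC2"      # role name used in our test
PROFILE_NAME="$ROLE_NAME"               # hypothetical: profile named after the role
INSTANCE_ID="i-0123456789abcdef0"       # hypothetical installation host ID

# Create the instance profile, add the role to it, and attach it to the host.
create_and_attach_profile() {
  aws iam create-instance-profile \
    --instance-profile-name "$PROFILE_NAME" &&
  aws iam add-role-to-instance-profile \
    --instance-profile-name "$PROFILE_NAME" \
    --role-name "$ROLE_NAME" &&
  aws ec2 associate-iam-instance-profile \
    --instance-id "$INSTANCE_ID" \
    --iam-instance-profile "Name=$PROFILE_NAME"
}
```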

Installation Host

Once the AWS prerequisites were completed and validated, the rest of the deployment was done via Terraform scripts. To run them, we needed to create a Terraform host with the following:

  • Ubuntu 22.04 on t2.large with a 100 GB disk attached
  • SSH key pair to facilitate SSH communication
  • Attached IAM Role with assigned permissions (created in the previous step)
  • Deployed in the EKS cluster subnet (/22)

After launching, we SSH'd into the host to install some required programs to manage the Alta Data Protection environment.  Installed programs included (consult Veritas documentation for proper versions):

  • Terraform
  • Docker
  • kubectl
  • Helm
  • AWS CLI

Additionally, we made sure to update the Terraform host packages so that the whole system was up to date using sudo apt update && sudo apt upgrade -y.
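Before moving on, it is worth confirming that every tool is actually on the PATH. The helper below is a small sketch of our own; it only checks that each command exists and does not verify versions.

```shell
# missing_tools NAME...: print each named command that is not found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

Running missing_tools terraform docker kubectl helm aws prints nothing when all five tools are installed, and otherwise lists the ones that still need attention.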

Terraform Files and Deployment

With the host running in AWS, we needed to download the tar file for the deployment. We logged into the Veritas Support Portal, searched for Veritas 10.2 Alta Data Protection powered by Cloud Scale Technology and found the file VRTSk8s-netbackup-10.2-0065.tar (the version may differ by the time you read this article). We downloaded it locally and then used the scp command to upload the file to the Terraform host. For simplicity, we created a cloudscale directory in /home/ubuntu to hold all of the files.

We engaged a Veritas specialist to get the required zip file for ADP, aws_cloudscale_10.2_v2.zip. This file contains all of the Terraform scripts needed to deploy the environment. We moved the downloaded zip file into the cloudscale directory and unzipped it. This allowed us to cd into the aws/base and aws/deployment directories to adjust the Terraform variable files. In both directories, we copied the sample.tfvars file and named the copies cloudscale-aws-base-10-2.tfvars and cloudscale-aws-deployment-10-2.tfvars respectively. Each file contained specific variables such as IP addresses (the IPs that were set in the AWS networking steps), IDs, usernames, passwords, etc. For this exercise, we set the node_group_scaling settings to one or two so that we could showcase the scalability at a later step in the testing.

Once the tfvars files were complete, we changed back into the base directory to initialize Terraform for deployment. We ran the following commands in this order:

  • terraform init
  • terraform plan --var-file cloudscale-aws-base-10-2.tfvars
  • terraform apply --var-file cloudscale-aws-base-10-2.tfvars

Each step performs specific actions: 

  • terraform init – downloads the required modules and providers and initializes the backend.
  • terraform plan – performs a dry run of the deployment and reports what would be provisioned if you ran an "apply" command.
  • terraform apply – provisions the resources.

Since there were no errors, we navigated into the aws/addons directory and ran the following:

  • terraform init
  • terraform plan
  • terraform apply

The terraform plan and apply commands here do not require a tfvars file. With no errors, we changed to the aws/deployment directory. We performed the same steps as in aws/base, except that we replaced the file name with cloudscale-aws-deployment-10-2.tfvars.
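The three stages can be strung together in one script. This is a sketch rather than the exact procedure we followed: the CLOUDSCALE_DIR path assumes the zip was unpacked directly under /home/ubuntu/cloudscale, and the function names are our own. Each stage stops at the first failing command.

```shell
# Assumed location of the unzipped Terraform scripts.
CLOUDSCALE_DIR="/home/ubuntu/cloudscale"

# run_stage DIR [TFVARS]: init, plan, and apply one Terraform directory.
# The body runs in a subshell so the cd does not affect the caller.
run_stage() (
  cd "$CLOUDSCALE_DIR/aws/$1" || exit 1
  terraform init || exit 1
  if [ -n "${2:-}" ]; then
    terraform plan  -var-file="$2" || exit 1
    terraform apply -var-file="$2"
  else
    # The addons stage does not take a tfvars file.
    terraform plan || exit 1
    terraform apply
  fi
)

# Run base, then addons, then deployment, in the order described above.
deploy_cloudscale() {
  run_stage base       cloudscale-aws-base-10-2.tfvars &&
  run_stage addons &&
  run_stage deployment cloudscale-aws-deployment-10-2.tfvars
}
```

Calling deploy_cloudscale runs the whole sequence. Because terraform apply still prompts for confirmation, each stage can be reviewed before anything is provisioned.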

We were able to run kubectl get pods -A to see the resources being created.

Once the cldscale-wwt1-primary-0 pod is in the Running state, you can access the cluster from a system within the same subnet.
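Rather than re-running kubectl get pods by hand, you can block until the primary pod reports Ready. This is a sketch: the pod name is the one from our deployment, but the namespace is a hypothetical placeholder (check the kubectl get pods -A output for the real one). Note that it waits on the Ready condition, which is a slightly stricter signal than the Running phase.

```shell
POD_NAME="cldscale-wwt1-primary-0"   # pod name from our deployment
NAMESPACE="netbackup"                # hypothetical: use the namespace shown by `kubectl get pods -A`

# Block until the primary pod is Ready, or give up after 30 minutes.
wait_for_primary() {
  kubectl wait --for=condition=Ready "pod/$POD_NAME" \
    --namespace "$NAMESPACE" --timeout=30m
}
```

kubectl wait returns an error if the pod has not been created yet, so run it only after the pod appears in the pod list.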

Accessing the NetBackup WebUI

We created and launched a Windows host within the eks-cluster subnet so that we could access the GUI of the Cloud Scale deployment. We opened a web browser and navigated to the primary server web UI address. For this example, we used the primary server's FQDN.

Once we got to the login page, we entered the credentials that were specified in the cloudscale-aws-deployment-10-2.tfvars file.

Once logged in, we were able to see the Cloud Scale dashboard, which looks exactly like that of an on-prem Veritas Flex Appliance. The familiar layout allowed us to navigate the GUI much faster than if it were a new design.

Overall, the deployment didn't take very long.  The longest part was in the AWS/ADP Deployment section as this is where the docker images are downloaded.  This step was completed in about an hour.  The rest of the deployment steps were completed very quickly. 

Cloud testing

EC2 Backup

Veritas Alta Data Protection provides organizations with a more cloud-friendly approach to protecting workloads hosted in the cloud. Customers can back up cloud resources using on-prem systems. However, because of how cloud service providers bill for outbound traffic, customers will incur data egress charges. To avoid these unnecessary fees and to have a more performant data protection schema, you should deploy ADP in your cloud account/subscription.

Another great feature of Alta Data Protection is that because it was built with the cloud in mind, upon installation and authentication it will perform auto-discovery of the existing workloads and resources. This makes the identification of EC2 instances, applications, and/or PaaS databases for protection more seamless. Intelligent Policies can be created to utilize tagging to automatically add new workloads to Protection Plans (more to come on this feature further down).    

We chose to back up the Terraform host since it was already deployed and running. We created a Protection Plan, which is how Veritas defines the backup frequency, storage target, and retention for each system. After we applied the Plan to the host, the initial backup kicked off automatically without intervention. The process was very straightforward and the automated steps made it much easier to configure. The initial backup completed in a few minutes and the data did not egress AWS, which limited charges to the cost of running the systems for Cloud Scale.

EC2 Restores

EC2 Instance Rollback

After backup completion, we had a restore point to proceed with recovery testing. The first, and simplest, test was an EC2 Instance Rollback. All we did was navigate to the recovery point and choose Instance Rollback, and 1 minute later, the EC2 instance was recovered. A 100 GB EC2 instance was recovered in less time than it took to back up the instance itself. This process took a total of 8 clicks after GUI login.

EC2 Intelligent Policies

Earlier in the article, we alluded to automatic protection of cloud resources. To test this, we leveraged the Veritas functionality of Intelligent Policies. Veritas designed this feature to apply a Protection Plan to resources automatically when new workloads meet certain criteria. An extensive set of criteria is available. For testing purposes, we decided on using the EKS cluster name by selecting the tag:eks:cluster-name.

We chose tag:eks:cluster-name because multiple systems shared that tag. When we chose to apply the Intelligent Policy to the resources, it showed multiple applications rather than a single instance.
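To preview which instances share that tag before building the condition, you can query EC2 directly. This is a sketch using the AWS CLI, not part of the Veritas workflow; the cluster name is a hypothetical placeholder.

```shell
CLUSTER_NAME="my-cloudscale-cluster"   # hypothetical EKS cluster name

# List the IDs of running instances that carry the eks:cluster-name tag.
list_tagged_instances() {
  aws ec2 describe-instances \
    --filters "Name=tag:eks:cluster-name,Values=$CLUSTER_NAME" \
              "Name=instance-state-name,Values=running" \
    --query 'Reservations[].Instances[].InstanceId' \
    --output text
}
```

The output is a whitespace-separated list of instance IDs, which should line up with the resources the Intelligent Policy condition matches.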

We created an Intelligent Group so that we could provide the necessary details for applying the Protection Plan. A major benefit of Intelligent Groups is that you can configure multiple groups for a specific AWS account, or you can configure multiple Intelligent Groups across multiple AWS accounts. The condition logic in the configuration is very similar to the conditional logic found in most other tools. This familiarity allowed us to configure conditions much faster. Although not included in this article, we were able to test some compound conditions for adding protection.

After specifying the conditions, we selected the Cloud Protection Plan that was previously configured and then enabled the Intelligent Group.  Prior to the configuration, this is what the EC2 resources looked like:

Within a minute of configuring the Intelligent Policies, the EC2 instances were automatically protected by the Cloud Protection Plan. Backups began automatically without any manual intervention. This feature makes a cloud admin's job much easier, saving time on applying protection to all of the systems within an account. It also reduces the risk of workloads being overlooked among the potentially hundreds of VMs in an account and never backed up. And if other groups in your organization create new workloads, the use of tags can ensure that those new EC2 instances are protected as well.

EC2 Restore and Restore to a new EC2 Instance

To restore the EC2 instance directly, we followed a procedure similar to the Instance Rollback. The only difference was that we chose Restore Virtual Machine. The rest of the restore steps were straightforward. To address a Test/Dev scenario, Veritas Cloud Scale offers the ability to restore an EC2 instance to a new system. We followed the same restore procedure, except that we renamed the EC2 instance. We appended RESTORE-TEST to the end of the name to differentiate the new instance in an obvious manner.

One of the benefits of restores with Veritas is that they perform a pre-recovery check to ensure that the operation will be successful.  This made it easier during testing to identify any potential issues we could face when performing the restore.

The restore to the new EC2 instance completed in just 44 seconds. This is a 20-second improvement over the Instance Rollback.

Not only was the process very simple, but the full restore also completed faster than the Instance Rollback. In addition to a full EC2 instance restore, Veritas Cloud Scale offers the ability to restore specific disks/volumes. This feature is a benefit because you can restore to an existing machine AND you do not have to restore the entire system to access a subset of the data. We were able to restore a 100 GB volume to an existing machine in 35 seconds.

AWS RDS Backup

Alta Data Protection also addresses backing up cloud Platform-as-a-Service (PaaS) workloads. For our testing, we backed up an AWS RDS database that we created in the AWS account. The database was also auto-discovered when the AWS account was connected to our ADP environment. ADP uses a different methodology to protect PaaS databases like RDS DBs. To accomplish this, we had to create a NetBackup Universal Share to store and deduplicate the data. Some additional IAM Role permissions also needed to be granted to ADP so it could access the resources necessary to protect the RDS instance. Consult Veritas documentation for more details regarding necessary permissions/credentials for protecting DBPaaS workloads. Once that was completed, the backup steps for the RDS database were the same as for an EC2 instance. The RDS database backup was completed in 5 minutes and 13 seconds.

AWS RDS Restore

The restore procedure was just as simple.  The difference is that we restored to a new database to demonstrate a Test/Dev scenario. We selected the Test Database that was created and added the restore_ prefix parameter to differentiate the new database.

Similar to what was seen in the previous tests, the restore took much less time than the backup did. The database restore completed in 1 minute and 51 seconds.

We verified the restored database by using SQL Server Management Studio and searching for the restore_Test_Database instance.

Cloud Backup at Scale

To demonstrate how Alta Data Protection adjusts resources based on demand, we created 15 additional EC2 instances so that they could be backed up concurrently. This simulates a period of higher demand, similar to when backups start after normal business hours.

The new systems were automatically discovered after being created, so importing them into Cloud Scale was not needed. We triggered all 15 backups at the same time and monitored the pod count via the CLI.
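One simple way to watch the pod count from the CLI is sketched below. It is an illustration rather than the exact command we used, and it counts running pods across all namespaces instead of filtering for data mover pods specifically.

```shell
# count_running_pods: print the number of pods, across all namespaces,
# that are currently in the Running phase.
count_running_pods() {
  kubectl get pods --all-namespaces --no-headers \
    --field-selector=status.phase=Running | grep -c . || true
}
```

Calling count_running_pods in a loop with a short sleep, before, during, and after the backup window, shows the count rise and fall as the cluster scales.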

Figure 1: Pre-job Resources

During the backup, we saw ADP automatically create more data mover resources to account for the increased demand:

Figure 2: Just in Time Resources During Backup

As the backup job completed and demand subsided, the ADP cluster automatically scaled down and shut down idle compute resources. Comparing Figures 1, 2, and 3, you will see fewer resources during the ebb and more resources automatically being deployed during the higher flow. This is the power of Alta Data Protection powered by Cloud Scale Technology: the elasticity to meet your SLAs and reduce overall cloud compute cost overhead without the need for manual intervention.

Figure 3: Post job resources

Alta Recovery Vault 

Veritas now offers a Storage-as-a-Service solution called Alta Recovery Vault (ARV). ARV is a Veritas-managed storage solution that presents a logically air-gapped object storage target for your backups. ARV is an MSDP-C and Image Sharing supported storage target for both on-premises and in-cloud NetBackup deployments.

The combination of Alta Recovery Vault and Alta Data Protection provides key values such as: 

  • MSDP-C Support
  • Powerful deduplication engine reduces your overall storage footprint.
  • Accelerator technology that reduces your time to complete the backup.
  • Image Sharing for seamless disaster recovery that includes capabilities like image conversion to AWS EC2 and Azure VHD.
  • Immutable and Indelible by default
  • Air-Gapped storage for ransomware protection

As part of this proof of concept, we configured Alta Recovery Vault as a storage target.  

Alta Recovery Vault was designed to securely store data in the cloud. One of the biggest benefits of Alta Recovery Vault is immutability: when it is configured in COMPLIANCE mode, neither the data nor its retention can be edited. For our testing purposes, we used GOVERNANCE mode, which allows only a specific user with access keys to adjust retention on the data. Veritas recommended COMPLIANCE mode for production usage and GOVERNANCE mode for testing.

Since this is a Veritas-managed solution, we engaged Veritas Support to get an access key and secret key to configure the Vault as a storage target for the Alta Data Protection instance. We needed to configure a bucket and the retention for the data stored in the vault, which we accomplished by running the following command:

./msdpcldutil create -b <bucket provided by Veritas> -v <Volume Name> --mode GOVERNANCE --min 1D --max 30D -l 2023-10-31 --storageclass GLACIER_IR

Afterwards, we added the Alta Recovery Vault as a Disk Pool and Storage Unit so that we could configure a Protection Plan that would write backups to the vault. 

Configuring the Protection Plan was the same as before except for selecting the Recovery Vault bucket as the Backup Storage.

Granular File Recovery

One additional feature we tested was the granular file recovery option. By enabling this, we were able to browse the files and recover them individually without having to recover an entire VM/volume. We ran a backup of an EC2 instance and saw that it completed in 18 minutes. The backup was secure and we were able to begin our recovery tests.

To recover, we did a whole EC2 Instance recovery to compare the performance of the backup to the restore. In true Veritas fashion, the recovery completed faster than the initial backup.  The restore of an entire EC2 instance completed in 11 minutes as compared to the 18 minutes for the backup.  Both times are good when it comes to inter-cloud account data transfers.

Summary

Through our Cloud Solution testing we have identified Veritas Alta Data Protection powered by Cloud Scale Technology as an excellent solution for protection of both IaaS and PaaS workloads in AWS.  We demonstrated that Alta Data Protection has:

  • Automation – many steps did not require human intervention.
  • Elasticity – by automatically scaling out and in depending on workload demands.
  • Intuitive management – by employing Intelligent Policies, the Protection Plans were automatically protecting all the relevant and intended workloads.
  • Comprehensive workload support (cover as many assets as possible) – Alta Data Protection auto-discovered all assets in the AWS account and had backup options for both the EC2 instances and the PaaS databases.
  • Secure – By configuring Alta Recovery Vault, the backups were sent to an immutable, air-gapped storage target providing additional ransomware protection.

We evaluated the efficacy of both the Alta Data Protection and Alta Recovery Vault solutions by testing some of the most common workload scenarios in customers' cloud environments and can confidently recommend ADP to protect your cloud assets regardless of environment size.