In this post, we will look at who these players are, how we got here, where this integration is going and why it is important.

Who are the players?

A long-term leader in Gartner's Data Protection Magic Quadrant, Commvault is a software-based data protection solution. It is well regarded for its broad client and hypervisor coverage: Commvault has an agent for almost any backup workload or hypervisor a customer might need. It is also storage agnostic, supporting a wide variety of traditional block-based storage, Network Attached Storage (NAS), and modern object storage (S3, blob, etc.) targets.

Architecturally, Commvault follows the traditional three-tier backup model consisting of a control plane (CommServe), data movers (MediaAgents), and storage targets (tape, disk, NAS, object), similar to Veritas NetBackup or Dell NetWorker.

[Figure: CommCell logical architecture]

Commvault has long advocated its own inline deduplication for backup, which often requires hefty hardware to run and manage the associated deduplication databases. Deduplication is siloed per MediaAgent, which can result in less efficient deduplication. In extreme cases, a deduplication pool may become corrupted and need to be sealed and placed in read-only status, with backups moving to a new pool that requires a new deduplication database. Overall, deduplication in a large Commvault environment consumes a noticeable amount of hardware and is another component that customers need to manage.
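To make the cost of siloed deduplication concrete, here is a minimal sketch. It is a toy model, not Commvault's implementation: the workload, the `fingerprint` helper and the `stored_segments` function are all hypothetical. Two clients share the same operating system blocks; when each is backed up to a separate pool, the shared blocks are stored twice.

```python
import hashlib

def fingerprint(segment: bytes) -> str:
    """Identify a segment by its SHA-256 digest (illustrative choice)."""
    return hashlib.sha256(segment).hexdigest()

def stored_segments(pools):
    """Count segments kept when each pool deduplicates independently."""
    return sum(len({fingerprint(s) for s in pool}) for pool in pools)

# Hypothetical workload: both clients carry the same 100 OS blocks
# plus 20 blocks of their own data.
os_image = [f"os-block-{i}".encode() for i in range(100)]
client_a = os_image + [f"a-data-{i}".encode() for i in range(20)]
client_b = os_image + [f"b-data-{i}".encode() for i in range(20)]

siloed = stored_segments([client_a, client_b])   # one pool per data mover
shared = stored_segments([client_a + client_b])  # one pool for everything

print(f"siloed pools store {siloed} segments, a shared pool stores {shared}")
```

In this toy case the siloed layout stores 240 segments where a single pool stores 140; the gap grows with the amount of data that is common across pools.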

Dell PowerProtect DD Series Appliances (PPDD) is the current incarnation of the industry-changing Data Domain (DD) protection storage appliance. Initially brought to market as a backup target only, the PPDD product line has evolved to integrate a large ecosystem of data protection products from multiple vendors.

The protection storage conversation tends to focus on deduplication efficiency because this is an easy-to-measure statistic and has a direct link to cost per terabyte, but DD did not become an industry leader solely on deduplication performance.

Dell Data Integrity Architecture (DIA) is Dell's name for the underlying resiliency architecture which focuses on ensuring the highest levels of data integrity and recoverability. Key features include:

  • End-to-End Data Verification
  • Fault Avoidance and Containment 
  • Continuous Fault Detection and Healing

Essentially the platform is designed to ensure that anything written successfully can be recovered.
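The general idea behind end-to-end verification can be sketched in a few lines. This is an illustration of the concept only, under the assumption of a simple in-memory store; `VerifiedStore` is a hypothetical class and says nothing about how DIA is actually implemented.

```python
import hashlib

class VerifiedStore:
    """Toy store that checks data against a checksum on every read."""

    def __init__(self):
        self._disk = {}  # key -> (data, checksum); stands in for real media

    def write(self, key: str, data: bytes) -> str:
        checksum = hashlib.sha256(data).hexdigest()
        self._disk[key] = (data, checksum)
        # Read the data back and verify it before acknowledging the write.
        self.read(key)
        return checksum

    def read(self, key: str) -> bytes:
        data, checksum = self._disk[key]
        if hashlib.sha256(data).hexdigest() != checksum:
            raise IOError(f"integrity check failed for {key!r}")
        return data

store = VerifiedStore()
store.write("backup-1", b"payload")
print(store.read("backup-1"))
```

The point of the pattern is that a write is only reported as successful after the stored copy has been read back and checked, and that later corruption is detected on read rather than silently returned.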

Dell PPDD Stream-Informed Segment Layout (SISL) architecture optimizes throughput scalability while minimizing disk footprint. Key features include a CPU-centric design that groups related segments to improve throughput on deduplicated writes and on reads for rehydration.

Dell DD Boost is the platform's "native" protocol.  Advantages of leveraging it include:

  • Client-side deduplication
  • Data encryption in flight 
  • Enhanced connection resiliency for backup streams 

Performing deduplication on the backup client reduces network traffic for the backup stream by 80% to 90% and uses less MediaAgent CPU than pushing a full backup through the I/O stack. Applications can integrate Dell DD Boost at the API level or use Dell DD BoostFS to provide a general file system interface to the DD Boost library. DD BoostFS presents an NFS mount point to an application, allowing direct access to DD Boost protocol efficiencies for backup and recovery.
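The network saving from client-side deduplication can be illustrated with a small model. This is a sketch of source-side deduplication in general, not the DD Boost protocol: `DedupTarget`, `client_side_backup` and `make_segment` are hypothetical names, and the 10% daily change rate is an assumption chosen for the example. Fingerprint traffic is ignored for simplicity.

```python
import hashlib

class DedupTarget:
    """Hypothetical stand-in for a deduplicating storage target."""

    def __init__(self):
        self.segments = {}

    def missing(self, fingerprints):
        """Return the fingerprints the target has not stored yet."""
        return [fp for fp in fingerprints if fp not in self.segments]

    def put(self, fp, segment):
        self.segments[fp] = segment

def client_side_backup(segments, target):
    """Send only segments the target lacks; return payload bytes sent."""
    by_fp = {hashlib.sha256(s).hexdigest(): s for s in segments}
    sent = 0
    for fp in target.missing(list(by_fp)):
        target.put(fp, by_fp[fp])
        sent += len(by_fp[fp])
    return sent

def make_segment(index, version, size=1000):
    """Build a distinct, fixed-size segment for the toy workload."""
    return f"{index}:{version}".encode().ljust(size, b".")

target = DedupTarget()
day1 = [make_segment(i, 0) for i in range(100)]
# Assumed change rate: 10 of 100 segments differ on the second day.
day2 = [make_segment(i, 1 if i < 10 else 0) for i in range(100)]

first_sent = client_side_backup(day1, target)
second_sent = client_side_backup(day2, target)
reduction = 1 - second_sent / sum(len(s) for s in day2)
print(first_sent, second_sent, f"{reduction:.0%}")
```

With these assumed numbers the second full backup moves 10,000 bytes instead of 100,000, a 90% reduction; real savings depend on the actual change rate and data type.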

What is changing?

Commvault customers have long been able to write to Dell PowerProtect DD. Initially, fully hydrated or Commvault-deduplicated backups were written to PPDD via CIFS or NFS file shares opened on the PPDD. In addition to the security concerns that come with those protocols, customers were often paying for two separate deduplication technologies and shipping un-deduplicated backups across the network anyway.

Another alternative was to leverage PPDD's Virtual Tape Library technology and treat the PPDD as tape. This solution is limited to Fibre Channel SAN transport and introduces all of the complications of virtual tape management.

Later integration via DD BoostFS enabled greater efficiency but ran into some limitations around infrastructure OS choice, particularly a lack of support for SELinux (a Linux kernel security module that provides a mechanism for supporting access control security policies).

Commvault and Dell engineers worked together to include the DD Boost libraries, first in the Commvault MediaAgents and then in the client agents. This integration brings all the benefits of writing to PPDD in its native protocol, at the API level, from within the Commvault software.

Commvault provides two options for this integration:  

  1. At the MediaAgent
  2. At the Client

DD Boost on MediaAgent

Who should use this?

  • Existing Commvault customers already using a PowerProtect DD can easily convert from NFS/SMB or DD BoostFS. This does not require a net-new full backup; subsequent backups will deduplicate against previous backups.

Features

  • Direct communication with PPDD via Boost SDK
    • Eliminates the need for mount points or network shares, removing a potential place for bad actors to corrupt data
  • Data is written in native Commvault format
    • Commvault deduplication and compression
  • Supports WORM (DD Retention Lock Governance and Compliance modes)
  • DD client-side encryption
  • SELinux enabled workflow

DD Boost on Client Agent

Who should use this?

  • New or existing Commvault customers who want the advantages of DD Boost on the client. Existing customers will re-baseline all data because the format changes from Commvault format to Dell PPDD native format.

Features

  • Direct communication on client with PPDD using Boost SDK
    • Reduces load on the MediaAgent and eliminates the need for mount points or network shares, increasing performance and reducing security vulnerabilities
  • Data is written in native Dell DD format
    • Superior client-side native DD deduplication through the SDK
  • Supports WORM (DD Retention Lock Governance and Compliance modes)
  • DD client-side encryption
  • SELinux enabled workflow
  • Optimized synthetic full backups and aux copies
  • Native File Copy API used to move data between PPDD systems

Why is this a good thing?

  • Integration between Commvault and PPDD delivers improvements in throughput and storage efficiency while driving out complexity
  • Simplified architecture - Fewer x86 MediaAgent servers, with an approximately 75% reduction in compute (which can be virtualized)
  • Backup performance improvements - 4X to 5X faster backups in tests conducted by Dell
  • Storage reduction - Variable-length deduplication (4-12 KB) has been seen in tests to reduce storage by 60%
  • Replication - Commvault aux copy now uses DD Managed File Replication to optimize replication
  • Expansion - DD Smart Scale Support
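Why variable-length deduplication tends to find more duplicates than fixed-size blocks can be shown with a toy content-defined chunker. This is a simplified sketch, not Dell's algorithm: the rolling-sum boundary rule, the `window` and `divisor` parameters and the tiny chunk sizes are all assumptions made for illustration. A single byte is inserted at the start of the data, and we count how many chunks each method can still match.

```python
import random

def fixed_chunks(data: bytes, size: int = 64):
    """Cut the data at fixed offsets."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def cdc_chunks(data: bytes, window: int = 16, divisor: int = 64):
    """Cut after any byte where the sum of the last `window` bytes
    hits a chosen remainder, so boundaries depend only on content."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]  # slide the window forward
        if i + 1 >= window and rolling % divisor == divisor - 1:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks

rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(8192))
modified = b"\x00" + original  # one byte inserted at the front

fixed_shared = len(set(fixed_chunks(original)) & set(fixed_chunks(modified)))
cdc_original = set(cdc_chunks(original))
cdc_shared = len(cdc_original & set(cdc_chunks(modified)))

print(f"fixed-size: {fixed_shared} chunks still match")
print(f"variable-length: {cdc_shared} of {len(cdc_original)} chunks still match")
```

The insertion shifts every fixed-size block, so essentially none of them match the earlier backup, while the content-defined boundaries realign after the first chunk and nearly everything deduplicates.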

Conclusion

Integration between Commvault and Dell PPDD is a development that the industry has been waiting for. Commvault enjoys wide industry appreciation and provides broad client support. The Dell PPDD Series appliances are the gold standard for purpose-built backup appliances. Adding the DD Boost code to the Commvault client and MediaAgent is a win for customers and for both vendors. Backups in an integrated Commvault/PPDD environment will be faster, more efficient and more resilient in a smaller hardware footprint.