Time to Replace Aging Data Domain with Scale-Out Backup Target

By Raj Dutt • November 7, 2019

Is it still the 2000s? If not, why still use yesteryear’s solution?

Data Domain was a great product when it first came out in the early 2000s. Back then, most IT organizations had been backing up straight to tape with a disk-to-tape approach. As Data Domain bluntly put it back in the day: ‘Tape Sucks’. Or at the very least, tape sucked as a backup target. It’s slow, complex to manage, and can make recovery times incredibly long.

With Data Domain, people moved on from the old disk-to-tape model to what was, at the time, a modern disk-to-disk-to-tape approach. In other words, backups were first stored on disk before being archived to tape for long-term retention. Compared to tape alone, Data Domain offered data deduplication and enabled IT operators to back up straight to disk at a reasonable cost.

Backups became faster and easier to manage, and recovery times were cut down significantly. This was a big improvement over the old tape model. But Data Domain was designed as a point solution, and that resulted in a highly fragmented backup infrastructure.

Data Domain served as a dedicated appliance with a single purpose, target storage for backups, and its benefits held only under certain conditions:

  • When organizations operated in a simple environment
  • When organizations had only a limited number of workloads to protect
  • When data did not grow exponentially and mostly stayed within the data center
  • Before the advent of public cloud
  • When organizations did not have to deal with sophisticated cyber threats, and growing compliance and regulatory requirements

Public Cloud Architecture vs. Purpose-Built Point Solutions

In the last decade, IT infrastructure has been dramatically transformed by the emergence of hyperscale companies like Google, Amazon and Facebook. These companies have demonstrated to the enterprise world a better way to build and operate IT infrastructure using distributed, scale-out architectures (as opposed to siloed appliances) on commodity hardware. Enterprises embarking on their digital transformation journeys are adopting hyperconverged scale-out platforms over purpose-built backup targets like Data Domain. Why?

Point appliance vs. a distributed, scale-out platform
Data Domain requires overprovisioning, forklift upgrades with data migrations, provides only local deduplication, and is susceptible to hardware failures.

Doesn’t support multiple functions
Data Domain’s sole purpose is to be target storage for backups. It doesn’t fundamentally simplify backup infrastructure, which, even with Data Domain, is still incredibly complex: it requires media servers, master servers, cloud gateways, and storage, each with its own management UI, sizing requirements, and so on.

Not software-defined
At least for the mid-range and high end, Data Domain can only be purchased as a dedicated stand-alone physical appliance. Sure, you can purchase Data Domain Virtual Edition (a virtual machine) for small sites, like remote offices or small businesses. But Data Domain VE won’t scale to meet the needs of a full datacenter.

Bolt-on cloud gateway
Data Domain was built before the emergence of the public cloud and doesn’t have native cloud integration. In 2014, EMC (now Dell EMC) acquired Maginatics, a cloud storage gateway vendor, and progressively made that functionality available for Data Domain. But it’s still a clunky solution that feels more like a bolt-on, and it offers only limited functionality in the form of long-term archival.

Limited dedupe
Data Domain can’t dedupe across active Data Domain controllers or VEs. The disks within each must be uniform, so new technology cannot be pooled with existing technology. Unlike Cohesity’s, Data Domain’s implementation of advanced, sliding-window dedupe doesn’t work well with instant access. Avamar software reverts to fixed-block dedupe when used for VMs. The result: a larger data footprint and extra copies.
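To make the difference between fixed-block and sliding-window dedupe concrete, here is a minimal Python sketch. It is a toy illustration only, not Data Domain’s, Avamar’s, or Cohesity’s actual algorithm: the window size, the mask, the byte-sum rolling hash, and all function names are assumptions chosen for readability. It shows that inserting a single byte at the front of a data stream shifts every fixed-size block, while chunk boundaries chosen from the content itself realign almost immediately.

```python
import hashlib
import random


def fixed_chunks(data: bytes, size: int = 64) -> list:
    """Split data into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def sliding_window_chunks(data: bytes, window: int = 16, mask: int = 0x3F) -> list:
    """Cut a chunk wherever the sum of the last `window` bytes has all mask bits zero.

    The byte sum is a deliberately simple stand-in for a real rolling hash.
    """
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]  # drop the byte that left the window
        if i + 1 >= window and (rolling & mask) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def fingerprints(chunks: list) -> set:
    """Return the set of SHA-256 fingerprints for a list of chunks."""
    return {hashlib.sha256(c).hexdigest() for c in chunks}


# Hypothetical test data: 8 KiB of pseudo-random bytes, then the same data
# with one extra byte inserted at the front.
rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(8192))
modified = b"\x00" + original

shared_fixed = len(fingerprints(fixed_chunks(original))
                   & fingerprints(fixed_chunks(modified)))
shared_sliding = len(fingerprints(sliding_window_chunks(original))
                     & fingerprints(sliding_window_chunks(modified)))

print("chunks shared after a 1-byte insert, fixed-block:   ", shared_fixed)
print("chunks shared after a 1-byte insert, sliding-window:", shared_sliding)
```

Because each cut point depends only on the last few bytes of content, every chunk after the first is identical in both versions and dedupes away, whereas the fixed-block scheme finds nothing to share.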

Does not support non-disruptive upgrades
For updates and upgrades, Data Domain users need to schedule blackout windows and perform a forklift upgrade.

Modern Scale-out Backup Target Storage

Inspired by web-scale architecture, Cohesity DataPlatform is a software-defined scale-out backup solution that dramatically simplifies how data is stored. Unlike Data Domain, Cohesity offers the flexibility to migrate data to the public cloud without any dependency on cloud gateways.

Cohesity’s web-scale roots make its adoption non-disruptive. Organizations looking to replace aging Data Domain appliances that are coming off maintenance, or to increase their backup storage capacity, can now deploy Cohesity DataPlatform on a variety of qualified hyperconverged appliances from Cisco, HPE, Dell and Cohesity. Cohesity presents itself as NFS storage to third-party backup products like NetBackup, Veeam, or Commvault, and interoperates seamlessly with these products.

  • Cohesity integrates seamlessly into existing infrastructure
  • Start by using Cohesity as target storage for existing backup software
  • Over time – expand to converged data protection with Cohesity DataProtect

So what is motivating customers to replace their legacy Data Domain with Cohesity?

  • Cohesity is a distributed, hyperconverged, scale-out platform. That means the platform can be right-sized for day one, with no overprovisioning, and then scaled out over time as needs expand. Cohesity can be upgraded transparently with no impact to production systems (both hardware and software upgrades). It provides a single UI to manage tens or hundreds of nodes and petabytes of data. It’s inherently highly available, spreading data across nodes and racks to ensure continued data availability, even in the case of a complete node or rack failure. And deduplication is performed globally across all the nodes, maximizing space efficiency.
  • Cohesity DataPlatform is software-defined. There is no dependence on proprietary hardware. Customers deploy Cohesity on-premises on qualified x86 appliances from Cisco, HPE, Dell and Cohesity or in the public cloud, all managed through a single GUI.
  • Cohesity DataPlatform natively integrates with the public cloud. Customers can leverage Amazon Web Services, Microsoft Azure, or Google Cloud for long-term archival, active data tiering, or replication. The integration is available as a feature of DataPlatform, with no bolt-on cloud gateways.
  • Cohesity dramatically reduces your data footprint with global variable-length sliding window deduplication and compression.
  • Enterprise-class features. Cohesity has been designed from the get-go with enterprise-class use cases in mind. For example, it includes software-based encryption with FIPS certification, for data both at rest and in flight. It provides built-in multitenancy for data and security isolation between users. It provides flexible geo-replication for off-site data protection and disaster recovery. Combined with Cohesity DataProtect, compliance auditing and eDiscovery are made easy. Cohesity can scan backed-up VMs for newly discovered vulnerabilities, allowing organizations to keep their production servers ahead of hackers and avoid accidentally restoring a vulnerable VM.
  • Cohesity provides converged data protection with Cohesity DataProtect. Customers love the option to start using Cohesity DataPlatform with their existing backup software, and then over time migrate to converged data protection with Cohesity DataProtect to simplify their backups and make recovery predictable. Customers like having the flexibility to transition to converged data protection on their own time. Once using DataProtect, enterprises can protect a wide range of data sources, from leading hypervisors (VMware, Hyper-V, Nutanix AHV, RHEV), traditional and modern databases (Oracle, SQL, MongoDB, Cassandra, HBase, Hadoop), traditional and SaaS applications, unstructured data on NAS (NetApp, Isilon, Pure, and Cisco), physical servers, and more on a single software-defined solution. The converged solution also provides a comprehensive defense against ransomware attacks.
  • Making the data productive. Data Domain provides just an insurance policy against data loss. Cohesity is different. The data backed up to Cohesity can be put to productive use for things like dev/test workflows, or to run custom analytics queries directly on the platform.
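
The space benefit of global deduplication, as opposed to dedupe that stays local to each appliance, can be illustrated with a small Python sketch. This is a simplified model under stated assumptions, not how any vendor’s product is implemented: the `DedupeStore` class, the chunk contents, and the two-VM scenario are all hypothetical. Two VMs share the same operating system chunks; with separate local indexes those chunks are stored twice, while a single shared index stores them once.

```python
import hashlib


class DedupeStore:
    """Toy dedupe target: keeps one copy of each unique chunk, keyed by fingerprint."""

    def __init__(self):
        self.chunks = {}

    def write(self, chunks):
        for chunk in chunks:
            key = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(key, chunk)  # store only if not already present

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


# Hypothetical backup streams: 8 shared "operating system" chunks of 1 KiB each,
# plus 600 bytes of data unique to each VM.
os_chunks = [bytes([i]) * 1024 for i in range(8)]
vm_a = os_chunks + [b"a-data" * 100]
vm_b = os_chunks + [b"b-data" * 100]

# Local dedupe: each appliance has its own index and never sees the other's chunks.
appliance_1, appliance_2 = DedupeStore(), DedupeStore()
appliance_1.write(vm_a)
appliance_2.write(vm_b)
local_total = appliance_1.stored_bytes() + appliance_2.stored_bytes()

# Global dedupe: one index shared by every node in the cluster.
cluster = DedupeStore()
cluster.write(vm_a)
cluster.write(vm_b)
global_total = cluster.stored_bytes()

print("bytes stored with local dedupe: ", local_total)   # 17584
print("bytes stored with global dedupe:", global_total)  # 9392
```

In this toy case the shared chunks are kept once instead of twice, so the globally deduplicated footprint is roughly half the size. Real-world ratios depend entirely on how much data the protected workloads have in common.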

…At Less Than Half the Cost

Cohesity does all this at a significantly lower price point than traditional dedupe appliances. Based on a study performed by the Evaluator Group, when Cohesity is used as a backup target, customers can reduce their Total Cost of Ownership by 55 percent. And when customers use Cohesity DataProtect for converged data protection, the cost savings can reach 80 percent or more.

This TCO is just for the hard vendor costs: software, hardware, and 3 years of maintenance. It doesn’t factor in any of the soft benefits of simplified management, because those are more difficult to quantify and more variable (although very real and sizable!).
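
As a quick worked example of what those percentages mean, the following Python sketch applies them to a baseline. The baseline figure of 1,000,000 is purely hypothetical and is not taken from the Evaluator Group study; only the 55 and 80 percent reductions come from the text above.

```python
def cost_after_savings(baseline: int, savings_percent: int) -> int:
    """Return the remaining cost after a percentage reduction (integer math)."""
    return baseline * (100 - savings_percent) // 100


# Hypothetical 3-year cost of a traditional dedupe appliance deployment.
HYPOTHETICAL_BASELINE = 1_000_000

as_backup_target = cost_after_savings(HYPOTHETICAL_BASELINE, 55)
converged = cost_after_savings(HYPOTHETICAL_BASELINE, 80)

print("as backup target:", as_backup_target)  # 450000
print("with DataProtect:", converged)         # 200000
```

A 55 percent reduction leaves 45 percent of the original spend, which is where "less than half the cost" comes from.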