Data Domain is showing its age. Time to bring data protection into the cloud era.

By Gaetan Castelein • January 17, 2017

Data Domain was great in the 2000s

Data Domain was a great product when it came out in the mid-2000s. Back then, most IT shops backed up straight to tape with a Disk-to-Tape approach. As Data Domain bluntly put it back in the day: ‘Tape Sucks’. Or at the very least, tape sucks as a backup target. It’s slow, complex to manage, and can make recovery times incredibly long.

With Data Domain, people moved on from the old Disk-to-Tape model to a much better Disk-to-Disk-to-Tape approach. In other words, backups are first stored on disk before being archived to tape for long-term retention. Data Domain made great deduplication available to the masses and enabled folks to back up straight to disk at a reasonable cost. Backups became faster and easier to manage, and recovery times were cut significantly. This was a big improvement over the old model, and eventually led to a very successful Data Domain IPO in 2007 and acquisition by EMC in 2009. At this point, Data Domain has become ubiquitous across IT organizations and the gold standard for backup targets. (As a side note: tape doesn’t always suck. It’s still a valuable and very common choice for long-term archival.)

But – Data Domain is a point solution with a legacy architecture

Data Domain, as good as it was back in the 2000s, is very much a point solution to a point problem. It’s a dedicated appliance that serves only one purpose: target storage for backups. That may have been enough 10 years ago, but today’s IT organizations have come to expect much more.

What has changed? Datacenter infrastructure is transforming faster than it ever has before. Hyperscale companies like Google and Amazon have shown the world a better way to build infrastructure using distributed, scale-out architectures (as opposed to siloed appliances). These technologies are now making their way into the enterprise, and hyperconverged vendors like Nutanix and VMware with vSAN are showing us the value of converging multiple functions on these scale-out platforms. Finally, infrastructure is increasingly built using software-defined solutions running on commodity servers, as opposed to purpose-built appliances.

Data Domain offers none of these things:

  • It’s a point appliance vs. a distributed, scale-out platform. That means it requires overprovisioning and forklift upgrades with data migrations, provides only local deduplication, and is susceptible to hardware failures.
  • It doesn’t converge multiple functions. Its sole purpose is to be target storage for backups. It doesn’t fundamentally simplify backup infrastructure, which is still incredibly complex, requiring media servers, master servers, cloud gateways, and storage, each with its own UI, sizing requirements, etc.
  • It’s not software-defined. At least for the mid- and high-end, Data Domain can only be purchased as a dedicated stand-alone physical appliance. Sure, you can purchase Data Domain Virtual Edition, which is software, for small sites like remote offices or small businesses. But Data Domain VE won’t scale to datacenter size.
  • It’s not natively cloud integrated. Data Domain was built before the emergence of the hyperscale clouds and doesn’t have native cloud integration. Over the past few years EMC has acquired Maginatics – a Cloud Storage Gateway vendor – and progressively made that functionality available for Data Domain. But it’s still a clunky solution that feels more like a bolt-on, and only offers limited functionality in the form of long-term archival.

Cohesity as a Data Domain Alternative

Cohesity provides a web-scale platform designed from the ground up to store and manage all your secondary data, including backups, cloud data, test/dev copies, analytics data, files, and objects. That’s a very broad value proposition, and can seem a bit overwhelming at first glance. Most end-users don’t want to rip and replace all their existing secondary storage silos, but instead are looking for a simple, concrete way to get started with Cohesity.

The good news is that Cohesity can be adopted very non-disruptively as a simple backup target, for example to replace aging Data Domain appliances coming off maintenance, or to increase backup storage capacity. Cohesity presents itself as NFS storage to third-party backup products like NetBackup, Veeam, or Commvault, and interoperates seamlessly with these products.

  • Integrates seamlessly into existing infrastructure
  • Start by using Cohesity as target storage for existing backup software
  • Over time – expand to converged data protection with DataProtect

Many customers choose to use Cohesity initially as a replacement for Data Domain. In a second phase, these customers often expand their use of Cohesity with the converged backup product, Cohesity DataProtect. DataProtect converges all the backup infrastructure, including master servers, media servers, and backup software, on the Cohesity DataPlatform. Adoption of DataProtect can happen on the customer’s own schedule, when the customer is ready for the transition. At that point Cohesity can be used simultaneously as a backup target (like Data Domain) for some applications, and as a scale-out converged data protection solution for others.

Cohesity: Converged, Scale-out Simplicity…

For those customers choosing Cohesity as a Data Domain replacement, what’s the motivation? We’ve been getting a lot of feedback from customers and what they love about Cohesity has been pretty consistent:

  • It’s a distributed, scale-out platform. That means the platform can be right-sized for day 1 with no overprovisioning, and then scaled out over time as needs expand. Cohesity can be upgraded transparently with no impact to production systems (both hardware and software upgrades). It provides a single UI to manage tens or hundreds of nodes and petabytes of data. It’s inherently highly available, spreading data across nodes and racks to ensure continued data availability even in the case of a complete node or rack failure. And deduplication is performed globally across all the nodes, maximizing space efficiency.
  • It provides converged data protection with DataProtect. Customers love the option to start using Cohesity with their existing backup software, and then over time migrate to converged data protection with DataProtect to simplify their backups. Customers like having the flexibility to transition to converged data protection on their own time. Once using DataProtect, customers get instantaneous Recovery Times and sub-minute Recovery Points. DataProtect also provides tight integration with VMware, SQL Server, Oracle, physical Windows and Linux, and Pure Storage arrays.
  • It’s software-defined. There is no dependence on custom hardware. And we provide many consumption options: physical appliances using industry-standard x86 nodes, software qualified for Cisco UCS, a virtual appliance for smaller sites, and a version that will be available to run in the public cloud (Amazon, Google, and Microsoft).
  • It’s tightly integrated with the public cloud. Customers can leverage Amazon Web Services, Microsoft Azure, or Google Cloud for long-term archival, active data tiering, or replication. The integration is available as a feature of DataPlatform, with no bolt-on cloud gateways.
  • Enterprise-class features. Cohesity has been designed from the start with enterprise-class use cases in mind. For example, it includes software-based encryption with FIPS certification, for data both at rest and in flight. It provides built-in multitenancy for data and security isolation between users. It provides flexible geo-replication for off-site data protection and disaster recovery.
  • Making the data productive. Data Domain provides just an insurance policy against data loss. Cohesity is different. The data backed up to Cohesity can be put to productive use for things like test/dev workflows, or to run custom analytics queries directly on the platform.
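
To make the difference between local and global deduplication from the list above concrete, here is a minimal Python sketch. It is an illustration only, not Cohesity's or Data Domain's actual algorithm: the fixed 4-byte chunks, the SHA-256 fingerprints, and the function and node names are all assumptions chosen for brevity.

```python
import hashlib


def fingerprints(data: bytes, size: int = 4) -> set[str]:
    """Fingerprint fixed-size chunks (an assumption; real systems often chunk by content)."""
    pieces = [data[i:i + size] for i in range(0, len(data), size)]
    return {hashlib.sha256(piece).hexdigest() for piece in pieces}


def stored_chunks_local(backups_by_node: dict[str, list[bytes]]) -> int:
    """Local dedupe: every node keeps its own index, so shared chunks are stored once per node."""
    total = 0
    for backups in backups_by_node.values():
        index: set[str] = set()
        for data in backups:
            index |= fingerprints(data)
        total += len(index)
    return total


def stored_chunks_global(backups_by_node: dict[str, list[bytes]]) -> int:
    """Global dedupe: one index spans all nodes, so shared chunks are stored once overall."""
    index: set[str] = set()
    for backups in backups_by_node.values():
        for data in backups:
            index |= fingerprints(data)
    return len(index)


# Hypothetical backups: the two nodes share the chunks "AAAA" and "BBBB".
backups_by_node = {
    "node-a": [b"AAAABBBBCCCC"],
    "node-b": [b"AAAABBBBDDDD"],
}
local_count = stored_chunks_local(backups_by_node)    # 3 + 3 = 6 chunks kept
global_count = stored_chunks_global(backups_by_node)  # 4 unique chunks kept
```

In this toy data set the shared chunks are stored twice under local deduplication and once under global deduplication. The gap widens as more nodes hold overlapping data.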

…At Less Than Half the Cost

Cohesity does all this at a significantly lower price point than traditional dedupe appliances. Based on a study performed by the Evaluator Group, when Cohesity is used as a backup target, customers can reduce their Total Cost of Ownership by 55%. And when customers use Cohesity DataProtect for converged data protection, the cost savings can reach 80% or more.

This TCO is just for the hard vendor costs: software, hardware and 3 years of maintenance. It doesn’t factor in any of the soft benefits of simplified management, because those are more difficult to quantify and more variable (although very real and sizable!).
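
As a quick sanity check on those percentages, the arithmetic can be sketched in a few lines of Python. The baseline figure below is a made-up number for illustration, not a value from the Evaluator Group study, and the function name is hypothetical.

```python
def tco_after_savings(baseline: float, savings_pct: float) -> float:
    """Return the remaining cost after reducing `baseline` by `savings_pct` percent."""
    if not 0 <= savings_pct <= 100:
        raise ValueError("savings_pct must be between 0 and 100")
    return baseline * (100 - savings_pct) / 100


# Hypothetical 3-year baseline for a traditional dedupe appliance (assumption).
baseline = 1_000_000.0

as_backup_target = tco_after_savings(baseline, 55)  # 450000.0
with_dataprotect = tco_after_savings(baseline, 80)  # 200000.0
```

A 55% reduction leaves 45% of the original spend, which is where "less than half the cost" comes from. An 80% reduction leaves one fifth.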


We’ve covered a lot of points in the sections above. The table below provides a summary of the key differences between Data Domain and Cohesity.

As a side note, if you think there might be a use case for Cohesity in your organization, we’re providing the Cohesity Commitment, allowing you to try Cohesity absolutely risk-free and experience for yourself the power of the platform.

Thanks for reading.

Gaetan Castelein