Demonstrating Linear Scalability of Cohesity Data Platform
Many enterprise storage systems claim to scale out linearly, but in practice their performance plateaus as more nodes are added. As a result, businesses hesitate to buy scale-out systems, worried that the system will not keep pace as the enterprise grows. The usual cause of the degradation is system bottlenecks that only show up at higher node counts.
Cohesity is a LIMITLESS scale-out system. It is built upon web-scale principles pioneered over the past decade at the world’s largest web companies. Our goal is to show that Cohesity does not hit performance bottlenecks as the cluster grows larger.
Cohesity is a hyper-converged platform that consolidates all secondary data and its associated workflows, including backups, test/dev copies, files, objects and analytics. The IO patterns for these workflows fall broadly into two buckets: large sequential read/write and random read/write.
- Large sequential RW: Data protection and analytics exhibit this IO pattern. To test scalability for these workflows, we use a sequential inline dedup (IDD) read/write workload. The expectation is a linear increase in throughput as the cluster size scales.
- Random RW: Test/dev and file/object workflows exhibit this IO pattern. To test scalability for these workflows, we use an IDD random read/write workload. The expectation is a linear increase in IOPS as the cluster size scales.
We simulated the workload using fio, writing four 2GB files per node, and we scaled our cluster from 8 nodes to 256 nodes. We used a 1MB block size for sequential reads and writes and a 4KB block size for random reads and writes. We ran the performance tests on Azure using Cohesity Cloud Edition, which runs the same software as our on-premises Cohesity C2000 Hyperconverged Nodes or Cisco UCS.
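The workload parameters above could be expressed as fio invocations roughly as follows. This is a minimal sketch, not the exact job definition used in the tests: the mapping of workload names to `--rw` modes, the use of `--numjobs=4` with `--size=2G` to get four 2GB files, and the mount path in the usage note are all assumptions made for illustration.

```python
import shlex

# Assumed mapping from the four workloads in the text to fio --rw modes
# and block sizes (1MB sequential, 4KB random). The names are hypothetical.
WORKLOADS = {
    "seq-write": ("write", "1M"),
    "seq-read": ("read", "1M"),
    "rand-write": ("randwrite", "4k"),
    "rand-read": ("randread", "4k"),
}


def fio_command(workload, directory):
    """Build an fio argument list for one workload on one node.

    Assumption: four jobs of 2G each stand in for "four 2GB files per node",
    since each fio job creates its own file by default.
    """
    rw, block_size = WORKLOADS[workload]
    return [
        "fio",
        f"--name={workload}",
        f"--directory={directory}",
        f"--rw={rw}",
        f"--bs={block_size}",
        "--size=2G",
        "--numjobs=4",
        "--group_reporting",  # report the four jobs as one aggregate
    ]


def fio_command_line(workload, directory):
    """Return the same command as a single shell-quoted string."""
    return shlex.join(fio_command(workload, directory))
```

For example, `fio_command_line("rand-read", "/mnt/cohesity_view")` yields a command that could be run on each client, where `/mnt/cohesity_view` is a placeholder for wherever the cluster's storage is mounted.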
The top two charts demonstrate that for the IDD sequential read/write workload, the relative throughput (MB/sec) increases linearly with cluster size.
The bottom two charts show the relative scalability of random reads/writes. Adding Cohesity nodes increases IOPS linearly. As a result, more random read/write workloads (e.g. more test and dev VMs) can be run off Cohesity as more nodes are added.
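One way to quantify "relative" scaling of this kind is to normalize each measurement to the smallest cluster and compare it with the ideal linear ratio. The sketch below assumes the 8-node cluster is the baseline and that results are available as a mapping from node count to measured throughput or IOPS. The function name and data layout are hypothetical, and no measured values are included.

```python
def relative_scaling(results, baseline_nodes=8):
    """Normalize per-cluster measurements against a baseline cluster size.

    results: dict mapping node count -> measured throughput (MB/sec) or IOPS.
    Returns a dict mapping node count -> (relative, efficiency), where
    relative is the measurement divided by the baseline measurement and
    efficiency is relative divided by the ideal linear ratio
    (nodes / baseline_nodes). An efficiency of 1.0 means perfectly linear.
    """
    base = results[baseline_nodes]
    scaling = {}
    for nodes in sorted(results):
        relative = results[nodes] / base
        ideal = nodes / baseline_nodes
        scaling[nodes] = (relative, relative / ideal)
    return scaling
```

With this definition, a cluster that doubles its node count and doubles its throughput has a relative value of 2.0 and an efficiency of 1.0. An efficiency that drifts below 1.0 at higher node counts would indicate the kind of bottleneck described at the start of this post.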
As these scalability tests demonstrate, Cohesity's distributed storage platform scaled linearly from 8 to 256 nodes. This means businesses can rest assured that Cohesity can scale with their growing storage demands without compromising performance.