Protecting Cassandra databases in the cloud

Leading Tech Company Mirroring and Protecting Cassandra Databases in the Cloud

By Jay Desai • March 3, 2020

Customer first is the mindset at Cohesity. Here’s how we helped a leading provider of data-as-a-service capabilities for data-driven enterprise applications protect its critical customer data assets stored in Cassandra databases in the cloud and enable rapid application iteration. The award-winning software company manages all types of customer data including multi-domain master data, transaction and interaction data, as well as third-party, public and social data across all industries—from healthcare and life sciences to retail and entertainment.

Cassandra Backup and Recovery Challenge: Failing Scripts  

The organization standardized its Big Data environment on DataStax Enterprise (DSE) as the underlying NoSQL database. All databases and applications are hosted in the Amazon Web Services (AWS) cloud. The software company currently serves its clients using six 6-node DSE clusters, storing 36 terabytes of data.

Previously, the company's engineers wrote their own scripts to protect Cassandra databases. These backup scripts ran nightly on each of the DSE clusters, yet they frequently failed. Engineering then had to debug and fix these complex scripts before Cassandra backups could complete successfully. The scripts also backed up every replica of the data, which drove up AWS storage bills.
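To see why backing up all replicas inflates storage, here is a minimal sketch of the arithmetic. The replication factor of 3 and the 12-terabyte figure for unique data are assumptions chosen for illustration. The customer's actual replication settings are not stated, nor is it stated whether the 36 terabytes includes replicas.

```python
def backup_footprint_tb(unique_data_tb: float,
                        replication_factor: int,
                        all_replicas: bool) -> float:
    """Terabytes written by one full backup, before any deduplication."""
    # A script that copies every node's data captures each row once per replica.
    copies = replication_factor if all_replicas else 1
    return unique_data_tb * copies


# Hypothetical inputs (assumptions, not customer figures):
# 12 TB of unique data replicated with a replication factor of 3.
all_copies = backup_footprint_tb(12, 3, all_replicas=True)
one_copy = backup_footprint_tb(12, 3, all_replicas=False)
print(f"All replicas: {all_copies} TB, single copy: {one_copy} TB")
```

Under these assumed numbers, a nightly full backup of every replica writes three times as much data as a backup that keeps one copy of each row.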

Creating test and development environments with production data was also inefficient. Engineers waited days for a non-production environment to use for development, which slowed the application development process. These challenges drained valuable engineering resources and pulled engineers away from other business-critical projects.

Solution: Cohesity in the Cloud Speeds Up Cassandra Backup and Recovery and Cuts Costs

The company deployed a single, 3-node Cohesity cluster to back up all six of its DSE clusters. Deploying and configuring the Cohesity cluster took less than an hour, and the entire configuration was done through a web-based user interface. This greatly simplified the backup and recovery process and freed engineers from writing and maintaining scripts.

All backups are now deduplicated, encrypted, and stored in Amazon S3. Cohesity deduplication significantly reduced the backup storage requirements, and copying the backup data to low-cost Amazon S3 lowered backup storage costs further. The same Cohesity cluster is also used to spin up test and development clusters from production data. Using the Cohesity RESTful API, the tech leader integrated Cohesity into its workflows and dashboard. Developers can now create test and development clusters quickly and easily, without writing any scripts.

Learn more about the Cohesity solution for Cassandra database backup and recovery.