Why Strict Consistency is a Must-Have Capability for VMware Instant Restores
We recently published a blog on Strict vs Eventual Consistency, in which Cohesity Chief Architect Apurv Gupta expertly explains why Strict Consistency is a must-have capability: Users are guaranteed to always see the latest data, and data is protected as soon as it is written. Thanks to Strict Consistency, application availability/uptime and ‘no data loss’ are guaranteed, even if the infrastructure fails. (I highly recommend reading Apurv’s aforementioned blog as a primer to this post.)
Today I’m going to continue the discussion on Eventual and Strict Consistency. However, I will focus on what Cohesity’s Strict Consistency means for VMware vSphere and VMware Cloud Foundation infrastructures for data resiliency and high availability.
To set the stage, I will borrow from Apurv’s definition of each consistency model. I have also embedded two videos for an added visual explanation.
From there I will get to the meat of my blog: explaining what each model means to VMware vSphere and VMware Cloud Foundation environments.
Strict Consistency: For any incoming write operation, once a write is acknowledged to the client, the following holds true:
* The updated value is visible on read from any node.
* The update is protected from node failure with redundancy.
Eventual Consistency: Weakens the above conditions by adding the word “eventually” and adds the condition “provided there are no permanent failures”.
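To make the two definitions concrete, here is a minimal toy sketch of the acknowledgment rules. It is not Cohesity's implementation: the `ToyCluster` class, its method names, and the choice to replicate to every live node (rather than to a replication factor or quorum) are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    data: dict = field(default_factory=dict)
    alive: bool = True


class ToyCluster:
    """Hypothetical cluster; `strict` selects which acknowledgment rule is modeled."""

    def __init__(self, node_count: int, strict: bool):
        self.nodes = [Node(f"node{i}") for i in range(node_count)]
        self.strict = strict
        self.pending = []  # writes acknowledged but not yet replicated

    def write(self, key, value, via=0):
        """Write through one node and return the acknowledgment."""
        self.nodes[via].data[key] = value
        if self.strict:
            # Strict: copy to every live node BEFORE acknowledging.
            for node in self.nodes:
                if node.alive:
                    node.data[key] = value
        else:
            # Eventual: acknowledge now, replicate some time later.
            self.pending.append((key, value, via))
        return "ack"

    def replicate_pending(self):
        """Background replication for the eventual model."""
        for key, value, via in self.pending:
            if not self.nodes[via].alive:
                continue  # the only copy died with its node: the write is lost
            for node in self.nodes:
                if node.alive:
                    node.data[key] = value
        self.pending.clear()

    def read(self, key, via):
        node = self.nodes[via]
        if not node.alive:
            raise ConnectionError(f"{node.name} is down")
        return node.data.get(key)


if __name__ == "__main__":
    strict = ToyCluster(3, strict=True)
    strict.write("block", "v2", via=0)
    strict.nodes[0].alive = False          # node failure after the ack
    print(strict.read("block", via=1))     # v2

    eventual = ToyCluster(3, strict=False)
    eventual.write("block", "v2", via=0)
    print(eventual.read("block", via=1))   # None (not visible yet)
    eventual.nodes[0].alive = False        # permanent failure before replication
    eventual.replicate_pending()
    print(eventual.read("block", via=2))   # None (the acknowledged write is gone)
```

In the strict case the acknowledged value survives the loss of the node that accepted it. In the eventual case the same failure, arriving before background replication runs, loses an already acknowledged write, which is exactly the "provided there are no permanent failures" caveat.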
Logical Illustration and Workflow Animation of Eventual Consistency
What Each Model Means to VMware vSphere and VMware Cloud Foundation Environments
The consistency model behind a data protection and recovery solution, whether traditional or modern, poses risks to any organization that relies on it.
Unfortunately, there is a tremendous lack of awareness and understanding about this topic.
Traditional and modern data protection and recovery solutions are offered by many vendors, and provide the ability to quickly restore a VM or data with a feature often called Instant Recovery.
However, the restoration workflow and implementations are different, depending on vendor and product.
From a vSphere perspective, a series of recovery functions is performed, manually or automatically, to restore the required VM.
Typically, the data protection and recovery solution – where a copy of the VM or data is stored – provides some form of a storage abstraction that is mounted onto vSphere. This is part of the reason the VM is recovered instantaneously. At this point, vSphere will provide the compute resources to run the VM if it’s necessary.
After the VM is recovered, a time comes when the VM has to be migrated to the primary storage platform where it formerly resided. In vSphere, Storage vMotion is used to migrate data over the network.
Now, while various technologies and features make it possible to recover and instantiate a VM in minutes, no equivalent shortcut exists for moving what may be hundreds of gigabytes across a network.
So, depending on the amount of data being transferred, the migration can take a long time to complete. The duration depends on factors such as network bandwidth and interface saturation.
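For a rough sense of scale, the back-of-the-envelope estimate below converts a VM size and link speed into hours. The function name `estimate_transfer_hours` and the default 70% link efficiency are illustrative assumptions, not measured values; real Storage vMotion throughput also depends on storage performance and competing traffic.

```python
def estimate_transfer_hours(size_gb: float, link_gbps: float,
                            efficiency: float = 0.7) -> float:
    """Estimate hours to move `size_gb` gigabytes over a `link_gbps` link.

    `efficiency` is an assumed fraction of the nominal bandwidth that the
    migration actually achieves (0.7 is a placeholder, not a benchmark).
    Uses decimal units: 1 GB = 8 gigabits.
    """
    if size_gb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, link > 0, efficiency in (0, 1]")
    size_gigabits = size_gb * 8
    seconds = size_gigabits / (link_gbps * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    for link in (1, 10):
        hours = estimate_transfer_hours(500, link)
        print(f"500 GB over {link} Gbps: about {hours:.2f} hours")
    # 500 GB over 1 Gbps: about 1.59 hours
    # 500 GB over 10 Gbps: about 0.16 hours
```

Even under these generous assumptions, a single 500 GB VM keeps the restored copy running on the backup platform for well over an hour on a 1 Gbps link, which is the window in which a node failure matters.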
Eventual vs Strict Consistency: Compare and Contrast
The logical illustrations below describe the behavior and procedure of what happens in a recovery scenario similar to the one described above, and what impact the different consistency models would have.
1) Eventual Consistency
Impact of Data Protection and Recovery Solutions with Eventual Consistency in VMware vSphere and VMware Cloud Foundation Environments when a VM Needs to be Restored
1. Prepare and restore the VM locally onto a storage abstraction that is presented to vSphere in the form of an NFS volume. The abstraction is presented from a single node based on eventual consistency.
2. Automatically present and mount the storage abstraction to vSphere (NFS) from one of the nodes in the data protection and recovery cluster. The VM is instantiated and accessible on vSphere. Read and write I/O are directed to the VM stored on the storage abstraction (NFS) presented from a single node.
3. New data being created is not protected, nor is it being distributed across the other nodes in the data protection and recovery cluster.
4. SvMotion starts the migration of the VM back to the primary storage platform – this can take a long time.
5. If the node in the data protection and recovery cluster from which the storage abstraction (NFS) is presented to vSphere fails, the following happens:
● The storage abstraction (NFS) becomes inaccessible to vSphere
● The VM is no longer available or accessible
● SvMotion fails
● Any newly created data can be lost
Analysis: This is not an acceptable outcome when you depend on a data protection and recovery solution as your insurance policy. The result – depending on the magnitude of the failure – can put a company out of business, or at the very least, cost someone their job.
2) Strict Consistency
Impact of Cohesity With Strict Consistency in VMware vSphere and VMware Cloud Foundation Environments when a VM Needs to be Restored
1. Prepare and restore the VM locally onto a storage abstraction that is presented to vSphere in the form of an NFS volume. The abstraction is presented from the Cohesity cluster via virtual IP, enforcing the Strict Consistency model.
2. Automatically present and mount the storage abstraction to vSphere (NFS) from a virtual IP from the Cohesity cluster. The VM is instantiated and accessible on vSphere. Read and write I/O are directed to the VM stored on the storage abstraction (NFS) presented from the virtual IP of the Cohesity cluster.
3. New data being created is distributed and acknowledged across the other nodes in the Cohesity cluster.
4. SvMotion starts the migration of the VM back to the primary storage platform – this can take a long time.
5. If a node in the Cohesity cluster fails, the storage abstraction (NFS) being presented to vSphere remains available. The SvMotion will continue until completed because of the use of virtual IPs and Strict Consistency, which together mitigate the risk of data loss.
Analysis: The steps described above result in the outcome enterprises should expect, and demand, from their data protection and recovery solution when looking to leverage features such as Instant Recovery.
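The two failure scenarios above can be sketched as a small simulation. This is a deliberately simplified model, not Cohesity's or any vendor's implementation: `BackupCluster`, `migrate`, and the chunk-based copy loop are hypothetical names, and the model covers only whether the NFS mount stays reachable, assuming the data itself is already redundant in the virtual IP case.

```python
class BackupCluster:
    """Toy model of the cluster that exports the NFS mount to vSphere."""

    def __init__(self, node_names, use_virtual_ip):
        self.alive = {name: True for name in node_names}
        self.use_virtual_ip = use_virtual_ip
        self.serving_node = node_names[0]  # node currently answering NFS I/O

    def fail(self, name):
        """Mark a node as failed; with a virtual IP, move it to a survivor."""
        self.alive[name] = False
        if self.use_virtual_ip and name == self.serving_node:
            survivors = [n for n, up in self.alive.items() if up]
            if survivors:
                self.serving_node = survivors[0]

    def mount_available(self):
        return self.alive[self.serving_node]


def migrate(cluster, total_chunks, fail_node_at_chunk):
    """Copy chunks back to primary storage; one node fails mid-migration."""
    copied = 0
    for chunk in range(total_chunks):
        if chunk == fail_node_at_chunk:
            cluster.fail(cluster.serving_node)
        if not cluster.mount_available():
            return ("failed", copied)  # mount gone: the migration aborts
        copied += 1
    return ("completed", copied)


if __name__ == "__main__":
    nodes = ["node0", "node1", "node2"]
    single = BackupCluster(nodes, use_virtual_ip=False)
    print(migrate(single, total_chunks=10, fail_node_at_chunk=4))
    # ('failed', 4)
    vip = BackupCluster(nodes, use_virtual_ip=True)
    print(migrate(vip, total_chunks=10, fail_node_at_chunk=4))
    # ('completed', 10)
```

With the mount pinned to a single node, the migration stops at the chunk where that node fails. With a virtual IP that moves to a surviving node, the same failure is absorbed and the copy runs to completion.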
Cohesity → Strict Consistency → Positive Outcome
All of this is to show why Cohesity implements Strict Consistency: to mitigate the risks of data inaccessibility and potential data loss due to node failure.
Cohesity’s DataPlatform features, capabilities, and modern architecture were designed to meet the requirements of businesses today – not two-to-three decades ago.
I hope this information is useful and educational. Now watch the video below to see these concepts in action. Get some popcorn and sit back. This is a long one, but you're going to love it!