Throughout my career in data center automation, I’ve run into a few unique challenges that I will always remember. We spend most of our time worrying about Day 0/Day 1 challenges, so we sometimes forget about future needs or unusual needs in the data center. I can count on one hand the number of times I’ve been asked to shut down a data center. We rarely think about needing to do so, mostly because we’ve been conditioned to maximize uptime. However, circumstances outside of our control may force us to shut down equipment. For that reason, it’s worth keeping a data center shutdown plan around and spending the time to automate as much of that process as possible.
For this first blog post, I’m going to show how I properly shut down both Cohesity clusters in the Mobile Executive Business Center (EBC). For those of you who visited the truck during its run, there was a set of automation scripts I worked on that brought the truck up and then back down for an event day. This was all done with simple startup and shutdown PowerShell scripts, run with PowerShell Core from either a macOS device or a Linux VM in the infrastructure stack.
When you think about how Cohesity is used in a data center, you know that the platform is constantly polling the environment and looking for new changes to back up. For that reason, we did not want to shut down any other infrastructure before we had disabled the Cohesity system. Our first step was to disable the existing Cohesity Protection Jobs. For this, we used the Cohesity PowerShell module (available here).
To disable all active protection jobs on a particular cluster, you first have to log in to the cluster using the Connect-CohesityCluster cmdlet. From there, you simply enter the following:
Get-CohesityProtectionJob | Suspend-CohesityProtectionJob
For those who want more explanation of the PowerShell behind this: the first cmdlet, Get-CohesityProtectionJob, gathers all existing protection jobs on the system. Using the pipeline operator (|), we pass all of those jobs to the Suspend-CohesityProtectionJob cmdlet. The end result of this single line of PowerShell is that every protection job on the cluster is now in a suspended state, so no new runs will start. We need this because, once we start shutting down infrastructure, we do not want the Cohesity system trying to discover or back up something that may not be in the right state (for example, a database server that might be offline).
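Putting the login step and the one-liner together, a shutdown script for two clusters might look something like the sketch below. This is a minimal illustration and not the actual script from the truck: the cluster addresses are hypothetical placeholders, and it assumes the Cohesity PowerShell module is already installed and that one set of credentials works on both clusters.

```powershell
# Minimal sketch: suspend every protection job on each Cohesity cluster.
# Assumption: the Cohesity PowerShell module is already installed and importable.

# Hypothetical cluster addresses; replace with your own VIPs or FQDNs.
$clusters = @('cohesity-a.example.com', 'cohesity-b.example.com')

# Prompt once for credentials instead of hard-coding them in the script.
# Assumption: the same account is valid on both clusters.
$credential = Get-Credential

foreach ($cluster in $clusters) {
    # Log in to the cluster before running any other Cohesity cmdlet.
    Connect-CohesityCluster -Server $cluster -Credential $credential

    # Gather all protection jobs and pipe them to the suspend cmdlet.
    Get-CohesityProtectionJob | Suspend-CohesityProtectionJob
}
```

A matching startup script could follow the same shape and pipe the jobs to Resume-CohesityProtectionJob instead, assuming you want every job re-enabled once the rest of the infrastructure is back online.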
You can enhance this code further to check for running jobs and cancel them. I’ll explain some of those components in the next blog post in this series.
As always, to find out more about all of the integrations that Cohesity is working on, please visit the Cohesity Developer site.