After early customer access, we are proud to announce that Cohesity DataHawk is now generally available as of March 15, 2023. This is a significant milestone for organizations concerned about cyber resiliency as DataHawk brings AI/ML to data security and management to tackle the never-ending escalation of threats that challenge data recovery.
Is it safe? It’s a simple question posed to CISOs, CIOs, and security and IT leaders on a continuous basis about their operations and data. The answer is probably a qualified one: it depends on the incident, threat, or circumstances, and on whether the organization has adequate protections for its mission-critical data and processes.
IT faces a persistent cadence of threats that are growing in complexity and elusiveness, and responds with countermeasures and tactics. That may be a statement of the obvious, but it is why AI/ML has become a requirement for IT operations today as the rate and complexity of threats rise. We see AI/ML everywhere, from ChatGPT to IBM Watson to Netflix’s recommendations and Google search. AI/ML is also transforming security, especially in SIEM/SOAR solutions, XDR, and threat intelligence. By ingesting vast amounts of data, these systems can establish baselines and identify outliers to detect anomalies and unusual or suspicious activity.
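To make the baseline-and-outlier idea concrete, here is a minimal sketch in Python. It is a generic z-score check, not a description of how DataHawk or any other product works. The function name, the threshold, and the sample numbers (daily counts of changed files in a backup) are all hypothetical.

```python
from statistics import mean, stdev

def find_outliers(history, recent, threshold=3.0):
    """Return (index, value) pairs from `recent` that sit more than
    `threshold` standard deviations away from the `history` baseline."""
    baseline_mean = mean(history)
    baseline_std = stdev(history)
    if baseline_std == 0:
        # A perfectly flat baseline: any different value is an outlier.
        return [(i, v) for i, v in enumerate(recent) if v != baseline_mean]
    return [
        (i, v)
        for i, v in enumerate(recent)
        if abs(v - baseline_mean) / baseline_std > threshold
    ]

# Hypothetical data: files changed per day between backup snapshots.
history = [100, 104, 98, 101, 97, 103, 99, 102]  # normal activity
recent = [101, 96, 950]                          # last value is a spike
outliers = find_outliers(history, recent)
print(outliers)  # [(2, 950)]
```

A sudden spike in changed files is the kind of signal that might indicate mass encryption, although a real system would weigh many more signals than a single count.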
Data security and management is no exception: its AI/ML journey began years ago with anomaly detection, scheduling, and optimization. As modern threats now jeopardize the ability of organizations to use their backup data for recovery, AI/ML has a critical role in data security and management: identifying recovery threats and vulnerabilities, and helping the organization assess the impact of an incident on sensitive data.
Backup data is foundational to data security and management. Based on criticality, organizations take snapshots of data in case they are needed to recover from ransomware, disasters, or other cyber incidents. These snapshots contain whatever was present in the production data. Unfortunately, that may include elusive malware that evaded the organization’s cyber defenses.
Threat protection can be used to identify threats in these snapshots in two ways: proactively, and reactively when an incident such as ransomware occurs. Cohesity leverages AI/ML to detect user and data anomalies that could indicate an emerging attack, and uses threat intelligence to ensure recovery data is malware-free. This automates the arduous, manual threat-hunting tactics that rely on security analysts to create YARA rules from various threat sources and feeds. That manual approach lacks scalability in both depth and breadth: an organization can only search for a few rules across a few data sources.
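As a rough illustration of what scanning a snapshot against threat intelligence involves, the sketch below swaps in a much simpler technique than YARA rules: matching file hashes against a set of known-bad SHA-256 values. The function names, the in-memory snapshot, and the hash list are hypothetical stand-ins, and this is not how any particular product implements scanning.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()

def scan_snapshot(files: dict[str, bytes], known_bad: set[str]) -> list[str]:
    """Return the sorted paths whose content hash appears in `known_bad`."""
    return sorted(
        path for path, content in files.items()
        if sha256_hex(content) in known_bad
    )

# Hypothetical snapshot contents, modeled as path -> bytes.
snapshot = {
    "docs/report.txt": b"quarterly numbers",
    "tmp/dropper.bin": b"pretend-malware-payload",
}

# Hypothetical threat feed: hashes of known-malicious files.
known_bad_hashes = {sha256_hex(b"pretend-malware-payload")}

flagged = scan_snapshot(snapshot, known_bad_hashes)
print(flagged)  # ['tmp/dropper.bin']
```

Even this toy version hints at the scaling problem: every indicator has to be checked against every file in every snapshot, which is why automation matters.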
When faced with a ransomware attack, organizations must be able to verify at scale that critical data is safe for recovery, so that malware does not immediately reinfect the environment and cause another crippling encryption event across data stores. Immediate, push-button execution of threat detection in backup snapshots is foundational to maintaining the RPOs and RTOs that support an organization’s SLAs and business objectives.
So what is required to effectively automate threat scanning for data security and management and why?
When an attack has occurred, response teams have innumerable responsibilities. Critically, they need to assess what data exposure may have occurred. Data exposure has several implications for an organization. First, what customer and employee data could have been compromised? With that intelligence, organizations can make informed decisions on the privacy and regulatory responses that are needed, such as notifications and remedies to affected parties. Second, have trade secrets or other sensitive information been exposed, and what legal implications should be considered? Third, what operational data was leaked, and how might that affect supply chains and partners? This is not an exhaustive list, but it illustrates the various implications of data exposure and why organizations must have an accurate accounting of what data may have been impacted.
While organizations track sensitive data with many tools and processes, such as logical data models, enterprise architecture, data discovery and classification tools, and data catalogs, these all share a central weakness: shelf life. What has changed since the last update of the tools used to identify sensitive data? Given the massive rate of data growth and proliferation, it is safe to assume that there is some gap in what organizations know about their sensitive data. These tools and artifacts should certainly be referenced, but the definitive assessment of sensitive data exposure should be made immediately after an attack.
By examining the backup copies that were targeted in an attack, organizations can have the very latest intelligence to make the critical decisions enumerated above. Implicit in this approach is accuracy: the evaluation of data exposure should be as precise as possible to drive the appropriate responses.
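For a sense of what examining backup content for sensitive data can look like at its simplest, here is a hedged sketch that counts pattern matches in text. The two regular expressions (email addresses and US Social Security number formats) and the sample text are hypothetical. Plain pattern matching like this produces false positives, which is exactly why the precision discussed above matters.

```python
import re
from collections import Counter

# Hypothetical patterns; real classifiers use far richer detection logic.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def classify(text: str) -> Counter:
    """Count matches of each sensitive-data pattern found in `text`."""
    counts = Counter()
    for label, pattern in PATTERNS.items():
        matches = len(pattern.findall(text))
        if matches:
            counts[label] = matches
    return counts

# Hypothetical content recovered from a backup copy.
sample = "Contact jane@example.com or bob@example.org. SSN on file: 123-45-6789."
print(classify(sample))  # Counter({'email': 2, 'us_ssn': 1})
```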
So what is needed to drive a high degree of confidence that data exposure is accurately assessed and that the organization takes all appropriate measures?
As a recap of our DataHawk announcement, here are the critical capabilities organizations can use for their cyber resilience programs:
In addition to these AI/ML-driven capabilities, DataHawk includes Cohesity’s award-winning data isolation service, Cohesity FortKnox, to provide fail-safe protection and meet the best practices advised by CISA and government regulators:
For more product information and demos, please visit www.cohesity.com/datahawk.