The Storage Accounting Framework is Cohesity’s answer to the need for enterprise-grade transparency into data management, with a next-generation user interface that presents fine-grained detail in a simpler way. In this post, we introduce new metrics and groupings that enable better chargeback, showback, capacity planning, and forecasting. In addition, we explain global deduplication, its impact, and how to account for it in a shared storage environment. Finally, we discuss our approach to simplifying storage computation for various use cases.
As part of our effort to modernize and simplify data management, Cohesity is introducing a new storage accounting framework, powered by a next-generation user interface, that allows Cohesity DataPlatform to better report on storage utilization, reduction, and resiliency.
Key benefits of the framework include:
The framework introduces new metrics and groupings that enhance operational insights and decision making.
For convenience and consistency, we have revised storage metrics. The new metrics are categorized as follows:
The table below defines the terms used in the new storage accounting framework. Understanding these terms helps build a precise and comprehensive picture of data and storage utilization in Cohesity solutions.
Category | Metrics | Metric Definition |
---|---|---|
DataProtect | Logical | Size of Primary object |
DataProtect | Data-in | Data sent from Primary to Cohesity DataPlatform |
DataProtect | Data-written | Data written post reduction |
DataProtect | Resiliency Impact | Space consumed by resiliency setting |
DataProtect | Storage Available | Space available in cluster |
DataProtect | Storage Consumed | Data written after honoring resiliency setting |
NAS | Logical | Logical data in view |
NAS | Quota | Logical quota |
NAS | Physical Data | Physical data stored (pre-resiliency) |
NAS | Resiliency Impact | Space consumed by resiliency setting |
NAS | Storage Consumed | NAS physical data stored after applying resiliency setting |
Ratios | Data Reduction | Space saved because of deduplication and compression (ratio of data-in to data-written) |
Ratios | Storage Reduction | Overall change in data footprint from source data to post-resiliency consumption (ratio of logical data to storage consumed) |
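
To make the two ratios concrete, here is a minimal Python sketch that derives them from the DataProtect metrics in the table. The class name, field names, and sample numbers are illustrative assumptions, not part of any Cohesity API. The sketch also assumes that resiliency impact is simply the difference between storage consumed and data written, which is one reading of the definitions above.

```python
from dataclasses import dataclass


@dataclass
class ProtectMetrics:
    """Hypothetical container for the DataProtect metrics (any consistent unit, e.g. GiB)."""
    logical: float           # size of the primary object
    data_in: float           # data sent from primary to the cluster
    data_written: float      # data written after deduplication and compression
    storage_consumed: float  # data written after applying the resiliency setting

    @property
    def resiliency_impact(self) -> float:
        # Assumption: storage consumed = data written + resiliency impact.
        return self.storage_consumed - self.data_written

    @property
    def data_reduction(self) -> float:
        # Ratio of data-in to data-written.
        return self.data_in / self.data_written

    @property
    def storage_reduction(self) -> float:
        # Ratio of logical data to storage consumed.
        return self.logical / self.storage_consumed


# Made-up sample values for illustration only.
m = ProtectMetrics(logical=1000.0, data_in=400.0,
                   data_written=100.0, storage_consumed=150.0)
print(f"Data reduction:    {m.data_reduction:.2f}x")
print(f"Storage reduction: {m.storage_reduction:.2f}x")
print(f"Resiliency impact: {m.resiliency_impact:.1f}")
```

With these sample values, deduplication and compression give a 4x data reduction, while the end-to-end storage reduction also reflects the space added back by the resiliency setting.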
The above metrics are now available at a fine-grained level (e.g., for each backup task, replication task, or NAS share) as well as at an aggregate level for the following four logical groups:
Deduplication eliminates redundant copies of data to reduce storage consumption on a Cohesity cluster. It ensures that only unique instances of data are transferred over the network and retained on storage media.
When certain metrics are aggregated (at, for example, a protection task level, a NAS share level, or an organization level), the effects of deduplication must be considered. It is not obvious how to account for shared chunks of data. For example, if there is a chunk of data shared by both backup task #1 and backup task #2, should that data chunk’s “storage consumed” be attributed to task #1 or task #2?
Here’s our approach: the “storage consumed” (i.e. physical bytes of storage used) for a protection task is computed as if there were no other protection tasks in the system. That is, as if the task were in its own private dedupe domain. That’s a key insight. We use the same approach when calculating the “storage consumed” for an individual NAS share (i.e. a Cohesity View), or an organization. This approach has the benefit that such “storage consumed” numbers can be used directly for chargeback/showback.
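
The "private dedupe domain" idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Cohesity's implementation: chunks are identified by made-up IDs, every chunk is 4 bytes, and compression and resiliency are ignored. The function name `private_domain_bytes` and the task data are hypothetical.

```python
def private_domain_bytes(writes):
    """Bytes consumed if these writes were deduplicated only against each other.

    `writes` is a list of (chunk_id, size_in_bytes) pairs; a repeated
    chunk_id is stored once, as deduplication would do.
    """
    unique = {}
    for chunk_id, size in writes:
        unique[chunk_id] = size
    return sum(unique.values())


# Two hypothetical backup tasks; chunk "c" is shared between them.
tasks = {
    "task1": [("a", 4), ("b", 4), ("a", 4), ("c", 4)],
    "task2": [("c", 4), ("d", 4)],
}

# Per-task "storage consumed": each task is treated as its own dedupe domain.
per_task = {name: private_domain_bytes(w) for name, w in tasks.items()}

# Cluster-wide physical bytes: all tasks share one global dedupe domain.
all_writes = [w for writes in tasks.values() for w in writes]
cluster_total = private_domain_bytes(all_writes)

print(per_task)       # per-task chargeback numbers
print(cluster_total)  # what the cluster actually stores in this toy model
```

In this model, the shared chunk "c" is charged in full to both tasks, so each task's number is independent of what else is on the cluster. A consequence is that the per-task numbers can add up to more than the cluster-wide physical total.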
Furthermore, because these numbers measure the physical bytes of storage used, they suggest how much capacity could be reclaimed if some elements of the protection task, view, or org were deleted. In other words, they are useful for capacity management. Strictly speaking, however, if a protection task were deleted, not all of its “storage consumed” would be freed. Instead, the “unique” data chunks would be freed immediately, and the “shared” data chunks would only have their reference counts decremented.
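
The difference between "storage consumed" and the space actually reclaimed on deletion can be illustrated with a small reference-counting sketch. It is a simplified model under assumptions: chunk IDs, sizes, and the `delete_task` helper are hypothetical, and real systems may reclaim space through background garbage collection rather than inline.

```python
from collections import Counter


def delete_task(refcounts, sizes, task_chunks):
    """Drop one task's references and return the bytes actually freed.

    Chunks referenced only by this task are freed; shared chunks just
    have their reference count decremented.
    """
    freed = 0
    for chunk_id in task_chunks:
        refcounts[chunk_id] -= 1
        if refcounts[chunk_id] == 0:
            freed += sizes[chunk_id]
            del refcounts[chunk_id]
    return freed


# Hypothetical chunk sizes (bytes) and the chunks each task references.
sizes = {"a": 4, "b": 4, "c": 4, "d": 4}
task1 = {"a", "b", "c"}
task2 = {"c", "d"}

# Build reference counts: chunk "c" ends up with a count of 2.
refcounts = Counter()
for chunks in (task1, task2):
    refcounts.update(chunks)

task1_consumed = sum(sizes[c] for c in task1)    # storage consumed by task1
freed = delete_task(refcounts, sizes, task1)     # bytes reclaimed by deleting it

print(task1_consumed, freed, dict(refcounts))
```

Here task1's "storage consumed" covers three chunks, but deleting it frees only the two unique ones. The shared chunk "c" stays on disk because task2 still references it.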
An important side-note when looking over these numbers:
Because providing a best-in-class user experience is a two-way street, we engage diligently with our customers to understand their expectations. With the Storage Accounting Framework, we aim to give our customers more information and fine-grained insights, grouped in logical categories to address diverse real-world use cases.
Cohesity’s Karandeep Chawla and Yu-shen Ng from Product Management, and Sanjeev Desai from Solutions Marketing contributed to this blog.