Method and system for determining data profiles using block-based methodology
Abstract
Techniques for determining characteristics of data blocks being accessed in a storage system are described herein. According to one embodiment, an input/output (IO) request is received for accessing a first data block of a first file stored in a storage system. The first file is one of the files stored in the storage system and each file contains multiple data blocks. In response to the request, a block-based monitor executed in a memory by a processor accesses a set of monitoring rules to determine whether the first file should be monitored. If so, the block-based monitor captures statistics data associated with the first data block and stores the statistics data of the first data block in a statistics database maintained in a persistent storage device. The statistics database stores statistics data of the data blocks of files monitored and captured based on the set of monitoring rules.
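The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `MonitoringRule` and `BlockMonitor` classes, the glob-style `path_pattern` parameter, the `block_stats` table layout, and the example paths are all hypothetical names chosen for illustration. SQLite stands in for the statistics database; the in-memory default keeps the example self-contained, and a file path would be passed instead to make it persistent.

```python
import sqlite3
import time
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass
class MonitoringRule:
    # Hypothetical monitoring parameter: a glob pattern over file paths.
    path_pattern: str


class BlockMonitor:
    """Illustrative block-level monitor: rule check, capture, store."""

    def __init__(self, rules, db_path=":memory:"):
        self.rules = rules
        # Assumed schema for the statistics database (not from the source).
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS block_stats ("
            "file_path TEXT, block_no INTEGER, op TEXT, accessed_at REAL)"
        )

    def should_monitor(self, file_path):
        # A file is monitored if any rule's pattern matches its path.
        return any(fnmatch(file_path, r.path_pattern) for r in self.rules)

    def on_io_request(self, file_path, block_no, op, now=None):
        # Consult the rules first; unmonitored files produce no record.
        if not self.should_monitor(file_path):
            return False
        ts = time.time() if now is None else now
        # Capture one block-level record with its access timestamp.
        self.db.execute(
            "INSERT INTO block_stats VALUES (?, ?, ?, ?)",
            (file_path, block_no, op, ts),
        )
        self.db.commit()
        return True


monitor = BlockMonitor([MonitoringRule("/vol1/db/*")])
monitor.on_io_request("/vol1/db/orders.dat", 7, "write", now=100.0)
monitor.on_io_request("/vol1/tmp/scratch", 3, "read", now=101.0)
```

Only the first request is recorded in this example, because the second file path matches no rule.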
24 Claims
1. A computer-implemented method for determining characteristics of data blocks stored in a storage system, the method comprising:
receiving an input/output (IO) request for accessing a first data block of a first file stored in a storage system, the first file being one of a plurality of files stored in the storage system, each of the files containing a plurality of data blocks;
in response to the request, accessing, by a block-based monitor executed in a memory by a processor, a set of monitoring rules to determine whether the first file should be monitored, wherein the set of monitoring rules represents a set of monitoring parameters in a rule database;
in response to determining that the first file should be monitored, capturing, at a data block level by the block-based monitor, statistics data associated with the first data block, including determining at least in part a time of access of the first data block, a percentage of block change within a period of time, a level of block-based activities, and a changed block list (CBL) associated with the first data block as the statistics data, and capturing a timestamp of the first data block being accessed;
storing the statistics data of the first data block in a statistics database maintained in a persistent storage device, wherein the statistics database stores statistics data of a plurality of data blocks of a plurality of files monitored and captured based on the set of monitoring rules;
analyzing, by an analysis module executed by the processor, the statistics data stored in the statistics database, including determining accessing patterns of data blocks of the files at the data block level, to generate an analysis result; and
transmitting the analysis result to a remote analytics system over a network, wherein the remote analytics system analyzes analysis results of data blocks being accessed at a plurality of storage systems.
Dependent claims: 2, 3, 4, 5, 6, 7, 8

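Claim 1 names a changed block list (CBL) and a percentage of block change within a period of time but does not say how either is computed. The sketch below shows one plausible reading, under stated assumptions: write events are `(block_number, timestamp)` pairs, the CBL is the set of distinct blocks written inside a half-open time window, and the percentage is that count divided by the file's total block count. The function names and the sample events are hypothetical.

```python
def changed_block_list(write_events, window_start, window_end):
    """Sorted block numbers written at least once in [window_start, window_end)."""
    return sorted(
        {blk for blk, ts in write_events if window_start <= ts < window_end}
    )


def percent_changed(write_events, total_blocks, window_start, window_end):
    """Share of a file's blocks that changed within the window, in percent."""
    if total_blocks <= 0:
        raise ValueError("total_blocks must be positive")
    cbl = changed_block_list(write_events, window_start, window_end)
    return 100.0 * len(cbl) / total_blocks


# Assumed sample data: block 3 is written twice, block 5 outside the window.
events = [(0, 10.0), (3, 12.5), (3, 14.0), (5, 99.0)]
cbl = changed_block_list(events, 0.0, 60.0)
pct = percent_changed(events, total_blocks=8, window_start=0.0, window_end=60.0)
print(cbl, pct)  # [0, 3] 25.0
```

Repeated writes to the same block count once here, so the figure measures how much of the file changed rather than how often it was written.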
9. A non-transitory machine-readable medium having instructions stored therein, which when executed by a processor, cause the processor to perform operations of determining characteristics of data blocks stored in a storage system, the operations comprising:
receiving an input/output (IO) request for accessing a first data block of a first file stored in a storage system, the first file being one of a plurality of files stored in the storage system, each of the files containing a plurality of data blocks;
in response to the request, accessing, by a block-based monitor executed in a memory by a processor, a set of monitoring rules to determine whether the first file should be monitored, wherein the set of monitoring rules represents a set of monitoring parameters in a rule database;
in response to determining that the first file should be monitored, capturing, at a data block level by the block-based monitor, statistics data associated with the first data block, including determining at least in part a time of access of the first data block, a percentage of block change within a period of time, a level of block-based activities, and a changed block list (CBL) associated with the first data block as the statistics data, and capturing a timestamp of the first data block being accessed;
storing the statistics data of the first data block in a statistics database maintained in a persistent storage device, wherein the statistics database stores statistics data of a plurality of data blocks of a plurality of files monitored and captured based on the set of monitoring rules;
analyzing, by an analysis module executed by the processor, the statistics data stored in the statistics database including determining accessing patterns of data blocks of the files at the data block level, to generate an analysis result; and
transmitting the analysis result to a remote analytics system over a network, wherein the remote analytics system analyzes analysis results of data blocks being accessed at a plurality of storage systems.
Dependent claims: 10, 11, 12, 13, 14, 15, 16

17. A storage system, comprising:
a processor and a memory;
a storage device storing a plurality of files, each of the files containing a plurality of data blocks; and
a block-based monitoring logic coupled to the processor and memory to
  receive an input/output (IO) request for accessing a first data block of a first file stored in the storage device,
  in response to the request, access a set of monitoring rules maintained in the storage device to determine whether the first file should be monitored, wherein the set of monitoring rules represents a set of monitoring parameters in a rule database,
  in response to determining that the first file should be monitored, capture statistics data at a data block level associated with the first data block, wherein capturing the statistics data includes determining at least in part a time of access of the first data block, a percentage of block change within a period of time, a level of block-based activities, and a changed block list (CBL) associated with the first data block as the statistics data, and capturing a timestamp of the first data block being accessed,
  store the statistics data of the first data block in a statistics database maintained in the storage device, wherein the statistics database stores statistics data of a plurality of data blocks of a plurality of files monitored and captured based on the set of monitoring rules,
  analyze the statistics data stored in the statistics database, wherein analyzing the statistics data stored in the statistics database includes determining accessing patterns of data blocks of the files at the data block level, to generate an analysis result, and
  transmit the analysis result to a remote analytics system over a network, wherein the remote analytics system analyzes analysis results of data blocks being accessed at a plurality of storage systems.
Dependent claims: 18, 19, 20, 21, 22, 23, 24
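The analyze and transmit steps recited in the independent claims might look like the following sketch. The claims do not specify how access patterns are determined or how the result is encoded, so everything here is an assumption: blocks are classed as "hot" when their access count reaches a hypothetical `hot_threshold`, the result is serialized as JSON, and the `system_id` field and function names are invented for illustration. The network send itself is omitted; only the payload that would be sent is built.

```python
import json
from collections import Counter


def analyze_access_patterns(records, hot_threshold=3):
    """Summarize block-level access counts from (file_path, block_no, timestamp) records."""
    counts = Counter((path, blk) for path, blk, _ts in records)
    # Assumed pattern: a block is "hot" once it reaches the threshold.
    hot = sorted(key for key, n in counts.items() if n >= hot_threshold)
    return {
        "total_accesses": sum(counts.values()),
        "distinct_blocks": len(counts),
        "hot_blocks": [
            {"file": p, "block": b, "accesses": counts[(p, b)]} for p, b in hot
        ],
    }


def to_payload(system_id, result):
    """Encode an analysis result for a remote analytics system (format assumed)."""
    return json.dumps({"system_id": system_id, "analysis": result}, sort_keys=True)


# Assumed sample records: block 1 is accessed three times, block 2 once.
records = [
    ("/vol1/a", 1, 10.0),
    ("/vol1/a", 1, 11.0),
    ("/vol1/a", 1, 12.0),
    ("/vol1/a", 2, 13.0),
]
result = analyze_access_patterns(records)
payload = to_payload("storage-01", result)
```

Tagging each payload with a system identifier is one way a remote analytics system could tell apart results arriving from several storage systems.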
Specification