Networked storage system employing information lifecycle management in conjunction with a distributed global file system
First Claim
1. A networked storage system, comprising:
- a plurality of network-attached file system (FS) nodes each including a file system server and at least one storage system coupled to the file system server, each storage system including a storage controller having a respective processor and one or more data storage devices, the FS nodes being arranged into cost-performance groups ranging from a first cost-performance group of FS nodes providing high performance at high cost to a second cost-performance group of FS nodes providing low performance at low cost; and
a storage management system coupled to the FS nodes, the storage management system having a processor and being operative in conjunction with the FS nodes to:
define a distributed global file system across the FS nodes, the distributed global file system being divided into segments for storing data files and having a single namespace of segment identifiers for the segments as visible from host computers utilizing the networked storage system, the segment identifiers lacking any identification of the FS nodes;
monitor frequency of access of the data files and based on the monitoring assign the data files to respective information lifecycle management (ILM) classes, the ILM classes ranging from an active class for data files having a high and recent access frequency to an inactive class for data files having a low and less recent access frequency; and
assigning the FS nodes to store respective segments of the distributed global file system according to a mapping between the cost-performance characteristics of the FS nodes and the ILM classes of the data files contained in the segments, such that segments containing data files assigned to the active class are stored in the FS nodes of the first cost-performance group and segments containing data files assigned to the inactive class are stored in the FS nodes of the second cost-performance group, wherein the active and inactive ILM classes are major ILM classes and each includes a plurality of sub-classes, the sub-classes of the active class including active-high and active-low sub-classes distinguished by relatively higher and lower access frequencies, and the sub-classes of the inactive class including inactive-recent and inactive-older sub-classes distinguished by an amount of time that data files have been assigned to the respective sub-classes.
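The classification and placement recited in claim 1 can be illustrated with a minimal sketch. It is not taken from the patent: the numeric thresholds, the `FileStats` record, and the function names are hypothetical, and the time a file has spent in the inactive class is used here as a simple stand-in for "an amount of time that data files have been assigned to the respective sub-classes".

```python
from dataclasses import dataclass

# Illustrative thresholds only; the claim does not specify any numeric values.
ACTIVE_MIN_ACCESSES_PER_DAY = 1.0        # below this a file is treated as inactive
ACTIVE_HIGH_MIN_ACCESSES_PER_DAY = 10.0  # at or above this a file is active-high
INACTIVE_OLDER_MIN_DAYS = 90.0           # inactive this long -> inactive-older


@dataclass
class FileStats:
    """Hypothetical per-file monitoring record."""
    accesses_per_day: float     # monitored access frequency
    days_inactive: float = 0.0  # time already spent in the inactive class


def ilm_subclass(stats: FileStats) -> str:
    """Assign a file to one of the four sub-classes named in the claim."""
    if stats.accesses_per_day >= ACTIVE_HIGH_MIN_ACCESSES_PER_DAY:
        return "active-high"
    if stats.accesses_per_day >= ACTIVE_MIN_ACCESSES_PER_DAY:
        return "active-low"
    if stats.days_inactive >= INACTIVE_OLDER_MIN_DAYS:
        return "inactive-older"
    return "inactive-recent"


def cost_performance_group(subclass: str) -> str:
    """Map a sub-class to a cost-performance group via its major class."""
    major_class = subclass.split("-")[0]  # "active" or "inactive"
    return "first" if major_class == "active" else "second"


if __name__ == "__main__":
    for stats in (FileStats(25.0), FileStats(2.0),
                  FileStats(0.1, 5.0), FileStats(0.0, 400.0)):
        sub = ilm_subclass(stats)
        print(sub, "->", cost_performance_group(sub), "cost-performance group")
```

In this sketch only the major class decides the group, matching the claim's requirement that active-class segments reside in the first (high-performance, high-cost) group and inactive-class segments in the second; a finer mapping of sub-classes to intermediate groups would be a further assumption.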
Abstract
A networked storage system includes network-attached file system (FS) nodes implementing a distributed global file system. Each FS node includes a file system server and at least one storage system including storage resources (such as disk drives) representing portions of a global file system storage space. The FS nodes are organized according to respective cost-performance characteristics of the storage systems, generally ranging from a high-cost, high-performance characteristic to a low-cost, low-performance characteristic. A storage management system performs information lifecycle management (ILM) which includes allocating the FS nodes for storing data according to a mapping between an ILM-based data classification scheme and the cost-performance characteristics of the storage devices, as well as dynamically managing the placement of data among the storage devices according to the ILM-based classification of the data. The storage management system also performs global file system management including allocating a set of identifiers (e.g. segment identifiers) of the distributed global file system among the FS nodes, and dynamically adjusting the allocation of the identifiers among the FS nodes in response to the addition and removal of whole storage systems and/or storage resources within each of the storage systems.
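The identifier management described in the abstract (allocating segment identifiers among the FS nodes and adjusting the allocation as storage is added or removed) might be sketched as follows. This is an assumption-laden illustration rather than the patented mechanism: the table layout, the node names, and the `rebalance` function are hypothetical, and the balancing rule (even segment counts, as few moves as possible) is chosen only for simplicity.

```python
from __future__ import annotations


def rebalance(table: dict[int, str], nodes: list[str]) -> dict[int, str]:
    """Return a new segment-id -> node table that uses exactly `nodes`.

    Segment identifiers are plain integers carrying no node identity, so
    hosts keep using the same identifiers while the table changes.
    """
    if not nodes:
        raise ValueError("at least one FS node is required")

    new_table: dict[int, str] = {}
    load = {node: 0 for node in nodes}
    orphans = []

    # Keep every segment whose current node is still present.
    for seg in sorted(table):
        owner = table[seg]
        if owner in load:
            new_table[seg] = owner
            load[owner] += 1
        else:
            orphans.append(seg)

    # Segments of removed nodes go to the least-loaded remaining node.
    for seg in orphans:
        target = min(nodes, key=lambda n: load[n])
        new_table[seg] = target
        load[target] += 1

    # Even out the counts so newly added nodes receive a share.
    while True:
        busiest = max(nodes, key=lambda n: load[n])
        idlest = min(nodes, key=lambda n: load[n])
        if load[busiest] - load[idlest] <= 1:
            break
        seg = max(s for s, n in new_table.items() if n == busiest)
        new_table[seg] = idlest
        load[busiest] -= 1
        load[idlest] += 1

    return new_table


if __name__ == "__main__":
    table = {seg: "fs1" for seg in range(6)}
    table = rebalance(table, ["fs1", "fs2"])  # a node is added
    print(table)
    table = rebalance(table, ["fs2"])         # a node is removed
    print(table)
```

Because hosts address data only by segment identifier, a change to this table is invisible to them; only the component that resolves an identifier to an FS node needs the updated mapping.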
12 Claims
1. A networked storage system, comprising:
a plurality of network-attached file system (FS) nodes each including a file system server and at least one storage system coupled to the file system server, each storage system including a storage controller having a respective processor and one or more data storage devices, the FS nodes being arranged into cost-performance groups ranging from a first cost-performance group of FS nodes providing high performance at high cost to a second cost-performance group of FS nodes providing low performance at low cost; and
a storage management system coupled to the FS nodes, the storage management system having a processor and being operative in conjunction with the FS nodes to:
define a distributed global file system across the FS nodes, the distributed global file system being divided into segments for storing data files and having a single namespace of segment identifiers for the segments as visible from host computers utilizing the networked storage system, the segment identifiers lacking any identification of the FS nodes;
monitor frequency of access of the data files and based on the monitoring assign the data files to respective information lifecycle management (ILM) classes, the ILM classes ranging from an active class for data files having a high and recent access frequency to an inactive class for data files having a low and less recent access frequency; and
assigning the FS nodes to store respective segments of the distributed global file system according to a mapping between the cost-performance characteristics of the FS nodes and the ILM classes of the data files contained in the segments, such that segments containing data files assigned to the active class are stored in the FS nodes of the first cost-performance group and segments containing data files assigned to the inactive class are stored in the FS nodes of the second cost-performance group, wherein the active and inactive ILM classes are major ILM classes and each includes a plurality of sub-classes, the sub-classes of the active class including active-high and active-low sub-classes distinguished by relatively higher and lower access frequencies, and the sub-classes of the inactive class including inactive-recent and inactive-older sub-classes distinguished by an amount of time that data files have been assigned to the respective sub-classes.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A method of operating a networked storage system including a plurality of network-attached file system (FS) nodes, each FS node including a file system server and at least one storage system coupled to the file system server, each storage system including a storage controller having a processor and one or more data storage devices, the method comprising:
arranging the FS nodes into cost-performance groups ranging from a first cost-performance group of FS nodes providing high performance at high cost to a second cost-performance group of FS nodes providing low performance at low cost; and
in a storage management system in conjunction with the FS nodes, the storage management system having a processor:
defining a distributed global file system across the FS nodes, the distributed global file system being divided into segments for storing data files and having a single namespace of segment identifiers for the segments as visible from host computers utilizing the networked storage system, the segment identifiers lacking any identification of the FS nodes;
monitoring frequency of access of the data files and based on the monitoring assigning the data files to respective information lifecycle management (ILM) classes, the ILM classes ranging from an active-high class for data files having a high and recent frequency of access to an inactive-low class for data files having a low and less recent frequency of access; and
assigning the FS nodes to store respective segments of the distributed global file system according to a mapping between the cost-performance characteristics of the FS nodes and the ILM classes of the data files contained in the segments, such that segments containing data files assigned to the active-high class are stored in the FS nodes of the first cost-performance group and segments containing data files assigned to the inactive-low class are stored in the FS nodes of the second cost-performance group, wherein the active and inactive ILM classes are major ILM classes and each includes a plurality of sub-classes, the sub-classes of the active class including active-high and active-low sub-classes distinguished by relatively higher and lower access frequencies, and the sub-classes of the inactive class including inactive-recent and inactive-older sub-classes distinguished by an amount of time that data files have been assigned to the respective sub-classes.
Dependent claims: 9, 10, 11, 12
Specification