Multi-node computer system component proactive monitoring and proactive repair
Abstract
A highly available multi-node computer system is operated by monitoring the aging and usage of a plurality of hardware components that are part of the system's networked nodes. While monitoring the components, a determination is made that one of the components has aged, worn, or both, to a level that is selected as being close enough to the component's predicted end of life in the system so as to prevent failure of the component in the system. A notification is sent to replace the component, in response to the determination. Other embodiments are also described and claimed.
56 Citations
18 Claims
1. A data storage system comprising:
a plurality of metadata server machines each to store metadata for a plurality of files that are stored in the system;
a plurality of slice server machines to store slices of said files at locations, in the slice server machines, indicated by the metadata;
a packet switching interconnect to which the metadata and slice servers are communicatively coupled;
a distributed file system to be executed in the metadata and slice server machines, the file system to hide complexity of the data storage system from clients; and
software to monitor the aging and usage of hardware components of the data storage system and implement a proactive component replacement process which sends a notification to replace a component in response to determining that the component has aged, worn, or both, to a predetermined level that is selected as being close enough to the component's predicted end of life in the system so as to prevent failure of the component in the system.
Dependent claims: 2, 3, 4, 5
6. A computer-implemented method for operating a high availability multi-node computer system, comprising:
monitoring the aging and usage of a plurality of hardware components that are part of a plurality of networked nodes of the computer system;
while monitoring the components, determining that one of the components has aged, worn, or both, to a level that is selected as being close enough to the component's predicted end of life in the system so as to prevent failure of the component in the system; and
sending a notification to replace the component in response to the determination.
Dependent claims: 7, 8, 9, 10, 11, 12, 13, 14, 15
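The monitor, determine, and notify steps of claim 6 can be illustrated with a minimal sketch. It is not the patented implementation: the `Component` fields, the use of service hours as the only wear measure, and the 0.9 threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Component:
    """Hypothetical record for one monitored hardware component."""
    name: str
    hours_in_service: float      # observed age/usage (assumed metric)
    predicted_life_hours: float  # predicted end of life (assumed to be known)


def needs_replacement(c: Component, threshold: float = 0.9) -> bool:
    """True once usage reaches the selected fraction of predicted life."""
    return c.hours_in_service >= threshold * c.predicted_life_hours


def monitor(components: Iterable[Component],
            notify: Callable[[str], None],
            threshold: float = 0.9) -> List[str]:
    """Check each component and send a notification for those near end of life."""
    flagged = []
    for c in components:
        if needs_replacement(c, threshold):
            notify(f"replace {c.name}")
            flagged.append(c.name)
    return flagged
```

Here `notify` stands in for whatever channel carries the replacement notification; passing `print` or a list's `append` method is enough to try the sketch.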
16. A computer-implemented method for operating a multi-node computer system, comprising:
monitoring the aging and usage of a plurality of hardware components that make up the computer system;
analyzing degradation of the components including calculating a degradation factor for each component, based on the monitoring; and
calculating each component's age or wear level in the system, based on the degradation factor, that predicts when the component should be replaced before it fails in the system.
Dependent claims: 17
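Claim 16 does not say how the degradation factor is computed. The sketch below assumes, purely for illustration, a factor that grows linearly with average operating temperature above a reference point, and an effective age obtained by scaling service hours by that factor. The function names, the 40 °C reference, the 0.05 per-degree slope, and the 0.9 replacement margin are all hypothetical.

```python
def degradation_factor(avg_temp_c: float,
                       ref_temp_c: float = 40.0,
                       per_degree: float = 0.05) -> float:
    """Assumed model: wear accelerates linearly above a reference temperature."""
    return max(1.0, 1.0 + per_degree * (avg_temp_c - ref_temp_c))


def effective_age(hours_in_service: float, factor: float) -> float:
    """Age in the system, scaled by the degradation factor."""
    return hours_in_service * factor


def hours_until_replacement(age: float,
                            rated_life_hours: float,
                            factor: float,
                            margin: float = 0.9) -> float:
    """Wall-clock hours left before the component reaches the replacement level.

    The replacement level is margin * rated life; remaining effective hours are
    divided by the factor because the component keeps aging at that rate.
    """
    remaining = margin * rated_life_hours - age
    return max(0.0, remaining / factor)
```

For a component that has run 1,000 hours at an average of 50 °C with a rated life of 10,000 hours, these assumptions give a factor of 1.5, an effective age of 1,500 hours, and 5,000 hours until the replacement level is reached.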
18. A method for operating a multi-node data storage system, comprising:
receiving a notification from a multi-node data storage system that identifies an operating component of the system as being ready to be replaced; and
scheduling a service visit to replace the identified component from the system.
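The service-side method of claim 18 can be sketched as follows. The record types, the `component_id` format, and the fixed three-day lead time are assumptions; the claim itself does not specify how a visit is scheduled.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ReplacementNotification:
    """Hypothetical notification identifying a component ready to be replaced."""
    component_id: str
    received_on: date


@dataclass(frozen=True)
class ServiceVisit:
    """Hypothetical record of a scheduled replacement visit."""
    component_id: str
    visit_on: date


def schedule_service_visit(note: ReplacementNotification,
                           lead_days: int = 3) -> ServiceVisit:
    """Schedule a visit a fixed number of days after the notification arrives."""
    return ServiceVisit(note.component_id,
                        note.received_on + timedelta(days=lead_days))
```

Because the component is still operating when the notification arrives, the visit can be planned in advance rather than handled as an emergency repair, which is the point of the proactive process.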
Specification