Data center cost optimization using predictive analytics
First Claim
1. A computer-implemented method to manage environmental conditions of a data center comprising:
- receiving, at a processor unit of a computer, sensor data from sensors monitoring environmental conditions at a data center, the data center housing operating hardware components that have not yet failed, and receiving reliability data of the hardware components; and
for each hardware component;
deriving, using an analytics model stored in a memory storage unit of the computer, an estimated time to failure of the hardware component, said analytics model being run on a processor unit and trained using machine learning, to correlate a component reliability using learned patterns of component failure, said received reliability data and said sensor data of monitored environmental conditions that the hardware component has been subject to at said data center;
determining, at the processor unit, whether said estimated time to failure of the hardware component exceeds an expected reference life criteria time texp associated with that component, and
for each hardware component having a derived estimated time to failure that does not exceed the expected reference life criteria time texp for the respective component;
computing, using the processor unit, a respective time for incurring a lowest cost to replace or repair the component;
generating, using the processor unit, a candidate modification to one or more environmental conditions of said data center, wherein said candidate modification to said one or more environment conditions minimizes energy usage of operations at the data center and extends a life of the respective component while operating under said candidate modification to one or more environmental conditions at said data center to its respective lowest cost time to replace;
computing, using the processor unit, an energy cost impact of letting the respective component operate under said candidate modified environment condition at said data center; and
after generating a candidate modified environment condition associated with each hardware component having a derived estimated time to failure that does not exceed the expected reference life criteria time texp for the respective component;
selecting, using the processor unit, an environmental condition modification from said generated candidate modified environment conditions, said selected environmental condition modification corresponding to a respective hardware component having a largest computed energy savings impact and running said analytics model on said processor to derive a new estimated time to failure of remaining hardware components having less than largest energy savings impact, said environmental condition modification selection ensuring that the new derived estimated time to failure of each remaining hardware component exceeds its respective said expected reference life criteria time if operating under the selected modification environment condition;
generating, using the processor unit, an output signal for use in modifying said data center environment according to said selected environment condition modification;
modifying said data center environment according to said selected environment condition modification, and scheduling a replacement of the hardware component corresponding to the selected environmental condition modification having the largest computed energy savings impact in the data center based on said computed time for incurring a lowest cost to replace or repair the component.
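The selection step recited above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `Component`, `Candidate`, `select_modification`, and the `predict_ttf` and `propose` callables are hypothetical names standing in for the trained analytics model and the candidate generator. The claim does not say what happens when the top candidate fails the re-check, so falling back to the next-largest energy saving is an assumption, as is limiting the re-check to the other at-risk components.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Env = Dict[str, float]  # hypothetical encoding, e.g. {"temp_c": 24.0}


@dataclass
class Component:
    name: str
    t_exp: float  # expected reference life criterion, in hours


@dataclass
class Candidate:
    component: Component   # the at-risk component this candidate was built for
    env: Env               # candidate modified environmental condition
    energy_savings: float  # computed energy cost impact (larger is better)
    replace_at: float      # computed lowest-cost time to replace or repair


def select_modification(
    components: List[Component],
    current_env: Env,
    predict_ttf: Callable[[Component, Env], float],
    propose: Callable[[Component, Env], Candidate],
) -> Optional[Candidate]:
    """Pick the candidate with the largest energy savings that keeps the
    remaining at-risk components above their expected reference life."""
    # Components whose estimated time to failure does not exceed t_exp.
    at_risk = [c for c in components if predict_ttf(c, current_env) <= c.t_exp]
    candidates = [propose(c, current_env) for c in at_risk]

    # Assumption: try candidates in descending order of energy savings and
    # fall back to the next one if the re-check fails.
    for cand in sorted(candidates, key=lambda k: k.energy_savings, reverse=True):
        remaining = [c for c in at_risk if c is not cand.component]
        # Re-run the model under the candidate environment.
        if all(predict_ttf(c, cand.env) > c.t_exp for c in remaining):
            return cand
    return None
```

The returned `Candidate` carries both the environment to apply and the `replace_at` time that would drive the replacement schedule for its component.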
Abstract
A system, method and computer program product for optimizing total cost of ownership (TCO) of a piece of IT equipment, e.g., a hard drive or server, using predictive analytics. The data center environment is monitored, and a number of environment variables are measured, including temperature, relative humidity, and corrosion. For each piece of hardware, several pieces of data are assigned, including a criticality measure, an operational cost (a function of environment), a static replacement cost, and a downtime cost (a function of time). For each piece of hardware that has not yet failed, the system predicts a time-to-failure using the environment variables. If the predicted time-to-failure exceeds an expected reference life criteria, real-time TCO analytics is performed to minimize data center energy usage and/or maximize operational cost-efficiency.
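The per-hardware data listed in the abstract can be pictured as a small record plus a cost function. The sketch below is illustrative only: the abstract names the four quantities but does not say how they are combined, so the additive formula, the use of criticality as a weight on downtime cost, and the names `HardwareCostProfile` and `total_cost` are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Env = Dict[str, float]  # hypothetical encoding, e.g. {"temp_c": 25.0}


@dataclass
class HardwareCostProfile:
    criticality: float                        # unitless criticality measure
    operational_cost: Callable[[Env], float]  # cost per hour, a function of environment
    replacement_cost: float                   # static replacement cost
    downtime_cost: Callable[[float], float]   # cost of an outage, a function of its duration


def total_cost(profile: HardwareCostProfile, env: Env,
               run_hours: float, outage_hours: float) -> float:
    """Assumed additive TCO: running cost + replacement + weighted downtime."""
    running = profile.operational_cost(env) * run_hours
    downtime = profile.criticality * profile.downtime_cost(outage_hours)
    return running + profile.replacement_cost + downtime
```

Evaluating `total_cost` under two candidate environments gives a simple way to compare their cost impact for one piece of hardware.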
12 Claims
1. A computer-implemented method to manage environmental conditions of a data center, as recited in full under First Claim above.
Dependent claims: 2, 3, 4
5. A computer program product comprising:
one or more computer readable storage media and program instructions stored on the one or more computer readable storage media, the program instructions, when executed by at least one computer, cause said at least one computer to perform a process for managing environmental conditions of a data center, the program instructions comprising instructions configuring a processor unit of said at least one computer to;
receive sensor data from sensors monitoring environmental conditions at a data center, the data center housing operating hardware components that have not yet failed, and receive reliability data of the hardware components; and
for each hardware component;
derive, using an analytics model stored in the one or more computer readable storage media of the at least one computer, an estimated time to failure of the hardware component, said analytics model being run on a processor unit and trained using machine learning, to correlate a component reliability using learned patterns of component failure, said received reliability data and said sensor data of monitored environmental conditions that the hardware component has been subject to at said data center;
determine whether said estimated time to failure of the hardware component exceeds an expected reference life criteria time texp associated with that component, and
for each hardware component having a derived estimated time to failure that does not exceed the expected reference life criteria time texp for the respective component;
compute a respective time for incurring a lowest cost to replace or repair the component;
generate a candidate modification to one or more environmental conditions of said data center, wherein said candidate modification to said one or more environmental conditions minimizes energy usage of operations at the data center and extends a life of the respective component while operating under said modified candidate modification to one or more environmental conditions at said data center to its respective lowest cost time to replace;
compute an energy cost impact of letting the respective component operate under said candidate modified environment condition at said data center; and
after generating a candidate modified environment condition associated with each hardware component having a derived estimated time to failure that does not exceed the expected reference life criteria time texp for the respective component;
select an environmental condition modification from said generated candidate modified environment conditions, said selected environmental condition modification corresponding to a respective hardware component having a largest computed energy savings impact, and run said analytics model on said processor to derive a new estimated time to failure of remaining hardware components having less than largest energy savings impact, said environmental condition modification selection ensuring that the new derived estimated time to failure of each remaining hardware component exceeds its respective said expected reference life criteria time if operating under the selected modification environment condition;
generate an output signal for use in modifying said data center environment according to said selected environment condition modification;
modify said data center environment according to said selected environment condition modification, and schedule a replacement of the hardware component corresponding to the selected environmental condition modification having the largest computed energy savings impact in the data center based on said computed time for incurring a lowest cost to replace or repair the component.
Dependent claims: 6, 7, 8
9. A computer-implemented system for managing environmental conditions of a data center, the system comprising:
a memory storage device storing program instructions;
at least one hardware processor coupled to the memory storage device and running said program instructions to configure said at least one hardware processor to;
receive sensor data from sensors monitoring environmental conditions at a data center, the data center housing operating hardware components that have not yet failed, and receive reliability data of the hardware components; and
for each hardware component;
derive, using an analytics model stored in the memory storage device, an estimated time to failure of the hardware component, said analytics model being run on a processor unit and trained using machine learning, to correlate a component reliability using learned patterns of component failure, said received reliability data and said sensor data of monitored environmental conditions at said data center;
determine whether said estimated time to failure of the hardware component exceeds an expected reference life criteria time texp associated with the component, and
for each hardware component having a derived estimated time to failure that does not exceed the expected reference life criteria time texp for the respective component;
compute a respective time for incurring a lowest cost to replace or repair the component;
generate a candidate modification to one or more environmental conditions of said data center, wherein said candidate modification to said one or more environmental conditions minimizes energy usage of operations at the data center, and extends a life of the respective component while operating under said candidate modification to one or more environmental conditions at said data center to its respective lowest cost time to replace;
compute an energy cost impact to an entity of letting the respective component operate under said candidate modified environment condition at said data center; and
after generating a candidate modified environment condition associated with each hardware component having a derived estimated time to failure that does not exceed the expected reference life criteria time texp for the respective component;
select an environmental condition modification from said generated candidate modified environment conditions, said selected environmental condition modification corresponding to a respective hardware component having a largest computed energy savings impact and run said analytics model on said processor to derive a new estimated time to failure of remaining hardware components having less than largest energy savings impact, said environmental condition modification selection ensuring that the new derived estimated time to failure of each remaining hardware component exceeds its respective said expected reference life criteria time if operating under the selected modification environment condition;
generate an output signal for use in modifying said data center environment according to said selected environment condition modification;
modify said data center environment according to said selected environment condition modification, and schedule a replacement of the hardware component corresponding to the selected environmental condition modification having the largest computed energy savings impact in the data center based on said computed time for incurring a lowest cost to replace or repair the component.
Dependent claims: 10, 11, 12
Specification