System and method to predict reliability of backup software
First Claim
1. A computer-implemented method of measuring reliability of a deduplication backup program executed by a backup server in an enterprise-level network including a plurality of backup appliances including storage nodes, comprising:
monitoring, in a capture process executed in a processor of a reliability module in the backup server, a plurality of components in the backup appliances and the deduplication backup program for an overall monitoring period;
measuring, in a data analyzer process executed in the processor, a functional performance of each of the plurality of components based on a failure of a storage node due to a failure type of a plurality of failure types, wherein the functional performance is measured through continuous daemon processes on a periodic basis defined by a time period;
comparing the measured functional performance to defined performance values to identify a specific failed component of the storage node to distinguish from failure of the deduplication backup program as a whole; and
deriving a reliability measure by summing a weighted value of a total number of failures of the plurality of failure types per the overall monitoring period through a formula that sums weighted failure types multiplied by the total number of failures per a total time period, and as expressed by:
$$\text{Reliability factor} = \sum \frac{\text{Weighting to a failure} \times \text{No. of failures}}{\text{Total time in hours}}$$
to provide a measure of analytical metrics that can be used to analyze overall network performance to identify failure patterns caused by the plurality of components to thereby improve the reliability of the deduplication backup program.
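The formula recited in this claim can be illustrated with a short sketch. It assumes the sum runs over failure types and that the total time is the overall monitoring period in hours. The failure-type names, weights, and counts are hypothetical; the claim does not specify them.

```python
def reliability_factor(weights, failure_counts, total_hours):
    """Sum of (weight * number of failures) over failure types, per hour monitored."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    weighted_failures = sum(
        weights[failure_type] * count
        for failure_type, count in failure_counts.items()
    )
    return weighted_failures / total_hours


# Hypothetical weights and failure counts for a 720-hour monitoring period.
weights = {"crash": 5.0, "hang": 3.0, "timeout": 1.0}
failure_counts = {"crash": 2, "hang": 4, "timeout": 10}

factor = reliability_factor(weights, failure_counts, total_hours=720)
print(round(factor, 4))  # 0.0444, i.e. (5*2 + 3*4 + 1*10) / 720
```

Under the formula as written, more weighted failures per hour produce a larger value, so in this sketch a lower factor corresponds to fewer weighted failures over the same period.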
Abstract
Embodiments are directed to a method of determining the reliability of a software program by correlating reliability with the performance of the system, through monitoring the entire system and its components. A capture component records the memory usage and CPU utilization of all components at regular intervals and records failures of services by the system. An analyzer examines the events performed to determine which component failed to complete an action and records the failure against that component, enabling identification of the reliability of individual components as well as of the product as a whole.
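As a minimal sketch of the analyzer step described in the abstract, the following code records each failed action against the component that performed it and also keeps a total for the product as a whole. The `Event` fields and the component names are hypothetical and are used only for illustration; the abstract does not define an event format.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    component: str   # component that performed the action (hypothetical name)
    action: str      # action attempted, e.g. "backup" or "restore"
    completed: bool  # whether the action finished successfully


def record_failures(events):
    """Count failed actions per component and for the product as a whole."""
    per_component = Counter(e.component for e in events if not e.completed)
    total = sum(per_component.values())
    return per_component, total


events = [
    Event("dedup_engine", "backup", True),
    Event("storage_node_1", "backup", False),
    Event("storage_node_1", "restore", False),
    Event("index_service", "restore", False),
]

per_component, total = record_failures(events)
print(dict(per_component))  # {'storage_node_1': 2, 'index_service': 1}
print(total)                # 3
```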
18 Claims
11. A system comprising a processor-based executable module configured to derive a reliability metric of a deduplication backup program executed by a backup server in an enterprise-level network including a plurality of backup appliances including storage nodes, comprising:
a processor of the backup server having an administrator module executing first program instructions to monitor a plurality of components in the backup appliances and the deduplication backup program during a monitoring time period;
a capture module of the processor executing second program instructions in the processor to capture data and events of the components through continuous daemon processes running on a periodic basis defined by a measurement period;
a failure analytics module of the processor executing third program instructions to analyze a log of the data and events to capture failure conditions to determine which process failed an expected operation causing failure of a storage node;
a data analyzer of the processor evaluating performance data of the software program to identify a specific failed component of the storage node to distinguish from failure of the deduplication backup program as a whole; and
a reliability module executed by the processor and deriving a reliability measure using a weighted value formula, and deriving a reliability measure through a formula that sums weighted failure types multiplied by a number of failures per a total time period, and as expressed by:
$$\text{Reliability factor} = \sum \frac{\text{Weighting to a failure} \times \text{No. of failures}}{\text{Total time in hours}}$$
to provide a measure of analytical metrics that can be used to analyze overall network performance to identify failure patterns caused by the plurality of components to thereby improve the reliability of the deduplication backup program.
18. A computer program product, comprising a non-transitory computer-readable medium having a computer-readable program code embodied therein, the computer-readable program code adapted to be executed by one or more processors to implement a method for measuring reliability of a deduplication backup program executed by a backup server in an enterprise-level network including a plurality of backup appliances including storage nodes by:
monitoring, in a capture process executed in a processor of a reliability module in the backup server, a plurality of components in the backup appliances and the deduplication backup program for an overall monitoring period;
measuring, in a data analyzer process executed in the processor, a functional performance of each of the plurality of components based on a failure of a storage node due to a failure type of a plurality of failure types, wherein the functional performance is measured through continuous daemon processes on a periodic basis defined by a time period;
comparing the measured functional performance to defined performance values to identify a specific failed component of the storage node to distinguish from failure of the deduplication backup program as a whole; and
deriving a reliability measure by summing a weighted value of a total number of failures of the plurality of failure types per the overall monitoring period through a formula that sums weighted failure types multiplied by the total number of failures per a total time period, and as expressed by:
$$\text{Reliability factor} = \sum \frac{\text{Weighting to a failure} \times \text{No. of failures}}{\text{Total time in hours}}$$
to provide a measure of analytical metrics that can be used to analyze overall network performance to identify failure patterns caused by the plurality of components to thereby improve the reliability of the deduplication backup program.
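The comparing step recited in the claims, in which measured functional performance is checked against defined performance values to single out a failed component, might look like the sketch below. Treating the defined values as upper limits on CPU utilization and memory usage is an assumption made for illustration; the metric names, limits, and component names are hypothetical.

```python
# Hypothetical defined performance values, treated here as upper limits per metric.
DEFINED_LIMITS = {"cpu_percent": 90.0, "memory_mb": 4096.0}


def find_failed_components(samples, limits=DEFINED_LIMITS):
    """Return {component: [metrics over their limit]} for one measurement period."""
    failed = {}
    for component, metrics in samples.items():
        exceeded = [
            name for name, value in metrics.items()
            if name in limits and value > limits[name]
        ]
        if exceeded:
            failed[component] = exceeded
    return failed


# One period's measurements for two hypothetical components.
samples = {
    "storage_node_1": {"cpu_percent": 97.5, "memory_mb": 2048.0},
    "dedup_engine": {"cpu_percent": 40.0, "memory_mb": 1024.0},
}
print(find_failed_components(samples))  # {'storage_node_1': ['cpu_percent']}
```

Because only the component that exceeds a limit is reported, a problem on one storage node is kept separate from a failure of the backup program as a whole, which is the distinction the claims draw.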
Specification