Impact analysis
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for impact analysis. One of the methods includes receiving information about at least two logical datasets, the information identifying, for each logical dataset, a field in that logical dataset and format information about that field. The method includes receiving information about a transformation identifying a first logical dataset from which the transformation is to receive data and a second logical dataset to which the transformed data is provided. The method includes receiving one or more proposed changes to at least one of the fields. The method includes analyzing the proposed changes based on information about the transformation and information about the first logical dataset and the second logical dataset. The method includes calculating metrics of the proposed change based on the analysis. The method also includes storing information about the metrics.
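As a rough illustration of the method summarized above, the following Python sketch models two logical datasets whose fields carry format information, a transformation between them, and a proposed change to one field. The abstract does not specify data structures or particular metrics, so every name here (`Field`, `LogicalDataset`, `Transformation`, `ProposedChange`, `analyze_change`), the format notation, and the choice of metric (a count of affected target fields) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    fmt: str  # format information, e.g. "string(5)" (hypothetical notation)


@dataclass
class LogicalDataset:
    name: str
    fields: dict  # field name -> Field


@dataclass
class Transformation:
    name: str
    source: LogicalDataset  # first logical dataset (data is read from here)
    target: LogicalDataset  # second logical dataset (transformed data goes here)
    mappings: dict          # assumed shape: target field name -> list of source field names


@dataclass(frozen=True)
class ProposedChange:
    dataset: str
    field: str
    new_fmt: str


def analyze_change(change, transformation):
    """Return illustrative metrics for one proposed field change."""
    src, tgt = transformation.source, transformation.target
    if change.dataset == src.name:
        # A source-field change touches every target field derived from it.
        affected = sorted(
            t for t, deps in transformation.mappings.items() if change.field in deps
        )
    elif change.dataset == tgt.name:
        # A target-field change touches only that field, if it exists.
        affected = [change.field] if change.field in tgt.fields else []
    else:
        affected = []
    return {"affected_target_fields": affected, "affected_count": len(affected)}
```

For example, if a hypothetical `zip` field in the source dataset feeds two target fields, `analyze_change` reports both of them and a count of 2. A real system would persist the returned metrics; that step is omitted here.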
17 Claims
1. A computer-implemented method, the method including:
receiving data lineage information for multiple components representing at least two logical datasets and a transformation, with the data lineage information identifying a first logical dataset from which the transformation is to receive data and a second logical dataset to which transformed data is to be provided, the transformation represented by a component in the data lineage information including one or more rules to be applied to data from the first logical dataset, and with the data lineage information identifying paths and flow traces of data through the multiple components;
receiving data specifying one or more proposed changes to a field in the first logical dataset, a field in the second logical dataset, or the transformation;
analyzing the multiple components according to the data lineage information to identify each component affected by the one or more proposed changes;
for at least one identified component identified as affected by the one or more proposed changes, generating an impact metric representing a number of times a field specified by the one or more proposed changes is referenced within the at least one identified component;
determining, based on the generated impact metric, an impact of implementing the one or more proposed changes to one or more of the multiple components; and
storing information about the impact metric.
Dependent claims: 2, 3, 4, 5, 6.
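The steps recited in claim 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the lineage is modeled as a directed graph of components, transformation rules are plain strings, a "reference" is counted as a whole-word match of the field name, only changes to a field in the upstream dataset are handled, and the thresholds that turn the count into an impact level are arbitrary. The names `Component`, `affected_components`, `impact_metric`, `classify_impact`, and `analyze` are hypothetical.

```python
import json
import re
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    kind: str                                        # "dataset" or "transformation"
    rules: list = field(default_factory=list)        # rule text (transformations only)
    downstream: list = field(default_factory=list)   # names of components fed by this one


def affected_components(lineage, start):
    """Breadth-first walk of the lineage, starting at the changed component."""
    seen, queue = [start], deque([start])
    while queue:
        for nxt in lineage[queue.popleft()].downstream:
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen


def impact_metric(component, field_name):
    """Count whole-word references to field_name in the component's rules."""
    pattern = re.compile(rf"\b{re.escape(field_name)}\b")
    return sum(len(pattern.findall(rule)) for rule in component.rules)


def classify_impact(count):
    """Map a reference count to a coarse impact level (arbitrary thresholds)."""
    if count == 0:
        return "none"
    return "low" if count <= 2 else "high"


def analyze(lineage, changed_component, field_name):
    """Build a per-component report of reference counts and impact levels."""
    report = {}
    for name in affected_components(lineage, changed_component):
        count = impact_metric(lineage[name], field_name)
        report[name] = {"references": count, "impact": classify_impact(count)}
    return report


# Hypothetical lineage: orders -> enrich (transformation) -> orders_enriched
lineage = {
    "orders": Component("orders", "dataset", downstream=["enrich"]),
    "enrich": Component(
        "enrich",
        "transformation",
        rules=["total = price * qty", "discounted = price * 0.9 if qty > 10 else price"],
        downstream=["orders_enriched"],
    ),
    "orders_enriched": Component("orders_enriched", "dataset"),
}

report = analyze(lineage, "orders", "price")
stored = json.dumps(report)  # stand-in for storing information about the impact metric
```

In this example the field `price` is referenced three times in the rules of the `enrich` transformation, so that component is rated "high" under the assumed thresholds, while the two dataset components carry no rules and are rated "none".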
7. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
receiving data lineage information for multiple components representing at least two logical datasets and a transformation, with the data lineage information identifying a first logical dataset describing characteristics of a first physical dataset from which the transformation is to receive data and a second logical dataset describing characteristics of a second physical dataset to which transformed data is to be provided, the transformation represented by a component in the data lineage information including one or more rules to be applied to data from the first logical dataset, and with the data lineage information identifying paths and flow traces of data through the multiple components;
receiving data specifying one or more proposed changes to a field in the first logical dataset, a field in the second logical dataset, or the transformation;
analyzing the multiple components according to the data lineage information to identify each component affected by the one or more proposed changes;
for at least one component identified as affected by the one or more proposed changes, generating an impact metric representing a number of times a field specified by the one or more proposed changes is referenced within the at least one identified component;
determining, based on the generated impact metric, an impact of implementing the one or more proposed changes to one or more of the multiple components; and
storing information about the impact metric.
Dependent claims: 8, 9, 10, 11.
12. A system comprising:
one or more processor devices; and
memory operatively coupled to the one or more processor devices, storing a computer program that configures the system to:
receive data lineage information for multiple components representing at least first and second logical datasets and a transformation, with the data lineage information identifying a first physical dataset from which the transformation is to receive data and a second physical dataset to which transformed data is to be provided, the transformation represented by a component in the data lineage information including one or more rules to be applied to data from the first logical dataset, and with the data lineage information identifying paths and flow traces of data through the multiple components;
receive data specifying one or more proposed changes to a field in the first logical dataset, the second logical dataset, or the transformation;
analyze the multiple components according to the data lineage information to identify each component affected by the one or more proposed changes;
for at least one identified component identified as affected by the one or more proposed changes, generate an impact metric representing a number of times a field specified by the one or more proposed changes is referenced within the at least one identified component;
determine, based on the generated impact metric, an impact of implementing one or more of the multiple components; and
store information about the impact metric.
13. A computer storage medium encoded with computer program instructions that when executed by one or more computers cause the one or more computers to perform operations comprising:
receiving data lineage information for multiple components representing at least two logical datasets and a transformation, with the data lineage information identifying a first logical dataset describing characteristics of a first physical dataset from which the transformation is to receive data and a second logical dataset describing characteristics of a second physical dataset to which transformed data is to be provided, the transformation represented by a component in the data lineage information including one or more rules to be applied to data from the first logical dataset, and with the data lineage information identifying paths and flow traces of data through the multiple components;
receiving data specifying one or more proposed changes to a field in the first logical dataset, a field in the second logical dataset, or the transformation;
analyzing the multiple components according to the data lineage information to identify each component affected by the one or more proposed changes;
for at least one component identified that represents a transformation, for at least one identified component identified as affected by the one or more proposed changes, generating an impact metric representing a number of times a field specified by the one or more proposed changes is referenced within the at least one identified component;
determining, based on the generated impact metric, an impact of implementing the one or more proposed changes to one or more of the multiple components; and
storing information about the impact metric.
Dependent claims: 14, 15, 16, 17.
Specification