Method and system of measuring data quality
Abstract
Data quality measurement is provided for use in a data processing stream that comprises at least one upstream data processing system and at least one downstream data processing system. An input alert component can be used to provide a measurement of data prior to its input to a data processing system (e.g., a downstream data processing system or an upstream data processing system). An output alert component can be used to provide a measurement of data output by a data processing system. A self-consistency component can be used to measure consistency between items of input data or between items of output data. An end-to-end component can be used to measure data quality using data items from both input data and output data. These components can be used in combination or independently of one another, and in any order. In addition, the data quality measurements can be performed separately from the processing performed by either the upstream or the downstream processing system.
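As an illustration only, the self-consistency and end-to-end measurements described in the abstract might be sketched as follows. The record fields (`quantity`, `unit_price`, `subtotal`, `tax`, `total`), the function names, and the invoice scenario are all hypothetical and do not come from the patent; the sketch assumes each predefined relationship can be expressed as a simple predicate.

```python
from dataclasses import dataclass
from typing import Callable, Mapping

Record = Mapping[str, float]


@dataclass
class CheckResult:
    name: str
    passed: bool


def self_consistency(name: str, record: Record,
                     relation: Callable[[Record], bool]) -> CheckResult:
    """Validate two or more items of the same record against each other."""
    return CheckResult(name, bool(relation(record)))


def end_to_end(name: str, upstream: Record, processed: Record,
               relation: Callable[[Record, Record], bool]) -> CheckResult:
    """Validate a processed item against an item of the upstream data."""
    return CheckResult(name, bool(relation(upstream, processed)))


def run_checks(upstream: Record, processed: Record) -> list:
    """Run both kinds of check and return the names of those that failed."""
    results = [
        # Hypothetical relationship: subtotal = quantity * unit_price.
        end_to_end(
            "subtotal matches upstream order", upstream, processed,
            lambda up, out: abs(out["subtotal"]
                                - up["quantity"] * up["unit_price"]) < 1e-9),
        # Hypothetical relationship: total = subtotal + tax.
        self_consistency(
            "total equals subtotal plus tax", processed,
            lambda out: abs(out["total"]
                            - (out["subtotal"] + out["tax"])) < 1e-9),
    ]
    return [r.name for r in results if not r.passed]


# Hypothetical data: an upstream order and the invoice derived downstream.
upstream_record = {"quantity": 3, "unit_price": 20.0}
processed_record = {"subtotal": 60.0, "tax": 6.0, "total": 66.0}
failed = run_checks(upstream_record, processed_record)
```

Because the checks only read the two records, they can run outside either processing system, which is in line with the abstract's remark that measurement can be performed separately from the processing itself.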
22 Citations
26 Claims
1. A data quality measurement method for use in a data processing stream comprising at least one upstream data processing system, which outputs data for use by at least one downstream data processing system, the method comprising:
validating data processed by an upstream data processing system prior to processing of the data by a data processing system downstream of the upstream data processing system, such that at least one property of the upstream data is validated; and
performing validation on data processed by the downstream data processing system, comprising:
validating at least one data item from the processed data using a predefined relationship between the at least one data item from the processed data and at least one data item from the upstream data; and
validating two or more data items from the processed data using a predefined relationship between the two or more data items.
Dependent claims: 2-9, 11-17
10. A data quality measurement method for use in a data processing stream comprising at least one upstream data processing system, which outputs data for use by at least one downstream data processing system, the method comprising the steps of:
performing a pre-processing data quality measurement on upstream data, comprising comparing at least one property of the upstream data to a benchmark corresponding to the at least one property of the upstream data; and
performing a processed data quality measurement on data resulting from processing the upstream data, the processed data quality measurement comprising:
comparing at least one property of the resulting data with a benchmark corresponding to the at least one property of the resulting data;
validating at least one resulting data item using a predefined relationship between the at least one resulting data item and at least one upstream data item; and
validating two or more resulting data items using a predefined relationship between the two or more resulting data items.
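The benchmark comparison recited in claim 10 could look roughly like the sketch below. It is not taken from the patent: the choice of properties (batch size and mean), the relative tolerance of 10%, and the names `within_benchmark` and `measure_batch` are assumptions made for illustration.

```python
from statistics import mean


def within_benchmark(value: float, benchmark: float, tolerance: float) -> bool:
    """True when value deviates from benchmark by at most `tolerance` (relative)."""
    if benchmark == 0:
        return value == 0
    return abs(value - benchmark) / abs(benchmark) <= tolerance


def measure_batch(values, benchmarks, tolerance=0.10):
    """Compare properties of a non-empty batch with their benchmarks.

    `benchmarks` maps a property name ("count" or "mean", both hypothetical
    choices) to its expected value. Returns a pass/fail flag per property.
    """
    properties = {"count": len(values), "mean": mean(values)}
    return {
        name: within_benchmark(properties[name], expected, tolerance)
        for name, expected in benchmarks.items()
    }


# Hypothetical benchmarks, e.g. derived from historical batches.
benchmarks = {"count": 4, "mean": 10.0}

# Pre-processing measurement on upstream data.
pre_result = measure_batch([10.0, 12.0, 11.0, 9.0], benchmarks)

# Processed-data measurement; this batch is short and its mean has drifted.
post_result = measure_batch([20.0, 22.0], benchmarks)
```

The same function serves both the pre-processing and the processed-data measurement here; in practice each stage would likely have its own benchmarks and tolerances.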
18. A data quality measurement device for use in a data processing stream, the data processing stream comprising at least one upstream data processing system, which outputs data for use by at least one downstream data processing system, the device comprising:
an alert module configured to validate data processed by an upstream data processing system prior to processing of the data by a data processing system downstream of the upstream data processing system, such that at least one property of the upstream data is validated;
an end-to-end module configured to validate at least one data item from data processed by the downstream data processing system using a predefined relationship between the at least one data item from the processed data and at least one data item from the upstream data; and
a self-consistency module configured to validate two or more data items from the processed data using a predefined relationship between the two or more data items.
Dependent claims: 19-26
Specification