Managing data integrity using a filter condition
First Claim
1. A computer-implemented method for managing data integrity, the method comprising:
identifying data objects stored in a first data management system that meet a filter condition, each data object including an identifier that uniquely identifies a data object from other data objects in the first data management system, a set of attributes, and attribute values that correspond to attributes in the attribute set, and the filter condition comprising at least one value of an attribute, the attribute corresponding to an attribute in the attribute set;
identifying data objects stored in a second data management system that meet the filter condition, each data object including an identifier that uniquely identifies a data object from other data objects in the second data management system, a set of attributes, and attribute values that correspond to attributes in the attribute set;
accessing data that identifies excluded data objects that are excluded from managing data integrity even when the excluded data objects meet the filter condition;
based on the accessed data, determining a subset of the data objects identified from the first data management system as meeting the filter condition that excludes the excluded data objects and a subset of the data objects identified from the second data management system as meeting the filter condition that excludes the excluded data objects;
comparing identifiers from the subset of data objects identified from the first data management system with identifiers from the subset of data objects identified from the second data management system to determine whether each data object in one of the data management systems has a corresponding data object in the other data management system;
storing, in electronic storage, comparison results information indicating results of the comparison of the identifiers from the subset of data objects identified from the first data management system with the identifiers from the subset of data objects identified from the second data management system, the comparison results information indicating whether each data object that is included in at least one of the subsets and meets the filter condition is stored in the first data management system, the second data management system, or both the first data management system and the second data management system;
accessing the comparison results information; and
managing the integrity of the identified data objects based on the accessed comparison results information such that each data object that is included in at least one of the subsets and meets the filter condition is included both in the first data management system and in the second data management system, managing the integrity comprising:
when a data object that is included in at least one of the subsets and meets the filter condition occurs only in the first data management system, sending the data object from the first data management system to the second data management system,
when a data object that is included in at least one of the subsets and meets the filter condition occurs only in the second data management system, sending the data object from the second data management system to the first data management system,
when a data object that is included in at least one of the subsets and meets the filter condition occurs in both the first data management system and in the second data management system, determining whether a first set of attribute values associated with the data object in the first data management system is equal to a second set of attribute values associated with the data object in the second data management system, and
when the first set of attribute values is not equal to the second set of attribute values, sending the data object from the first data management system to the second data management system.
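The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not taken from the patent: it assumes each data management system is an in-memory dictionary mapping identifiers to attribute dictionaries, that the filter condition is a single attribute/value pair, and that "sending" an object means copying it. The names `select`, `compare_identifiers`, and `manage_integrity` are hypothetical.

```python
def select(system, filter_attr, filter_value, excluded):
    """Objects that meet the filter condition, minus the excluded identifiers."""
    return {
        obj_id: attrs
        for obj_id, attrs in system.items()
        if attrs.get(filter_attr) == filter_value and obj_id not in excluded
    }


def compare_identifiers(first_subset, second_subset):
    """Comparison results: where each filtered identifier was found."""
    results = {}
    for obj_id in sorted(first_subset.keys() | second_subset.keys()):
        if obj_id in first_subset and obj_id in second_subset:
            results[obj_id] = "both"
        elif obj_id in first_subset:
            results[obj_id] = "first_only"
        else:
            results[obj_id] = "second_only"
    return results


def manage_integrity(first, second, filter_attr, filter_value, excluded):
    """Compare the two filtered subsets, then bring both systems into line."""
    first_subset = select(first, filter_attr, filter_value, excluded)
    second_subset = select(second, filter_attr, filter_value, excluded)
    # Stands in for "storing comparison results information" (assumption).
    results = compare_identifiers(first_subset, second_subset)
    for obj_id, where in results.items():
        if where == "first_only":
            second[obj_id] = dict(first[obj_id])   # send first -> second
        elif where == "second_only":
            first[obj_id] = dict(second[obj_id])   # send second -> first
        elif first[obj_id] != second[obj_id]:
            second[obj_id] = dict(first[obj_id])   # values differ: first wins
    return results


# Illustrative data only.
first = {
    "A": {"region": "EU", "name": "a"},
    "B": {"region": "EU", "name": "b1"},
    "X": {"region": "EU", "name": "x"},   # meets the filter but is excluded
    "U": {"region": "US", "name": "u"},   # does not meet the filter
}
second = {
    "B": {"region": "EU", "name": "b2"},
    "C": {"region": "EU", "name": "c"},
}
results = manage_integrity(first, second, "region", "EU", excluded={"X"})
```

In this sketch the first system takes precedence whenever attribute values differ, mirroring the final step of the claim; the excluded object `X` and the non-matching object `U` are never copied.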
Abstract
Techniques are provided to manage the integrity of data stored in two or more data management systems by detecting inconsistencies between the data management systems. The techniques identify missing records in one or more data management systems by comparing the records in the data management systems. A filter condition is used to identify the records to be compared. For records that exist in two or more data management systems, the techniques identify records that are not identical in the data management systems. A user checkpoint is provided between the identification of missing records and the identification of records that are not identical. The detected inconsistencies also may be corrected.
41 Claims
1. A computer-implemented method for managing data integrity (recited in full under "First Claim"). Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 36, 37, 38, 39, 40, 41.
17. A computer system having embodied thereon a computer program configured to manage data integrity, the computer system comprising one or more code segments configured to:
identify data objects stored in a first data management system that meet a filter condition, each data object including an identifier that uniquely identifies a data object from other data objects in the first data management system, a set of attributes, and attribute values that correspond to attributes in the attribute set, and the filter condition comprising at least one value of an attribute, the attribute corresponding to an attribute in the attribute set;
identify data objects stored in a second data management system that meet the filter condition, each data object including an identifier that uniquely identifies a data object from other data objects in the second data management system, a set of attributes, and attribute values that correspond to attributes in the attribute set;
access data that identifies excluded data objects that are excluded from managing data integrity even when the excluded data objects meet the filter condition;
based on the accessed data, determine a subset of the data objects identified from the first data management system as meeting the filter condition that excludes the excluded data objects and a subset of the data objects identified from the second data management system as meeting the filter condition that excludes the excluded data objects;
compare identifiers from the subset of data objects identified from the first data management system with identifiers from the subset of data objects identified from the second data management system to determine whether each data object in one of the data management systems has a corresponding data object in the other data management system;
store, in electronic storage, comparison results information indicating results of the comparison of the identifiers from the subset of data objects identified from the first data management system with the identifiers from the subset of data objects identified from the second data management system, the comparison results information indicating whether each data object that is included in at least one of the subsets and meets the filter condition is stored in the first data management system, the second data management system, or both the first data management system and the second data management system;
access the comparison results information; and
manage the integrity of the identified data objects based on the accessed comparison results information such that each data object that is included in at least one of the subsets and meets the filter condition is included both in the first data management system and in the second data management system, managing the integrity comprising:
when a data object that is included in at least one of the subsets and meets the filter condition occurs only in the first data management system, sending the data object from the first data management system to the second data management system,
when a data object that is included in at least one of the subsets and meets the filter condition occurs only in the second data management system, sending the data object from the second data management system to the first data management system,
when a data object that is included in at least one of the subsets and meets the filter condition occurs in both the first data management system and in the second data management system, determining whether a first set of attribute values associated with the data object in the first data management system is equal to a second set of attribute values associated with the data object in the second data management system, and
when the first set of attribute values is not equal to the second set of attribute values, sending the data object from the first data management system to the second data management system.
Dependent claims: 18, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32.
19. The computer system of claim 17 wherein the filter condition is a user-definable filter condition.
33. A computer system having embodied thereon a computer program configured to manage data integrity, the computer system comprising one or more code segments configured to:
receive a filter condition from a user, the filter condition comprising at least one value of an attribute, the filter attribute corresponding to an attribute included in a set of attributes in one or more data objects stored in a leading data management system, each data object including an identifier that uniquely identifies a data object from other data objects in the leading data management system, and each data object including attribute values that correspond to attributes in the attribute set;
extract from the leading data management system a first group of identifiers of data objects, each data object meeting the filter condition;
extract, from a contrast data management system that includes stored data objects having i) an identifier that uniquely identifies a data object from other data objects in the contrast data management system, ii) a set of attributes and iii) attribute values that correspond to attributes in the attribute set, a second group of identifiers of data objects, each data object meeting the filter condition;
access data that identifies excluded data objects that are excluded from managing data integrity even when the excluded data objects meet the filter condition;
based on the accessed data, determine the data objects identified from the leading data management system as meeting the filter condition that excludes the excluded data objects and a subset of the data objects identified from the contrast data management system as meeting the filter condition that excludes the excluded data objects;
compare the first group of identifiers of data objects extracted from the leading data management system with the second group of identifiers of data objects extracted from the contrast data management system to determine whether each identifier in the first group represents the same data object as identified by an identifier in the second group of identifiers;
store, in electronic storage, comparison results information indicating results of the comparison of the first group of identifiers of data objects extracted from the leading data management system with the second group of identifiers of data objects extracted from the contrast data management system, the comparison results information indicating whether each data object that meets the filter condition is stored in the leading data management system, the contrast data management system, or both the leading data management system and the contrast data management system;
access the comparison results information;
based on the accessed comparison results information, present a results table, each row in the results table representing a particular data object that meets the filter condition and including information indicative of whether the particular data object that meets the filter condition is stored in only the leading data management system, only the contrast data management system, or both the leading data management system and the contrast data management system;
determine a number of the data objects that meet the filter condition and are stored in only the leading data management system;
determine a number of the data objects that meet the filter condition and are stored in only the contrast data management system;
determine a number of the data objects that meet the filter condition and are stored in both the leading data management system and the contrast data management system;
display, along with the results table, the number of the data objects that meet the filter condition and are stored in only the leading data management system, the number of the data objects that meet the filter condition and are stored in only the contrast data management system, and the number of the data objects that meet the filter condition and are stored in both the leading data management system and the contrast data management system;
permit the user to determine whether to proceed with managing data integrity based on the presented results table, the displayed number of the data objects that meet the filter condition and are stored in only the leading data management system, the displayed number of the data objects that meet the filter condition and are stored in only the contrast data management system, and the displayed number of the data objects that meet the filter condition and are stored in both the leading data management system and the contrast data management system;
after receiving a first indication from the user to proceed, i) extract from the leading data management system a first group of attribute value sets, each attribute value set being associated with a single data object that is included in both the leading data management system and the contrast data management system and that meets the filter condition, ii) extract from the contrast data management system a second group of attribute value sets, each attribute value set being associated with a particular object that is included in both the leading data management system and the contrast data management system and that meets the filter condition, and iii) for each attribute value set extracted from the leading data management system, identify the attribute value set from the contrast data management system that represents the same data object and compare the attribute value set from the leading data management system with the attribute value set from the contrast data management system to determine whether the attribute value sets match;
update the results table to present, for each row in the results table representing a data object stored in both the leading data management system and the contrast data management system, an indication whether the attribute value set from the leading data management system matches the attribute value set from the contrast data management system;
determine a number of data objects that are stored in both the leading data management system and the contrast data management system and that have matching attribute value sets;
determine a number of data objects that are stored in both the leading data management system and the contrast data management system and that have differing attribute value sets;
display, along with the updated results table, the number of data objects that are stored in both the leading data management system and the contrast data management system and that have matching attribute value sets and the number of data objects that are stored in both the leading data management system and the contrast data management system and that have differing attribute value sets;
permit the user to determine whether to correct inconsistencies in data objects that meet the filter condition based on the updated results table, the displayed number of data objects that are stored in both the leading data management system and the contrast data management system and that have matching attribute value sets, and the displayed number of data objects that are stored in both the leading data management system and the contrast data management system and that have differing attribute value sets; and
after receiving a second indication from the user to proceed, correcting inconsistencies in data objects that meet the filter condition.
Dependent claims: 34, 35.
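The two-stage flow of claim 33 (identifier comparison, a user checkpoint, attribute comparison, a second checkpoint, then correction) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: systems are modelled as dictionaries, `meets_filter`, `confirm`, and `correct` are hypothetical callbacks supplied by the caller, and the claim itself does not say how inconsistencies are corrected.

```python
from collections import Counter


def header_comparison(leading_ids, contrast_ids):
    """Stage 1: results table keyed by identifier, plus per-location counts."""
    rows = {}
    for obj_id in sorted(leading_ids | contrast_ids):
        if obj_id in leading_ids and obj_id in contrast_ids:
            rows[obj_id] = "both"
        elif obj_id in leading_ids:
            rows[obj_id] = "leading_only"
        else:
            rows[obj_id] = "contrast_only"
    return rows, Counter(rows.values())


def detail_comparison(rows, leading, contrast):
    """Stage 2: compare attribute value sets for objects found in both systems."""
    details = {
        obj_id: "match" if leading[obj_id] == contrast[obj_id] else "differ"
        for obj_id, where in rows.items()
        if where == "both"
    }
    return details, Counter(details.values())


def run_checkpointed(leading, contrast, meets_filter, excluded, confirm, correct):
    """Run both stages, asking `confirm` before each further step."""
    leading_ids = {i for i, a in leading.items() if meets_filter(a)} - excluded
    contrast_ids = {i for i, a in contrast.items() if meets_filter(a)} - excluded
    rows, counts = header_comparison(leading_ids, contrast_ids)
    # First user checkpoint: table and counts are shown before any detail work.
    if not confirm("compare attribute values", rows, counts):
        return rows, counts, None, None
    details, detail_counts = detail_comparison(rows, leading, contrast)
    # Second user checkpoint: correction happens only on explicit approval.
    if confirm("correct inconsistencies", details, detail_counts):
        correct(rows, details)
    return rows, counts, details, detail_counts


# Illustrative data only; the callbacks below approve every step.
leading = {"1": {"t": "x", "v": 1}, "2": {"t": "x", "v": 2}, "3": {"t": "x", "v": 3}}
contrast = {"2": {"t": "x", "v": 2}, "3": {"t": "x", "v": 9}, "4": {"t": "x", "v": 4}}
corrected = []
rows, counts, details, detail_counts = run_checkpointed(
    leading,
    contrast,
    meets_filter=lambda attrs: attrs.get("t") == "x",
    excluded=set(),
    confirm=lambda stage, table, numbers: True,
    correct=lambda table, detail: corrected.append((table, detail)),
)
```

Passing the checkpoints in as a callback keeps the comparison logic independent of any user interface; a declined first checkpoint returns the identifier-level table without extracting any attribute values.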
Specification