Understanding data in data sets
Abstract
Among other things, there are two or more data sets. Each of the data sets contains data that can be interpreted as records each having data values for data fields. Each of the data sets contains at least some data that is related to data in at least one of the other data sets. The data in different data sets is organized or expressed possibly differently. Each of the data sets is susceptible to a definition of a key for the records of the data set. The data sets are characterized by repetitions of at least one of (a) records, (b) portions of keys, or (c) instances of values for data fields. Information about at least one of the repetitions is provided to a user.
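The three kinds of repetition named in the abstract can be illustrated with a minimal sketch. It assumes records are held as Python dictionaries and that a key is a tuple of data field identifiers; the sample data and the function names `key_repetitions` and `value_repetitions` are hypothetical and do not come from the patent.

```python
from collections import Counter

# Hypothetical sample data set: each record maps field identifiers to values.
orders = [
    {"order_id": 1, "customer": "A", "item": "pen"},
    {"order_id": 1, "customer": "A", "item": "ink"},
    {"order_id": 2, "customer": "B", "item": "pen"},
]

def key_repetitions(records, key_fields):
    """Count combinations of key-field values that occur more than once."""
    counts = Counter(tuple(r.get(f) for f in key_fields) for r in records)
    return {combo: n for combo, n in counts.items() if n > 1}

def value_repetitions(records, field):
    """Count values of a single data field that occur more than once."""
    counts = Counter(r.get(field) for r in records)
    return {value: n for value, n in counts.items() if n > 1}

# Repetition of a portion of a key: order_id alone repeats.
partial_key_repeats = key_repetitions(orders, ("order_id",))
# The full (order_id, item) combination does not repeat.
full_key_repeats = key_repetitions(orders, ("order_id", "item"))
# Repetition of instances of values for a data field.
item_repeats = value_repetitions(orders, "item")
```

Here `order_id` on its own repeats while the combination of `order_id` and `item` does not, which is the kind of information the abstract says is provided to a user.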
24 Citations
32 Claims
1. A computer-implemented method comprising:
receiving two or more data sets, each of the data sets containing data that can be interpreted as records, the records of each of the data sets each having data values for data fields of the records, the records of each of the data sets having at least one data field that is different from all data fields of the records of another one of the data sets, each data field being identified by a data field identifier, at least one of the data fields of the records of each of the data sets being related to at least one of the data fields of the records of at least one of the other data sets, the data of different ones of the data sets being organized or expressed possibly differently,
determining a key for each of the data sets based on one or more data field identifiers of data fields of the data set, the keys for different data sets being different, the data sets being characterized by repetitions of at least one of (a) records, (b) portions of keys, or (c) instances of values for data fields, the keys of the two or more data sets containing information about at least one of the repetitions, and
providing the information about at least one of the repetitions based on the keys of the data sets.
Dependent claims: 2-22.
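The "determining a key" step of claim 1 can be read in several ways, since the claim does not say how the key is chosen. The sketch below takes one possible reading as an assumption: the key is the smallest combination of data field identifiers whose values distinguish every record. The helpers `find_key` and `key_portion_repetitions` and the two sample data sets are hypothetical illustrations, not the patented method.

```python
from collections import Counter
from itertools import combinations

def find_key(records, field_ids):
    """Return the smallest tuple of field identifiers whose combined
    values are unique for every record, or None if there is none."""
    for size in range(1, len(field_ids) + 1):
        for combo in combinations(field_ids, size):
            values = [tuple(r[f] for f in combo) for r in records]
            if len(set(values)) == len(values):
                return combo
    return None

def key_portion_repetitions(records, key):
    """For each field in the key, report the values that repeat."""
    report = {}
    for field in key:
        counts = Counter(r[field] for r in records)
        report[field] = {v: n for v, n in counts.items() if n > 1}
    return report

# Two hypothetical data sets related through the shared field cust_id.
customers = [
    {"cust_id": "A", "city": "Oslo"},
    {"cust_id": "B", "city": "Oslo"},
]
order_lines = [
    {"order_id": 1, "line": 1, "cust_id": "A"},
    {"order_id": 1, "line": 2, "cust_id": "A"},
    {"order_id": 2, "line": 1, "cust_id": "B"},
]

customer_key = find_key(customers, ["cust_id", "city"])
order_key = find_key(order_lines, ["order_id", "line", "cust_id"])
order_key_repeats = key_portion_repetitions(order_lines, order_key)
```

In this example the two data sets end up with different keys, and the multi-field key of `order_lines` shows which portions of the key repeat across records.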
23. A computer-implemented method comprising:
receiving a data set containing data that can be interpreted as records, the records of the data set each having data values for data fields of the records, the data set being characterized by any arbitrary number of repetitions of instances of values for at least one of the data fields, each data field being identified by a data field identifier,
determining a key for the data set based on one or more data field identifiers of data fields of the data set, the key being different from any other key determined for any other data set and containing information about at least one of the repetitions, and
providing the information about at least one of the repetitions based on the key.
Dependent claims: 24-30.
31. A non-transitory medium bearing
an integrated file of data records acquired from a transitory source or a non-transitory source and applied to particular hardware to realize the functionality of the data records, the integrated file being formed by integrating at least two data sets, each of the data sets containing data that can be interpreted as records, the records of each of the data sets each having data values for data fields, the records of each of the data sets having at least one data field that is different from all data fields of the records of another one of the data sets, each data field being identified by a data field identifier, at least one of the data fields of the records of each of the data sets being related to at least one of the data fields of the records of at least one of the other data sets, the data of different ones of the data sets being organized or expressed possibly differently, each of the data sets being susceptible to a key determined based on one or more data field identifiers of data fields of the data set, the keys of different data sets being different, the data sets being characterized by repetitions of at least one of (a) records, (b) portions of keys, or (c) instances of values for data fields, the keys containing information about at least one of the repetitions,
a file key for the integrated file of data records, the file key being formed based on the different keys of the two or more data sets.
32. A computer-implemented method comprising
receiving two or more data files, each of the data files containing data that can be interpreted as records, the records of each of the data files each having data values for data fields of the records, the records of each of the data files having at least one data field that is different from all data fields of the records of another one of the data sets, each data field being identified by a data field identifier, at least one of the data fields of the records of each of the data files being related to at least one of the data fields of the records of at least one of the other data files, the data of at least two of the data files being expressed according to two different file formats,
determining a key for each of the data files based on one or more data field identifiers of data fields of the data files, the keys of different data files being different, the data files being characterized by repetitions of at least one of (a) records, (b) portions of keys, or (c) instances of values for data fields, the keys for the records of the two or more data files containing information about at least one of the repetitions,
displaying to the user records of the data files, identifications of the fields of the records, and indications of the repetitions in data files including repeated instances of values for data fields, and
enabling the user to create an integrated file of records that includes the data of the data files and information about the repetitions.
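As a rough illustration of claim 32, the sketch below integrates two data files expressed in different formats (CSV and JSON) through a shared data field and annotates each integrated record with how often its shared value repeats. The inline file contents, the `repeat_count` field, and the functions `load_csv`, `load_json`, and `integrate` are all assumptions made for the example; the claim itself does not prescribe this approach.

```python
import csv
import io
import json
from collections import Counter

# Hypothetical file contents in two different formats.
CSV_TEXT = "cust_id,city\nA,Oslo\nB,Bergen\n"
JSON_TEXT = (
    '[{"order_id": 1, "cust_id": "A"},'
    ' {"order_id": 2, "cust_id": "A"},'
    ' {"order_id": 3, "cust_id": "B"}]'
)

def load_csv(text):
    """Parse CSV text into a list of records (dictionaries)."""
    return list(csv.DictReader(io.StringIO(text)))

def load_json(text):
    """Parse a JSON array of objects into a list of records."""
    return json.loads(text)

def integrate(parents, children, shared_field):
    """Merge each child record with its related parent record and
    record how many child records share the same related value."""
    by_value = {p[shared_field]: p for p in parents}
    repeats = Counter(c[shared_field] for c in children)
    merged = []
    for child in children:
        row = {**by_value.get(child[shared_field], {}), **child}
        row["repeat_count"] = repeats[child[shared_field]]
        merged.append(row)
    return merged

integrated = integrate(load_csv(CSV_TEXT), load_json(JSON_TEXT), "cust_id")
```

Each row of `integrated` carries the data of both files plus a repetition indicator, so a value such as customer `A` appearing in two orders remains visible in the integrated records.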
Specification