Fuzzy data operations
Abstract
A method for clustering data elements stored in a data storage system includes reading data elements from the data storage system. Clusters of data elements are formed with each data element being a member of at least one cluster. At least one data element is associated with two or more clusters. Membership of the data element belonging to respective ones of the two or more clusters is represented by a measure of ambiguity. Information is stored in the data storage system to represent the formed clusters.
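The abstract does not say how the measure of ambiguity is computed. As a minimal sketch only, the following assumes that each cluster has a representative string, that similarity is a `difflib` ratio, and that an element's membership is split across every cluster it is close to, with the shares summing to 1. The names `form_clusters`, `representatives`, and `threshold` are hypothetical and do not come from the patent.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (an assumed stand-in metric)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def form_clusters(elements, representatives, threshold=0.6):
    """Assign each element to one or more clusters with fractional membership.

    Returns {representative: {element: membership}}, where the memberships
    of a single element across all clusters sum to 1. The fractional
    membership plays the role of the "measure of ambiguity" here; that
    interpretation is an assumption of this sketch.
    """
    clusters = {rep: {} for rep in representatives}
    for element in elements:
        scores = {rep: similarity(element, rep) for rep in representatives}
        close = {rep: s for rep, s in scores.items() if s >= threshold}
        if not close:
            # Every element must belong to at least one cluster, so fall
            # back to the single best-scoring representative.
            best = max(scores, key=scores.get)
            close = {best: scores[best]}
        total = sum(close.values())
        for rep, score in close.items():
            # Split membership in proportion to similarity; split evenly
            # if all scores are zero.
            weight = score / total if total else 1.0 / len(close)
            clusters[rep][element] = weight
    return clusters
```

An element that is equally close to two representatives ends up in both clusters with membership 0.5 each, while an element close to only one representative gets membership 1.0 in that cluster.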
107 Citations
19 Claims
1. A method for joining data elements from two or more datasets stored in at least one data storage system, the method performed by one or more processors and including:
determining matches between one or more variants of objects in first data elements from a first dataset and objects in second data elements from a second dataset, with a variant being in accordance with a variant relation between one or more of the objects in the first data elements from the first dataset and one or more of the objects in the second data elements from the second dataset;
evaluating respective second data elements having respective objects determined as matches;
joining at least one of the first data elements from the first dataset with at least one of the second data elements from the second dataset to produce a third data element, with the joining based on the evaluation of the respective second data elements; and
outputting a third dataset including at least a portion of the first data set combined with at least a portion of the second dataset, with the third dataset including one or more third data elements produced by joining.
Dependent claims: 2, 3, 4, 5, 6
7. A system for joining data elements from two or more datasets stored in at least one data storage system, the system including:
means for determining matches between one or more variants of objects in first data elements from a first dataset and objects in second data elements from a second dataset, with a variant being in accordance with a variant relation between one or more of the objects in the first data elements from the first dataset and one or more of the objects in the second data elements from the second dataset;
means for evaluating respective second data elements having respective objects determined as matches;
means for joining at least one of the first data elements from the first dataset with at least one of the second data elements from the second dataset to produce a third data element, with the joining based on the evaluation of the respective second data elements; and
means for outputting a third dataset including at least a portion of the first data set combined with at least a portion of the second dataset, with the third dataset including one or more third data elements produced by joining.
8. A computer-readable hardware storage device storing a computer program for joining data elements from two or more datasets stored in at least one data storage system, the computer program including instructions for causing a computer to:
determine matches between one or more variants of objects in first data elements from a first dataset and objects in second data elements from a second dataset, with a variant being in accordance with a variant relation between one or more of the objects in the first data elements from the first dataset and one or more of the objects in the second data elements from the second dataset;
evaluate respective second data elements having respective objects determined as matches;
join at least one of the first data elements from the first dataset with at least one of the second data elements from the second dataset to produce a third data element, with the joining based on the evaluation of the respective second data elements; and
output a third dataset including at least a portion of the first data set combined with at least a portion of the second dataset, with the third dataset including one or more third data elements produced by joining.
Dependent claims: 9, 10, 11, 12, 13
14. A system including:
a computer; and
computer-readable hardware storage device storing a computer program for joining data elements from two or more datasets stored in at least one data storage system, the computer program including instructions for causing the computer to:
determine matches between one or more variants of objects in first data elements from a first dataset and objects in second data elements from a second dataset, with a variant being in accordance with a variant relation between one or more of the objects in the first data elements from the first dataset and one or more of the objects in the second data elements from the second dataset;
evaluate respective second data elements having respective objects determined as matches;
join at least one of the first data elements from the first dataset with at least one of the second data elements from the second dataset to produce a third data element, with the joining based on the evaluation of the respective second data elements; and
output a third dataset including at least a portion of the first data set combined with at least a portion of the second dataset, with the third dataset including one or more third data elements produced by joining.
Dependent claims: 15, 16, 17, 18, 19
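The four steps recited in the independent claims (determine matches through variants, evaluate the matched second data elements, join, output) can be illustrated with a small sketch. This is not the patented implementation. It assumes records are dictionaries, the "object" is the value of a key field, the variant relation is a precomputed mapping from a value to its accepted variants, and the evaluation scores a secondary field with a `difflib` ratio. The names `fuzzy_join`, `variants`, `compare_field`, and `min_score` are hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy_join(first, second, variants, key="name",
               compare_field="city", min_score=0.5):
    """Join records whose key values match exactly or through a variant.

    first, second: lists of dicts (the first and second datasets).
    variants: dict mapping a key value to an iterable of its variants
              (an assumed, precomputed form of the variant relation).
    Returns the third dataset as a list of merged dicts.
    """
    # Index the second dataset by its key value for direct lookup.
    index = {}
    for record in second:
        index.setdefault(record[key], []).append(record)

    third = []
    for left in first:
        # Step 1: determine matches via the value itself and its variants.
        lookup_keys = [left[key]] + list(variants.get(left[key], ()))
        candidates = [r for k in lookup_keys for r in index.get(k, [])]
        if not candidates:
            continue  # no match: this sketch behaves like an inner join

        # Step 2: evaluate each matched second data element.
        def score(right):
            return SequenceMatcher(
                None, str(left[compare_field]), str(right[compare_field])
            ).ratio()

        best = max(candidates, key=score)
        best_score = score(best)
        if best_score < min_score:
            continue

        # Step 3: join into a third data element; fields from the second
        # record are prefixed so they do not overwrite the first record's.
        joined = {f"second_{k}": v for k, v in best.items()}
        joined.update(left)
        joined["match_score"] = best_score
        third.append(joined)

    # Step 4: output the third dataset.
    return third
```

Choosing only the best-scoring candidate is one design option among several; an alternative consistent with the claim language would emit one third data element per candidate that passes the evaluation.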
Specification