Fuzzy Data Operations
Abstract
A method for clustering data elements stored in a data storage system includes reading data elements from the data storage system. Clusters of data elements are formed with each data element being a member of at least one cluster. At least one data element is associated with two or more clusters. Membership of the data element belonging to respective ones of the two or more clusters is represented by a measure of ambiguity. Information is stored in the data storage system to represent the formed clusters.
52 Claims
1. A method for clustering data elements stored in a data storage system, the method including:
reading data elements from the data storage system;
forming clusters of data elements with each data element being a member of at least one cluster;
associating at least one data element with two or more clusters, with memberships of the data element belonging to respective ones of the two or more clusters being represented by a measure of ambiguity; and
storing information in the data storage system to represent the formed clusters.
Dependent claims: 2-21.
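
As an illustration only, the clustering steps of claim 1 can be sketched in Python. The claim does not define the measure of ambiguity; this sketch assumes it is a fractional membership per cluster, normalized so that one element's memberships sum to 1. The names `similarity` and `form_fuzzy_clusters`, the fixed cluster representatives, the threshold of 0.6, and the use of `difflib.SequenceMatcher` as the scoring function are all hypothetical choices, and in-memory lists stand in for the data storage system.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (stand-in scoring function)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def form_fuzzy_clusters(elements, representatives, threshold=0.6):
    """Return {element: {representative: membership}}.

    An element joins every cluster whose representative scores at least
    `threshold` (assumed positive). Its memberships are normalized to sum
    to 1; that split is the assumed measure of ambiguity.
    """
    clusters = {}
    for element in elements:
        scores = {rep: similarity(element, rep) for rep in representatives}
        kept = {rep: s for rep, s in scores.items() if s >= threshold}
        if not kept:
            # Each element must belong to at least one cluster, so fall back
            # to the single best-scoring representative with full membership.
            kept = {max(scores, key=scores.get): 1.0}
        total = sum(kept.values())
        clusters[element] = {rep: s / total for rep, s in kept.items()}
    return clusters


# In-memory lists stand in for reading from and storing to the data storage system.
representatives = ["ACME Services Ltd", "ACME Holdings Ltd"]
elements = ["ACME Services Ltd", "ACME Ltd", "Zenith Corp"]
clusters = form_fuzzy_clusters(elements, representatives)
for element, memberships in clusters.items():
    print(element, memberships)
```

In this sample, "ACME Ltd" is close enough to both representatives to be a partial member of both clusters, while "Zenith Corp" matches neither and is assigned wholly to its best-scoring cluster.
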
22. A system for clustering data elements stored in a data storage system, the system including:
means for reading data elements from the data storage system;
means for forming clusters of data elements with each data element being a member of at least one cluster;
means for associating at least one data element with two or more clusters, with memberships of the data element belonging to respective ones of the two or more clusters being represented by a measure of ambiguity; and
means for storing information in the data storage system to represent the formed clusters.
23. A computer-readable medium storing a computer program for clustering data elements stored in a data storage system, the computer program including instructions for causing a computer to:
read data elements from the data storage system;
form clusters of data elements with each data element being a member of at least one cluster;
associate at least one data element with two or more clusters, with memberships of the data element belonging to respective ones of the two or more clusters being represented by a measure of ambiguity; and
store information in the data storage system to represent the formed clusters.
24. A method for performing a data operation that receives a key and returns one or more data elements from a data storage system, the method including:
determining multiple candidate data elements based on candidate matches between the key and values of one or more search fields of the data elements; and
corroborating the candidate matches based on values of one or more comparison fields of the candidate data elements different from the search fields.
Dependent claims: 25-33.
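
As an illustration only, the two steps of claim 24 can be sketched as a candidate search followed by a corroboration pass. The claim does not say what the comparison fields are checked against; this sketch assumes the query supplies an expected value for one comparison field. The function names, the `name` and `city` fields, the 0.8 threshold, and the `difflib.SequenceMatcher` scoring are hypothetical.

```python
from difflib import SequenceMatcher


def is_close(a: str, b: str, threshold: float = 0.8) -> bool:
    """True when two strings are similar enough (stand-in fuzzy match)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def find_candidates(key, records, search_field):
    """Step 1: candidate matches between the key and the search field."""
    return [r for r in records if is_close(key, r[search_field])]


def corroborate(candidates, comparison_field, expected):
    """Step 2: keep candidates whose comparison field agrees with `expected`.

    `expected` is an assumed extra input carried by the query; the
    comparison field is different from the search field.
    """
    return [r for r in candidates if is_close(expected, r[comparison_field])]


records = [
    {"name": "John Smith", "city": "Boston"},
    {"name": "Jon Smyth", "city": "Denver"},
    {"name": "Mary Jones", "city": "Boston"},
]

candidates = find_candidates("Jon Smith", records, search_field="name")
confirmed = corroborate(candidates, comparison_field="city", expected="boston")
print(len(candidates), confirmed)
```

The name search alone returns two plausible records; checking the city narrows the result to the one record that agrees on both fields.
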
34. A system for performing a data operation that receives a key and returns one or more data elements from a data storage system, the system including:
means for determining multiple candidate data elements based on candidate matches between the key and values of one or more search fields of the data elements; and
means for corroborating the candidate matches based on values of one or more comparison fields of the candidate data elements different from the search fields.
35. A computer-readable medium storing a computer program for performing a data operation that receives a key and returns one or more data elements from a data storage system, the computer program including instructions for causing a computer to:
determine multiple candidate data elements based on candidate matches between the key and values of one or more search fields of the data elements; and
corroborate the candidate matches based on values of one or more comparison fields of the candidate data elements different from the search fields.
36. A method for measuring data quality of data elements in a data storage system, the method including:
reading data elements from the data storage system;
for each of one or more entries in one or more fields of the data elements, computing a value of a measure of ambiguity for the entry; and
outputting a representation of data quality of the data elements in the data storage system based on the values of the measure of ambiguity.
Dependent claims: 37-42.
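
As an illustration only, the data quality measurement of claim 36 can be sketched by scoring each entry and summarizing the scores. The claim does not fix the measure of ambiguity; this sketch assumes it is the number of reference values an entry could stand for, using a simple prefix match. The function names, the `city` field, and the reference list are hypothetical.

```python
from collections import Counter


def ambiguity(entry: str, reference_values) -> int:
    """Number of reference values the entry could stand for (assumed measure).

    0 means no match, 1 means unambiguous, 2 or more means ambiguous.
    Prefix matching is a hypothetical stand-in for any variant test.
    """
    normalized = entry.strip().lower()
    return sum(1 for ref in reference_values if ref.lower().startswith(normalized))


def data_quality_report(records, field, reference_values):
    """Summarize ambiguity values for one field as a histogram and a ratio."""
    counts = Counter(ambiguity(r[field], reference_values) for r in records)
    total = sum(counts.values())
    return {
        "histogram": dict(counts),
        "unambiguous_fraction": counts[1] / total if total else 0.0,
    }


reference_cities = ["Lexington", "Lincoln", "Boston"]
records = [
    {"city": "Lexington"},  # matches exactly one reference value
    {"city": "L"},          # could be Lexington or Lincoln
    {"city": "Bos"},        # matches only Boston
    {"city": "Xyz"},        # matches nothing
]

report = data_quality_report(records, "city", reference_cities)
print(report)
```

The histogram is one possible representation of data quality: a dataset whose entries mostly score 1 is clean, while mass at 0 or above 1 points to unmatched or ambiguous entries.
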
43. A system for measuring data quality of data elements in a data storage system, the system including:
means for reading data elements from the data storage system;
means for computing, for each of one or more entries in one or more fields of the data elements, a value of a measure of ambiguity for the entry; and
means for outputting a representation of data quality of the data elements in the data storage system based on the values of the measure of ambiguity.
44. A computer-readable medium storing a computer program for measuring data quality of data elements in a data storage system, the computer program including instructions for causing a computer to:
read data elements from the data storage system;
for each of one or more entries in one or more fields of the data elements, compute a value of a measure of ambiguity for the entry; and
output a representation of data quality of the data elements in the data storage system based on the values of the measure of ambiguity.
45. A method for joining data elements from two or more datasets stored in at least one data storage system, the method including:
determining matches between objects in data elements from a first dataset and objects in data elements from a second dataset based on a variant relation between the objects in the data elements from the first dataset and objects in the data elements from the second dataset;
evaluating respective data elements having respective objects determined as matches; and
joining the data elements from the first dataset with the data elements from the second dataset based on the evaluation of data elements.
Dependent claims: 46-50.
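
As an illustration only, the join of claim 45 can be sketched in three steps: match objects through a variant relation, evaluate the paired data elements, and join those that score well. This sketch assumes the objects are the words of an address field and the variant relation is an explicit set of word pairs; `VARIANTS`, the function names, the scoring rule, and the 0.75 threshold are hypothetical.

```python
# Hypothetical variant relation: unordered pairs of words treated as variants.
VARIANTS = {frozenset(("st", "street")), frozenset(("ave", "avenue"))}


def are_variants(a: str, b: str) -> bool:
    """True if two words are equal or related by the variant relation."""
    a, b = a.lower(), b.lower()
    return a == b or frozenset((a, b)) in VARIANTS


def score(left_words, right_words) -> float:
    """Evaluate a pair: share of words on the left with a variant on the right."""
    matched = sum(
        1 for w in left_words if any(are_variants(w, v) for v in right_words)
    )
    return matched / max(len(left_words), len(right_words))


def fuzzy_join(left, right, field, threshold=0.75):
    """Join records whose words match by variant and whose score passes."""
    joined = []
    for left_rec in left:
        left_words = left_rec[field].split()
        for right_rec in right:
            right_words = right_rec[field].split()
            # Step 1: require at least one variant match between objects.
            if not any(are_variants(a, b) for a in left_words for b in right_words):
                continue
            # Steps 2 and 3: evaluate the pair, then join if it scores well.
            pair_score = score(left_words, right_words)
            if pair_score >= threshold:
                joined.append((left_rec, right_rec, pair_score))
    return joined


left = [{"id": 1, "addr": "12 Main St"}, {"id": 2, "addr": "9 Oak Ave"}]
right = [
    {"ref": "A", "addr": "12 Main Street"},
    {"ref": "B", "addr": "40 Main Street"},
    {"ref": "C", "addr": "9 Oak Avenue"},
]

joined = fuzzy_join(left, right, "addr")
print([(l["id"], r["ref"], s) for l, r, s in joined])
```

The pair (1, B) shares "Main" and a variant of "St" but differs in house number, so it is a match at the object level that the evaluation step rejects. The nested loop compares every pair and is only suitable for small inputs.
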
51. A system for joining data elements from two or more datasets stored in at least one data storage system, the system including:
means for determining matches between objects in data elements from a first dataset and objects in data elements from a second dataset based on a variant relation between the objects in the data elements from the first dataset and objects in the data elements from the second dataset;
means for evaluating respective data elements having respective objects determined as matches; and
means for joining the data elements from the first dataset with the data elements from the second dataset based on the evaluation of data elements.
52. A computer-readable medium storing a computer program for joining data elements from two or more datasets stored in at least one data storage system, the computer program including instructions for causing a computer to:
determine matches between objects in data elements from a first dataset and objects in data elements from a second dataset based on a variant relation between the objects in the data elements from the first dataset and objects in the data elements from the second dataset;
evaluate respective data elements having respective objects determined as matches; and
join the data elements from the first dataset with the data elements from the second dataset based on the evaluation of data elements.
Specification