System and method for data mining from relational data by sieving through iterated relational reinforcement
Abstract
A system and method are provided for performing the process known as "data mining" on a database of raw data records having common data elements, to obtain categorical cluster rules as to what elements of the data tend to occur in common in multiple records. Initial values are assigned to the elements. In an iterative process, the associated value for each given one of the elements is recalculated based on the values of other elements which occur in records together with the given element. Thus, the associated values will tend to grow for elements occurring together in multiple records. Those common occurrences of elements in multiple records represent categorical cluster rules the owner of the data is likely to want to know about. Thus, these rules may be identified based on the growth of the associated values for the records.
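The iterative recalculation described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `reinforce` and `top_elements`, the plain sum used to combine co-occurring values, the L2 normalization after each pass, and the sample records are all assumptions made for illustration. The abstract states only that each element's value is recalculated from the values of elements that occur in the same records.

```python
from collections import defaultdict
from math import sqrt


def reinforce(records, iterations=10):
    """Iteratively re-weight (field, value) elements by co-occurrence.

    Hypothetical helper: the combining rule (a sum) and the
    normalization step are assumptions, not taken from the patent.
    """
    # Each element is a (field, value) pair; every element starts at 1.0.
    weights = {(f, v): 1.0 for rec in records for f, v in rec.items()}
    for _ in range(iterations):
        updated = defaultdict(float)
        for rec in records:
            elems = list(rec.items())
            total = sum(weights[e] for e in elems)
            for e in elems:
                # Add the values of the *other* elements in this record.
                updated[e] += total - weights[e]
        # Assumed normalization so values stay bounded across passes.
        norm = sqrt(sum(w * w for w in updated.values())) or 1.0
        weights = {e: w / norm for e, w in updated.items()}
    return weights


def top_elements(weights, k):
    """Return the k elements with the largest associated values."""
    return sorted(weights, key=weights.get, reverse=True)[:k]


# Illustrative records sharing a common set of fields.
records = [
    {"item": "bread", "store": "north", "day": "sat"},
    {"item": "bread", "store": "north", "day": "sun"},
    {"item": "bread", "store": "north", "day": "sat"},
    {"item": "milk", "store": "south", "day": "mon"},
]
weights = reinforce(records)
candidates = top_elements(weights, 2)
```

In this sketch, elements that repeatedly appear together reinforce one another, so their values come to dominate after a few passes. Reading off the highest-valued elements is one simple way to propose a candidate cluster; the claims leave the exact derivation rule open.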
45 Claims
1. A method for extracting elements satisfying a predetermined criterion of correlation (a "categorical cluster") from a body of data, the data including a plurality of records, the records containing elements from among a set of common fields, the elements having respective values, some of the values being common to different ones of the records, the method comprising the steps of:
- initializing an associated value for each of the elements of the records;
- performing a computation to update the associated values based on the associated values of other elements, the computation causing the associated values to change value in a manner related to a degree of correlation; and
- deriving, from the updated associated values, a categorical cluster rule which identifies highly correlated elements.

Dependent claims: 2 to 15.

16. A system for extracting elements satisfying a predetermined criterion of correlation (a "categorical cluster") from a body of data, the data including a plurality of records, the records containing elements from among a set of common fields, the elements having respective values, some of the values being common to different ones of the records, the system comprising:
- means for initializing an associated value for each of the elements of the records;
- means for performing a computation to update the associated values based on the associated values of other elements, the computation causing the associated values to change value in a manner related to a degree of correlation; and
- means for deriving, from the updated associated values, a categorical cluster rule which identifies highly correlated elements.

Dependent claims: 17 to 30.

31. A computer program product, for use with a computer system, for extracting elements satisfying a predetermined criterion of correlation (a "categorical cluster") from a body of data, the data including a plurality of records, the records containing elements from among a set of common fields, the elements having respective values, some of the values being common to different ones of the records, the computer program product comprising:
- a computer-usable medium;
- means, provided on the medium, for directing the computer system to initialize an associated value for each of the elements of the records;
- means, provided on the medium, for directing the computer system to perform a computation to update the associated values based on the associated values of other elements, the computation causing the associated values to change value in a manner related to a degree of correlation; and
- means, provided on the medium, for directing the computer system to derive, from the updated associated values, a categorical cluster rule which identifies highly correlated elements.

Dependent claims: 32 to 45.

Specification