Method and system for detecting frequent association patterns
Abstract
A text-mining system and method automatically extracts useful information from a large set of tree-structured data by generating successive sets of candidate tree-structured association patterns for comparison with the tree-structured data. The system counts how many times each candidate association pattern matches a tree in the set of tree-structured data, and from these counts determines which candidate association patterns match frequently. Each successive set of candidate association patterns is generated from the frequent association patterns determined from the previous set of candidate association patterns.
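The counting step described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the specification: trees and patterns are nested tuples of the form `(label, (child, ...))`, a pattern is taken to match a tree when it can be embedded at some node with its children assigned to distinct children of that node (order ignored), and the names `embeds_at`, `matches`, and `count_support` are hypothetical.

```python
from itertools import permutations

def embeds_at(pattern, node):
    """True if `pattern` can be laid over `node` with matching labels."""
    p_label, p_children = pattern
    n_label, n_children = node
    if p_label != n_label or len(p_children) > len(n_children):
        return False
    # Try every assignment of pattern children to distinct node children.
    return any(
        all(embeds_at(pc, nc) for pc, nc in zip(p_children, chosen))
        for chosen in permutations(n_children, len(p_children))
    )

def matches(pattern, tree):
    """True if `pattern` embeds at the root or at any descendant of `tree`."""
    if embeds_at(pattern, tree):
        return True
    return any(matches(pattern, child) for child in tree[1])

def count_support(candidates, trees):
    """For each candidate pattern, count the trees it matches at least once."""
    return {p: sum(matches(p, t) for t in trees) for p in candidates}

# Toy data set (hypothetical): three small trees.
trees = [
    ("buy", (("customer", ()), ("printer", ()))),
    ("buy", (("customer", ()), ("laptop", ()))),
    ("return", (("customer", ()), ("printer", ()))),
]
candidates = [
    ("buy", (("customer", ()),)),
    ("buy", (("printer", ()),)),
    ("customer", ()),
]
support = count_support(candidates, trees)

# A pattern is "frequent" here if it matches at least 2 trees (assumed threshold).
frequent = {p for p, n in support.items() if n >= 2}
```

The brute-force search over child assignments is only practical for small patterns; it is used here because it keeps the matching rule easy to read.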
6 Claims
1. A system for detecting frequent association patterns in a database of tree-structured data, comprising:
means for automatically generating an initial current set of candidate association patterns for counting;
means for counting association patterns in the database that match with the current set of candidate association patterns;
means for detecting frequent association patterns in the database from the result of said counting; and
means for automatically generating a new current set of candidate association patterns for next counting from said detected frequent association patterns.
2. A text-mining system for extracting useful concepts from a large volume of text data, comprising:
(a) means for parsing sentences in the text data;
(b) means for generating a set of structured trees of text data based on the results of said parsing;
(c) means for automatically generating an initial current set of candidate tree-structured association patterns for counting;
(d) means for counting tree-structured association patterns in the set of structured trees that match with one or more of the candidate tree-structured association patterns in the current counting set of candidate tree-structured association patterns;
(e) means for detecting frequent tree-structured association patterns in the set of structured trees from the result of said counting; and
(f) means for automatically generating a new current set of candidate tree-structured association patterns for next counting from said detected frequent tree-structured association patterns.
3. A text-mining system as in claim 2, further comprising:
(g) means for extracting original text data matching with said frequent tree-structured association patterns.
4. A text-mining system as in claim 2 for automatically classifying text data into a plurality of categories, and further comprising:
(g) means for detecting text data matching with said frequent tree-structured association patterns to classify the data into categories.
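The extraction and classification elements of the dependent claims can be pictured with a short sketch: once frequent patterns are known, each original text is filed under every pattern its tree matches. The representation is an assumption for illustration only. Each document is a `(text, tree)` pair, trees are nested tuples `(label, (child, ...))`, patterns are reduced to single `(parent, child)` label pairs, and `has_edge` and `classify` are hypothetical names.

```python
from collections import defaultdict

def has_edge(pattern, tree):
    """True if the (parent, child) label pair occurs as an edge in the tree."""
    parent, child = pattern
    label, children = tree
    if label == parent and any(c[0] == child for c in children):
        return True
    return any(has_edge(pattern, c) for c in children)

def classify(documents, frequent_patterns, matches=has_edge):
    """Group original texts by the frequent patterns their trees match."""
    categories = defaultdict(list)
    for text, tree in documents:
        for pattern in frequent_patterns:
            if matches(pattern, tree):
                categories[pattern].append(text)
    return dict(categories)

# Hypothetical parsed documents: original text paired with its tree.
documents = [
    ("The customer bought a printer.",
     ("buy", (("customer", ()), ("printer", ())))),
    ("The customer returned a printer.",
     ("return", (("customer", ()), ("printer", ())))),
    ("A customer bought a laptop.",
     ("buy", (("customer", ()), ("laptop", ())))),
]
patterns = [("buy", "customer"), ("return", "printer")]
categories = classify(documents, patterns)
```

In this sketch a text may fall into several categories, or into none, depending on how many patterns its tree matches. The `matches` parameter lets a richer tree matcher be swapped in.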
5. A method for detecting frequent association patterns in a database of tree-structured data, comprising the steps of:
automatically generating an initial current set of candidate association patterns for counting;
counting association patterns in the database that match with the current set of candidate association patterns;
detecting frequent association patterns in the database from the result of said counting; and
automatically generating a new set of current candidate association patterns for next counting from said detected frequent association patterns.
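The four steps of the method claim form a level-wise loop, which the following sketch illustrates under a strong simplification: patterns are restricted to downward label chains (paths) rather than general tree patterns. The tuple encoding `(label, (child, ...))`, the join rule used to build the next candidates, the `min_support` threshold, and the names `_paths` and `mine_frequent_paths` are all assumptions for illustration, not the procedure given in the specification.

```python
def _paths(tree):
    """Return (chains starting at this node, all chains in the subtree)."""
    label, children = tree
    starting = {(label,)}
    everywhere = set()
    for child in children:
        child_starting, child_everywhere = _paths(child)
        starting |= {(label,) + p for p in child_starting}
        everywhere |= child_everywhere
    return starting, everywhere | starting

def mine_frequent_paths(trees, min_support):
    """Level-wise search for label chains found in >= min_support trees."""
    if min_support < 1:
        raise ValueError("min_support must be at least 1")
    path_sets = [_paths(t)[1] for t in trees]
    # Step 1: initial candidates are all single labels.
    candidates = {p for ps in path_sets for p in ps if len(p) == 1}
    frequent = {}
    while candidates:
        # Step 2: count the trees that contain each candidate.
        counts = {c: sum(c in ps for ps in path_sets) for c in candidates}
        # Step 3: keep the candidates that reach the threshold.
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Step 4: join frequent chains that overlap on all but one label.
        candidates = {p + (q[-1],) for p in level for q in level
                      if p[1:] == q[:-1]}
    return frequent

# Toy data set (hypothetical).
trees = [
    ("buy", (("customer", ()), ("printer", (("cheap", ()),)))),
    ("buy", (("customer", ()), ("printer", (("cheap", ()),)))),
    ("return", (("printer", (("broken", ()),)),)),
]
result = mine_frequent_paths(trees, min_support=2)
```

Because a chain can be frequent only if its sub-chains are frequent, building each new candidate set from the previous frequent set avoids counting patterns that cannot reach the threshold.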
6. A record medium comprising a program for detecting frequent association patterns in a database of tree-structured data, said program having a computer implement:
a function for automatically generating an initial current set of candidate association patterns for counting;
a function for counting association patterns in the database that match with the current set of candidate association patterns;
a function for detecting frequent association patterns in the database from the result of said counting; and
a function for automatically generating a new set of candidate association patterns for next counting from said detected frequent association patterns.
Specification