System and method for automatically and iteratively mining related terms in a document through relations and patterns of occurrences
Abstract
A computer program product is provided as an automatic mining system to identify a set of related terms on the World Wide Web that define a relationship, using the duality concept. Specifically, the mining system iteratively refines pairs of terms that are related in a specific way, and the patterns of their occurrences in web pages. The automatic mining system runs iteratively, continuously and incrementally refining the relations and their corresponding patterns. In one embodiment, the automatic mining system identifies relations in terms of the patterns of their occurrences in the web pages. The automatic mining system includes a relation identifier that derives new relations, and a pattern identifier that derives new patterns. The newly derived relations and patterns are stored in a database, which initially holds small seed sets of relations and patterns that the automatic mining system continuously and iteratively broadens.
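The iterative loop described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes, purely for illustration, that a relation is a pair of capitalized terms and that a pattern is the literal text found between the two terms. The names TERM, derive_relations, derive_patterns, and mine are hypothetical.

```python
import re

# Assumption for this sketch: a term is a run of capitalized words.
TERM = r"([A-Z]\w*(?: [A-Z]\w*)*)"


def derive_relations(document, patterns):
    """Relation identifier: apply known patterns to a document to find term pairs."""
    found = set()
    for middle in patterns:
        regex = re.compile(TERM + re.escape(middle) + TERM)
        for match in regex.finditer(document):
            found.add((match.group(1), match.group(2)))
    return found


def derive_patterns(document, relations):
    """Pattern identifier: locate known pairs and record the text between them."""
    found = set()
    for term_a, term_b in relations:
        # Assumption: the connecting text is short (at most 30 characters).
        regex = re.compile(re.escape(term_a) + r"(.{1,30}?)" + re.escape(term_b))
        for match in regex.finditer(document):
            found.add(match.group(1))
    return found


def mine(documents, seed_relations, seed_patterns):
    """Process documents in order, broadening the stored relations and patterns."""
    relations, patterns = set(seed_relations), set(seed_patterns)
    for document in documents:
        # New relations come from the document and the previously stored patterns.
        new_relations = derive_relations(document, patterns)
        # New patterns come from the document, the stored relations, and the new ones.
        new_patterns = derive_patterns(document, relations | new_relations)
        relations |= new_relations
        patterns |= new_patterns
    return relations, patterns
```

Starting from one seed relation and one seed pattern, each processed document can contribute both new pairs and new connecting text, so later documents are matched against a broader set than earlier ones. A practical system would also need to score and filter overly general patterns, which this sketch omits.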
9 Claims
1. A system for automatically and iteratively mining related terms in a document d_i through relations and patterns of occurrences, comprising:
a database for storing a set of previously identified relations R_{i-1} and a set of previously identified patterns P_{i-1};
a relation identifier that uses the document d_i and the set of patterns P_{i-1} to derive a new relation r_i;
a pattern identifier that uses the document d_i and the set of relations R_{i-1} and the relation r_i for deriving a new pattern p_i that has not been predetermined; and
wherein the set of patterns P_{i-1} includes individual patterns p_n and is expressed as follows;
Dependent claims: 2, 3
4. A computer program product for automatically and iteratively mining related terms in a document d_i through relationships and patterns of occurrences, comprising:
a database for storing a set of previously identified relations R_{i-1} and a set of previously identified patterns P_{i-1};
a relation identifier that uses the document d_i and the set of patterns P_{i-1} to derive a new relation r_i;
a pattern identifier that uses the document d_i and the set of relations R_{i-1} and the relation r_i for deriving a new pattern p_i that has not been predetermined; and
wherein the set of patterns P_{i-1} includes individual patterns p_n and is expressed as follows;
Dependent claims: 5, 6
7. A method for automatically and iteratively mining related terms in a document d_i through relationships and patterns of occurrences, comprising:
storing previously identified sets of relations R_{i-1} and patterns P_{i-1};
using the document d_i and the set of relations R_{i-1} to derive a relation r_i;
using the document d_i and the set of patterns R_i to derive new pattern p_i that has not been predetermined; and
wherein defining the pattern P_{i-1} includes expressing the pattern P_{i-1} by a set of individual patterns p_n as follows;
Dependent claims: 8, 9
Specification