System and method for the automatic construction of generalization-specialization hierarchy of terms from a database of terms and associated meanings
Abstract
A computer program product is provided as an automatic mining system that builds a generalization hierarchy of terms from a database of terms and associated meanings, using, for example, the Least General Generalization (LGG) model. The automatic mining system comprises a terms database, an augmentation module, a generalization detection module, and a hierarchy database. The terms database stores the terms and their meanings, and the hierarchy database stores the generalization hierarchy, which is defined by a set of edges and nodes. The augmentation module updates the terms using the LGG model. The generalization detection module maps the generalizations derived by the augmentation module, updates the edges, and derives a generalization hierarchy. In operation, the automatic mining system begins with no predefined taxonomy of the concept terms, and the LGG model derives from the terms a generalization hierarchy modeled as a directed acyclic graph (DAG).
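As an illustration of a hierarchy "defined by a set of edges and nodes" and modeled as a directed acyclic graph, the following is a minimal sketch only. The class name `HierarchyStore` and its methods are hypothetical and do not come from the patent; the store is held in memory rather than in a real database, and each edge points from the more general term to the more specific one.

```python
from collections import defaultdict


class HierarchyStore:
    """Hypothetical in-memory stand-in for the hierarchy database."""

    def __init__(self):
        # Maps a general term to the set of its direct specializations.
        self.children = defaultdict(set)

    def reaches(self, start, goal):
        """Return True if goal can be reached from start by following edges."""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.children.get(node, ()))
        return False

    def add_edge(self, general, specific):
        """Add the edge general -> specific unless it would create a cycle."""
        if self.reaches(specific, general):
            return False  # rejected: the graph must stay acyclic
        self.children[general].add(specific)
        return True

    def edges(self):
        """Return the hierarchy as a set of (general, specific) pairs."""
        return {(g, s) for g, kids in self.children.items() for s in kids}


store = HierarchyStore()
store.add_edge("animal", "dog")
store.add_edge("dog", "poodle")
rejected = store.add_edge("poodle", "animal")  # would close a cycle
```

Checking reachability before every insertion is one simple way to keep the edge set acyclic; the same `reaches` call doubles as a query for whether one term is a (possibly indirect) generalization of another.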
9 Claims
1. A system for the automatic construction of a generalization-specialization hierarchy of terms from unstructured information, comprising:
a set of paired terms and meanings associated with the terms that have been derived from the unstructured information;
a terms database for storing the paired set of terms and associated meanings;
a generalization-specialization hierarchy that is defined by a set of edges between the terms stored in the terms database, wherein each edge is determined by a hierarchical relationship between two terms;
a hierarchy database for storing the generalization-specialization hierarchy;
an augmentation module for deriving new terms from the associated meanings, which new terms are not stored in the terms database;
a generalization-specialization detection module for iteratively deriving generalizations and specializations from the terms stored in the terms database and the new terms derived from the associated meanings;
the generalization-specialization module extracting a set of new edges from the generalizations and specializations, and updating the generalization-specialization hierarchy by updating the set of edges with the new edges that have been extracted from the generalizations and specializations;
the hierarchy database storing the generalization-specialization hierarchy that has been updated;
wherein the terms database is updated with the new terms that have been derived from the associated meanings and the generalizations and specializations; and
wherein the hierarchy database is queriable for a desired hierarchical relationship.
4. A computer program product for the automatic construction of a generalization-specialization of terms from unstructured information, comprising:
a set of paired terms and meanings associated with the terms that have been derived from the unstructured information;
a terms database for storing the paired set of terms and associated meanings;
a generalization-specialization hierarchy that is defined by a set of edges between the terms stored in the terms database, wherein each edge is determined by a hierarchical relationship between two terms;
a hierarchy database for storing the generalization-specialization hierarchy;
an augmentation module for deriving new terms from the associated meanings, which new terms are not stored in the terms database;
a generalization-specialization detection module for iteratively deriving generalizations and specializations from the terms stored in the terms database and the new terms derived from the associated meanings;
the generalization-specialization module extracting a set of new edges from the generalizations and specializations, and updating the generalization-specialization hierarchy by updating the set of edges with the new edges that have been extracted from the generalizations and specializations;
the hierarchy database storing the generalization-specialization hierarchy that has been updated;
wherein the terms database is updated with the new terms that have been derived from the associated meanings and the generalizations and specializations; and
wherein the hierarchy database is queriable for a desired hierarchical relationship.
7. A method for the automatic construction of a generalization-specialization of terms from unstructured information, comprising:
deriving a set of paired terms and meanings associated with the terms from the unstructured information;
storing the paired set of terms and associated meanings in a terms database;
defining a generalization-specialization hierarchy by a set of edges between the terms stored in the terms database, wherein each edge is determined by a hierarchical relationship between two terms;
storing the generalization-specialization hierarchy in a hierarchy database;
deriving new terms from the associated meanings, which new terms are not stored in the terms database;
iteratively deriving generalizations and specializations from the terms stored in the terms database and the new terms derived from the associated meanings;
extracting a set of new edges from the generalizations and specializations;
updating the generalization-specialization hierarchy by updating the set of edges with the new edges that have been extracted from the generalizations and specializations;
storing the generalization-specialization hierarchy that has been updated in the hierarchy database;
updating the terms database with the new terms that have been derived from the associated meanings and the generalizations and specializations; and
wherein the hierarchy database is queriable for a desired hierarchical relationship.
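The iterative steps recited in the method claim can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed implementation: the functions `shared_tokens` and `build_hierarchy` are hypothetical names, the generalization rule (the words two meanings have in common) is a simple stand-in and not the LGG model itself, a newly derived term is given itself as its meaning, and plain Python containers stand in for the terms database and hierarchy database. The sketch also does not check the edge set for cycles.

```python
from itertools import combinations


def shared_tokens(meaning_a, meaning_b):
    """Stand-in generalization: words of meaning_a that also occur in meaning_b."""
    words_b = set(meaning_b.split())
    return " ".join(w for w in meaning_a.split() if w in words_b)


def build_hierarchy(terms):
    """terms: dict mapping term -> meaning. Returns (updated terms, edges)."""
    terms = dict(terms)   # stand-in for the terms database
    edges = set()         # stand-in for the hierarchy database
    changed = True
    while changed:        # iterate until no new term or edge is derived
        changed = False
        # Pair every term with every other term known at the start of this pass.
        for (t1, m1), (t2, m2) in combinations(sorted(terms.items()), 2):
            general = shared_tokens(m1, m2)
            if not general:
                continue
            if general not in terms:
                # New term derived from the meanings (assumption: it is
                # stored with itself as its meaning).
                terms[general] = general
                changed = True
            for specific in (t1, t2):
                if specific != general and (general, specific) not in edges:
                    edges.add((general, specific))
                    changed = True
    return terms, edges


terms, edges = build_hierarchy({
    "beagle": "small hunting dog",
    "husky": "large sled dog",
    "poodle": "small curly dog",
})
```

In this toy run the first pass derives the new terms "dog" and "small dog" from the meanings, and a later pass relates the two derived terms to each other, which is why the loop repeats until nothing changes.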
Specification