Method and apparatus for aligning multiple taxonomies
Abstract
A document taxonomy alignment system and method, relying on document glosses and utilizing a soft ontology expansion. An all-new hierarchical leaf node can be created expressly for the purpose of better aligning the plurality of document taxonomies in question. A small but valuable subset of the nodes created by soft ontology expansion turn out to capture some otherwise unmappable taxonomy nodes, and thereby have the effect of classifying the documents better than would any pre-existing node in any one of those taxonomies.
18 Claims
1. A method comprising:
- mapping a first set of concept nodes in a first taxonomy to a second set of concept nodes in a second taxonomy by aligning nodes with equivalent concepts in the first taxonomy and second taxonomy to generate a master taxonomy having a plurality of mapped concept nodes, the first set of concept nodes for organizing a first plurality of documents and the second set of concept nodes for organizing a second plurality of documents, each of the first and second plurality of documents associated with a document gloss, each of the mapped concept nodes containing documents from a concept node from the first taxonomy and a concept node from the second taxonomy that are determined to contain equivalent categories of documents;
- after mapping the first set of concept nodes to the second set of concept nodes, finding an expanded concept that is instantiated disproportionately in the document glosses of an unmapped node of the first taxonomy;
- determining if the expanded concept is instantiated in documents not classified at any leaf node in the second taxonomy; and
- creating a new node with the expanded concept in the master taxonomy;
- placing documents from the unmapped node and documents associated with the expanded concept from the second taxonomy not classified at any leaf node under the new node if the expanded concept is instantiated in documents not classified at any leaf node in the second taxonomy.
Dependent claims: 2, 3, 4, 5, 6
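The steps that follow the mapping in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes each document gloss is a set of concept strings, measures "instantiated disproportionately" with a simple lift ratio (rate within the unmapped node divided by rate across the first taxonomy), and uses the hypothetical names `concept_rates`, `find_expanded_concept`, `expand_master` and `min_lift`. It also creates the new node only when the test on unclassified documents passes, which is one possible reading of the claim.

```python
from collections import Counter


def concept_rates(glosses):
    """Fraction of glosses in which each concept appears."""
    counts = Counter(c for gloss in glosses for c in set(gloss))
    return {c: n / len(glosses) for c, n in counts.items()}


def find_expanded_concept(node_glosses, corpus_glosses, min_lift=2.0):
    """Pick the concept most over-represented in the node (assumed lift test).

    corpus_glosses must include node_glosses, so every concept seen in the
    node has a non-zero corpus rate.
    """
    node = concept_rates(node_glosses)
    corpus = concept_rates(corpus_glosses)
    lifts = {c: rate / corpus[c] for c, rate in node.items()}
    if not lifts:
        return None
    best = max(lifts, key=lifts.get)
    return best if lifts[best] >= min_lift else None


def expand_master(master, unmapped_docs, first_docs, unclassified_second):
    """Add a node for an unmapped first-taxonomy node when the test passes.

    master: dict of node name -> list of document ids (modified in place)
    unmapped_docs, first_docs, unclassified_second:
        dict of document id -> set of gloss concepts
    Returns the name of the new node, or None if no node was created.
    """
    concept = find_expanded_concept(
        list(unmapped_docs.values()), list(first_docs.values())
    )
    if concept is None:
        return None
    # Second-taxonomy documents not classified at any leaf node that
    # instantiate the expanded concept in their gloss.
    matches = [d for d, gloss in unclassified_second.items() if concept in gloss]
    if not matches:
        return None
    master[concept] = list(unmapped_docs) + matches
    return concept
```

The lift threshold is a design choice of this sketch; any statistic that singles out concepts concentrated in the unmapped node could take its place.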
7. A computer readable storage medium having computer executable instructions recorded thereon which cause a computer system to carry out a method when executed, the method comprising:
- mapping a first set of concept nodes in a first taxonomy to a second set of concepts in a second taxonomy by aligning nodes with equivalent concepts in the first taxonomy and second taxonomy to generate a master taxonomy having a plurality of mapped concept nodes, the first set of concept nodes for organizing a first plurality of documents and the second set of concept nodes for organizing a second plurality of documents, each of the first and second plurality of documents associated with a document gloss, each of the mapped concept nodes containing documents from a concept node from the first taxonomy and a concept node from the second taxonomy that are determined to contain equivalent categories for documents;
- after mapping the first set of concept nodes to the second set of concept nodes, finding an expanded concept that is instantiated disproportionately in the document glosses of an unmapped node of the first taxonomy;
- determining if the expanded concept is instantiated in documents not classified at any leaf node in the second taxonomy;
- creating a new node with the expanded concept in the master taxonomy; and
- placing the documents from the unmapped node and documents associated with the expanded concept under the new node if the expanded concept from the second taxonomy not classified at any leaf node is instantiated in documents not classified at a leaf node in the second taxonomy.
Dependent claims: 8, 9, 10, 11, 12
13. A computer system for mapping a taxonomy to at least one other taxonomy, wherein the computer system comprises a non-transitory computer readable storage medium having computer executable instructions recorded thereon which cause a computer system to carry out a method, the taxonomies including concepts for organizing information, the computer system comprising:
- means for mapping a first set of concept nodes in a first taxonomy to a second set of concepts in a second taxonomy by aligning nodes with equivalent concepts in the first taxonomy and second taxonomy to generate a master taxonomy having a plurality of mapped concept nodes, the first set of concept nodes for organizing a first plurality of documents and the second set of concept nodes for organizing a second plurality of documents, each of the first and second plurality of documents associated with a document gloss, each of the mapped concept nodes containing documents from a concept node from the first taxonomy and a concept node from the second taxonomy that are determined to contain equivalent categories of documents;
- after mapping the first set of concept nodes to the second set of concept nodes, a concept finding module to find an expanded concept that is instantiated disproportionately in the document glosses of an unmapped node of the first taxonomy;
- a test module to determine if the expanded concept is instantiated in documents not classified at a leaf node in the second taxonomy; and
- a node creation module to create a new node with the expanded concept in the master taxonomy and place documents from the unmapped node and documents associated with the expanded concept under the new node if the expanded concept from the second taxonomy not classified at any leaf node is instantiated in documents not classified at a leaf node in the second taxonomy.
Dependent claims: 14, 15, 16, 17, 18
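Each independent claim begins by mapping nodes "with equivalent concepts" into a master taxonomy, without stating in the claim language how equivalence is decided. The sketch below is one illustrative possibility only: it assumes glosses are sets of concept strings, pools them per node, and treats two nodes as equivalent when the Jaccard overlap of the pooled sets reaches a threshold. The names `jaccard`, `node_concepts`, `map_taxonomies` and `threshold` are hypothetical and do not come from the patent.

```python
def jaccard(a, b):
    """Overlap of two concept sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def node_concepts(docs):
    """Pool the gloss concepts of every document at a node."""
    return set().union(*docs.values()) if docs else set()


def map_taxonomies(first, second, threshold=0.5):
    """Greedily pair nodes whose pooled gloss concepts overlap enough.

    first, second: dict of node name -> {document id -> set of gloss concepts}
    Returns (master, unmapped): master maps "A = B" names to the merged
    document ids; unmapped lists first-taxonomy nodes left without a partner.
    """
    master, unmapped, used = {}, [], set()
    for name_a, docs_a in first.items():
        candidates = [
            (jaccard(node_concepts(docs_a), node_concepts(docs_b)), name_b)
            for name_b, docs_b in second.items()
            if name_b not in used
        ]
        score, name_b = max(candidates, default=(0.0, None))
        if name_b is not None and score >= threshold:
            used.add(name_b)
            master[f"{name_a} = {name_b}"] = list(docs_a) + list(second[name_b])
        else:
            unmapped.append(name_a)
    return master, unmapped
```

The nodes returned in `unmapped` are the candidates for the later steps of the claims, where an expanded concept is sought in their document glosses.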
Specification