Conceptual world representation natural language understanding system and method
First Claim
1. A computer-readable storage medium storing code representing instructions that when executed by a computer cause the computer to:
receive as an input at least a portion of a document;
extract a meaning contained in the portion of the document based on a comparison of the portion to a plurality of concepts included in a computer implemented formal ontology and arranged in a hierarchy, said hierarchy having a primary concept defined based on a set of criteria, each concept from the plurality of concepts hierarchically linked to a concept below the primary concept being defined based on all criteria from the set of criteria and based on at least one additional criterion different than each criterion from the set of criteria;
associate a first term from the document with a first concept from the plurality of concepts;
associate a second term from the document with a second concept from the plurality of concepts;
prohibit a link between the first term and the second term when a link between the first concept and the second concept is prohibited based on a link between a parent concept of the first concept and a parent concept of the second concept, the link between the first concept and the second concept being a substantially similar type as the link between the parent concept of the first concept and the parent concept of the second concept; and
index the portion of the document based on said meaning extracted from the portion of the document.
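The inheritance of criteria and of link prohibitions recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation. The class names, the concept names, the criteria strings, and the `has_site` link type are all hypothetical, chosen only to show one way a prohibition recorded between two parent concepts could block a link of the same type between terms associated with their child concepts.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Concept:
    """A node in a hypothetical ontology hierarchy."""
    name: str
    parent: Optional["Concept"] = None
    own_criteria: frozenset = frozenset()

    def criteria(self) -> frozenset:
        # A concept carries every criterion of its ancestors plus its own.
        inherited = self.parent.criteria() if self.parent else frozenset()
        return inherited | self.own_criteria

    def lineage(self) -> Iterator["Concept"]:
        # Yields the concept itself, then each ancestor up to the primary concept.
        node = self
        while node is not None:
            yield node
            node = node.parent


class Ontology:
    def __init__(self) -> None:
        # Each entry is (link_type, concept_name_a, concept_name_b).
        self.prohibited = set()

    def prohibit(self, link_type: str, a: Concept, b: Concept) -> None:
        self.prohibited.add((link_type, a.name, b.name))

    def link_allowed(self, link_type: str, a: Concept, b: Concept) -> bool:
        # Assumption: a prohibition between ancestors applies to descendants
        # for the same link type.
        for x in a.lineage():
            for y in b.lineage():
                if (link_type, x.name, y.name) in self.prohibited:
                    return False
        return True


# Hypothetical hierarchy: each child adds at least one criterion to its parent's set.
finding = Concept("ClinicalFinding", own_criteria=frozenset({"observed_in_patient"}))
bone_disorder = Concept("BoneDisorder", finding, frozenset({"affects_bone"}))
fracture = Concept("Fracture", bone_disorder, frozenset({"is_break"}))
body = Concept("BodyStructure", own_criteria=frozenset({"is_anatomical"}))
soft_organ = Concept("SoftTissueOrgan", body, frozenset({"is_soft_tissue"}))
liver = Concept("Liver", soft_organ, frozenset({"is_hepatic"}))
bone = Concept("Bone", body, frozenset({"is_rigid"}))
femur = Concept("Femur", bone, frozenset({"is_thigh_bone"}))

ontology = Ontology()
ontology.prohibit("has_site", bone_disorder, soft_organ)

# Terms from a document, each associated with one concept.
term_to_concept = {"fracture": fracture, "liver": liver, "femur": femur}


def terms_linkable(link_type: str, term_a: str, term_b: str) -> bool:
    return ontology.link_allowed(
        link_type, term_to_concept[term_a], term_to_concept[term_b]
    )
```

In this sketch, `terms_linkable("has_site", "fracture", "liver")` is false because the parent concepts carry a prohibition of that link type, while `terms_linkable("has_site", "fracture", "femur")` is true because no ancestor pair is prohibited.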
Abstract
A Natural Language Understanding system is provided for indexing of free text documents. The system according to the invention utilizes typographical and functional segmentation of text to identify those portions of free text that carry meaning. The system then uses words and multi-word terms and phrases identified in the free text to identify concepts in the free text. The system uses a lexicon of terms linked to a formal ontology that is independent of a specific language to extract concepts from the free text based on the words and multi-word terms in the free text. The formal ontology contains both language-independent domain knowledge concepts and language-dependent linguistic concepts that govern the relationships between concepts and contain the rules about how language works. The system according to the current invention may preferably be used to index medical documents and assign codes from independent coding systems, such as SNOMED, ICD-9, and ICD-10. The system according to the current invention may also preferably make use of syntactic parsing to improve the efficiency of the method.
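As a rough illustration of the lexicon step described in the abstract, the following sketch maps words and multi-word terms to concepts using greedy longest-match lookup, then assigns a code to each concept found. The lexicon entries, concept names, and the `EX-` codes are hypothetical placeholders rather than real SNOMED or ICD codes, and longest-match lookup is an assumed strategy, not one the abstract specifies.

```python
import re

# Hypothetical lexicon: token tuples (single words or multi-word terms) -> concept.
LEXICON = {
    ("myocardial", "infarction"): "MyocardialInfarction",
    ("heart", "attack"): "MyocardialInfarction",
    ("chest", "pain"): "ChestPain",
    ("fever",): "Fever",
}

# Hypothetical concept -> code table standing in for an external coding system.
CODES = {
    "MyocardialInfarction": "EX-001",
    "ChestPain": "EX-002",
    "Fever": "EX-003",
}

MAX_TERM_LEN = max(len(term) for term in LEXICON)


def extract_concepts(text: str) -> list:
    """Return concepts in order of appearance, preferring the longest term match."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    found = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate term starting at position i first.
        for n in range(min(MAX_TERM_LEN, len(tokens) - i), 0, -1):
            concept = LEXICON.get(tuple(tokens[i:i + n]))
            if concept is not None:
                found.append(concept)
                i += n
                break
        else:
            i += 1  # No term starts here; move to the next token.
    return found


def index_codes(text: str) -> list:
    """Return the sorted, de-duplicated codes for all concepts found in the text."""
    return sorted({CODES[c] for c in extract_concepts(text)})
```

Because the lexicon maps both "heart attack" and "myocardial infarction" to the same concept, two differently worded documents would be indexed under the same code, which is the point of routing terms through concepts rather than indexing the raw words.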
Specification