Natural language understanding system
Abstract
A hybrid natural language understanding (NLU) system designed for processing natural language text. Primary functional components of the NLU system include a preprocessor; a word look-up and morphology module, which communicates with a lexicon and a learning module; a syntactic parser, which interfaces with an augmented transition network (ATN) grammar; a case frame applier, which converts the syntactic structure into canonical, semantic "case frames"; and a discourse analysis component, which integrates explicit and implied information in the text into a conceptual structure that represents its meaning. This structure may be passed on to a knowledge-based system, a data base, interested analysts or decision makers, etc. Significant feedback points are provided: for example, the case frame applier may notify the syntactic parser of a semantically incorrect parse, or the syntactic parser may seek a semantic judgment based on a fragmentary parse. The system incorporates a novel semantic analysis approach based largely on case grammar.
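The feedback point between the case frame applier and the syntactic parser can be pictured with a small Python sketch. Everything in it is hypothetical and for illustration only: the candidate parses, the `ROLE_CONSTRAINTS` and `SEMANTIC_CLASS` tables, and the function names are assumptions, not data structures taken from the patent. The parser is imagined to offer candidate parses in order; a semantic check rejects a parse whose agent violates a constraint, and the next candidate is tried.

```python
# Hypothetical semantic constraints: concept -> required semantic class of the agent.
ROLE_CONSTRAINTS = {"EAT": "animate"}

# Hypothetical semantic classes for a few nouns.
SEMANTIC_CLASS = {"dog": "animate", "bone": "inanimate"}


def semantically_acceptable(parse):
    """Stand-in for the case frame applier's judgment of one candidate parse."""
    required = ROLE_CONSTRAINTS.get(parse["concept"])
    return required is None or SEMANTIC_CLASS.get(parse["agent"]) == required


def first_acceptable_parse(candidates):
    """Try parses in order; each rejection sends the parser to its next alternative."""
    rejected = []
    for parse in candidates:
        if semantically_acceptable(parse):
            return parse, rejected
        rejected.append(parse)
    return None, rejected


# Two candidate readings a parser might propose for the same words (illustrative).
candidates = [
    {"concept": "EAT", "agent": "bone", "object": "dog"},
    {"concept": "EAT", "agent": "dog", "object": "bone"},
]
chosen, rejected = first_acceptable_parse(candidates)
```

Here the first reading is rejected because its agent is not animate, so `chosen` is the second reading and `rejected` records the parse that was reported back as semantically incorrect.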
670 Citations
65 Claims
1. A method of processing natural language text, comprising
providing electronically encoded data representative of the natural language text,
lexically processing the electronically encoded data with reference to a lexicon data base, said lexicon data base being comprised of lexical entries all including syntactic category data and semantically significant lexical entries including one or more concepts, to produce lexical specifications,
interpreting the lexical specifications with reference to an electronic representation of an Augmented Transition Network to produce configuration data, said configuration data including one or more concepts obtained from the lexical specifications, and
semantically processing the configuration data with reference to case frame templates each identified with a respective concept, to produce case frames in accordance with the concepts included in said configuration data.
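As a rough illustration of how the three processing steps of this claim chain together, the following Python sketch runs a sentence through a lexicon look-up, a transition network, and a case frame template. It is a minimal sketch under stated assumptions: the lexicon entries, the network, the register names, and the template are all invented for the example, and the network is greatly simplified (a single deterministic network with registers, without the recursive sub-networks or backtracking of a full Augmented Transition Network).

```python
# Hypothetical lexicon: every entry has a syntactic category; only
# semantically significant entries carry concepts.
LEXICON = {
    "the": {"cat": "det"},
    "a": {"cat": "det"},
    "dog": {"cat": "noun", "concepts": ["DOG"]},
    "cat": {"cat": "noun", "concepts": ["CAT"]},
    "chased": {"cat": "verb", "concepts": ["CHASE"]},
}

# Hypothetical network: state -> arcs of (category, register to set or None, next state).
NETWORK = {
    "S0": [("det", None, "S0"), ("noun", "subject", "S1")],
    "S1": [("verb", "verb", "S2")],
    "S2": [("det", None, "S2"), ("noun", "object", "END")],
}

# Hypothetical case frame template: concept -> {role: register that fills it}.
CASE_FRAME_TEMPLATES = {"CHASE": {"agent": "subject", "patient": "object"}}


def lexical_process(text):
    """Look each word up in the lexicon to produce lexical specifications."""
    specs = []
    for word in text.lower().split():
        entry = LEXICON[word]  # raises KeyError for an unknown word
        specs.append({"word": word, "cat": entry["cat"],
                      "concepts": entry.get("concepts", [])})
    return specs


def interpret(specs):
    """Walk the network, storing concepts in registers (the configuration data)."""
    state, registers = "S0", {}
    for spec in specs:
        for category, register, next_state in NETWORK.get(state, []):
            if category == spec["cat"]:
                if register is not None:
                    registers[register] = spec["concepts"][0]
                state = next_state
                break
        else:
            raise ValueError(f"no arc for {spec['word']!r} in state {state}")
    if state != "END":
        raise ValueError("input ended before the network reached its final state")
    return registers


def apply_case_frame(config):
    """Select the template by the verb's concept and fill its roles."""
    concept = config["verb"]
    template = CASE_FRAME_TEMPLATES[concept]
    return {"concept": concept,
            **{role: config[reg] for role, reg in template.items()}}


frame = apply_case_frame(interpret(lexical_process("the dog chased a cat")))
```

For the example sentence, `frame` holds the concept `CHASE` with `DOG` as agent and `CAT` as patient; the concepts originate in the lexicon and travel through the configuration data to the case frame.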
20. A method for developing natural language processing systems of the type wherein the following steps are effected:
providing electronically encoded data representative of the natural language text,
lexically processing the electronically encoded data with reference to a lexicon, said lexicon being comprised of lexical entries wherein semantically significant lexical entries include one or more concepts, to produce lexical specifications,
interpreting the lexical specifications with reference to an electronic representation of an ATN grammar specification to produce configuration data, said configuration data including concepts obtained from the lexical specifications, and
semantically processing the configuration data with reference to case frame data base containing case frame templates each identified with a respective concept, to produce case frames in accordance with the concepts included in said configuration data;
said method comprising the step of modifying one or more of the lexicon data base, ATN grammar specification, and case frame data base.
Dependent claims: 21, 22, 23, 24
25. A method of processing natural language text, comprising
providing electronically encoded data representative of the natural language text,
lexically processing the electronically encoded data with reference to a lexicon data base, said lexicon data base being comprised of lexical entries all including syntactic category data and semantically significant lexical entries including one or more concepts, to produce lexical specifications,
interpreting the lexical specifications with reference to an electronic representation of a grammar specification to produce output data representative of a grammatical parse of the natural language text, said output data including concepts obtained from the lexical specifications, and
semantically processing the output data with reference to case frame templates each identified with a respective concept and including one or more roles associated with such concept, to produce case frames in accordance with the concepts included in said configuration data.
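The idea of a case frame template that is identified with a concept and carries that concept's roles can be sketched in Python as follows. This is an illustrative assumption, not the patent's representation: the `CaseFrameTemplate` class, the `GIVE` template, its role names, and the choice to leave unfilled roles as `None` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseFrameTemplate:
    """Hypothetical template: one concept plus the roles associated with it."""
    concept: str
    roles: tuple


# Hypothetical template store, keyed by concept.
TEMPLATES = {
    "GIVE": CaseFrameTemplate("GIVE", ("agent", "object", "recipient")),
}


def instantiate(template, fillers):
    """Build a case frame from a template; roles without a filler stay None."""
    unknown = set(fillers) - set(template.roles)
    if unknown:
        raise ValueError(
            f"roles not in template {template.concept}: {sorted(unknown)}")
    return {"concept": template.concept,
            **{role: fillers.get(role) for role in template.roles}}


# Fillers as they might come out of the parse (illustrative values).
frame = instantiate(TEMPLATES["GIVE"], {"agent": "TEACHER", "object": "BOOK"})
```

In this sketch the `recipient` role remains `None`, which marks information the sentence did not supply explicitly.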
39. Apparatus for processing natural language text, comprising
means for providing electronically encoded data representative of the natural language text;
lexicon data base means comprising a plurality of lexical entries, wherein said lexical entries are comprised of syntactic category data and semantically significant lexical entries are also comprised of one or more concepts;
means for lexically processing the electronically encoded data by reference to the lexicon data base means to produce lexical specifications;
parser means for interpreting the lexical specifications with reference to an Augmented Transition Network grammar specification to produce configuration data, said configuration data including concept data obtained from the lexical specifications;
case frame means for providing a plurality of case frame templates each identified with a respective concept; and
means for semantically processing the configuration data by reference to the case frame means to produce case frames in accordance with the concepts included in the configuration data.
Dependent claims: 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54
55. Apparatus for processing natural language text, comprising
means for providing electronically encoded data representative of the natural language text;
lexicon data base means comprising a plurality of lexical entries, wherein said lexical entries are comprised of syntactic category data and semantically significant lexical entries are also comprised of one or more concepts;
means for lexically processing the electronically encoded data by reference to the lexicon data base means to produce lexical specifications;
parser means for interpreting the lexical specifications with reference to an electronically encoded grammar specification to produce output data representative of a grammatical parse of the natural language text, said output data including concepts obtained from the lexical specifications;
case frame means for providing a plurality of case frame templates each identified with a respective concept and including one or more roles; and
means for semantically processing the configuration data by reference to the case frame means to produce case frames in accordance with the concepts included in the configuration data.
Dependent claims: 56, 57, 58, 59, 60, 61, 62, 63, 64, 65
Specification