Sentence analyzer
First Claim
1. Apparatus for the analysis of digitally encoded natural language, comprising means for receiving encoded data representative of a body of natural language words such encoded data including for each word at least a provisionally-assigned syntactic tag of the word, noun group identifying means operative on the encoded data for identifying noun groups and producing noun data structures representative thereof which augment the encoded data, verb group identifying means operative on said encoded data augmented by the noun data structures for identifying verb groups and for producing predicate data structures representative thereof which further augment the encoded data, and clausal analysis means operative on said encoded data augmented by said noun data structures and said predicate data structures for identifying well formed clauses of said body of words.
Abstract
An apparatus for the grammatical analysis of digitally encoded text material receives encoded text, annotates each word of the text with a tag, and processes the annotated text to identify basic syntactic units such as noun phrases and verb groups. A clausal analyzer then operates on the identified nominal and predicate structures to identify clause boundaries and clause types. During processing, feature agreement between parts of successively larger entities--noun phrases, predicates, and clauses--is successively derived. When an error is detected, an error message identifies the error and displays a suggested correction.
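The staged processing described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the tag set (`DET`, `ADJ`, `NOUN`, `AUX`, `VERB`), the `Token` and `Analysis` types, the `find_groups` helper, and the grouping rules are all hypothetical and chosen only to show each stage working on data augmented by the previous one.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    word: str
    tag: str  # provisionally assigned syntactic tag (hypothetical tag set)


@dataclass
class Analysis:
    tokens: list
    noun_groups: list = field(default_factory=list)  # (start, end) spans, end exclusive
    verb_groups: list = field(default_factory=list)  # (start, end) spans, end exclusive
    clauses: list = field(default_factory=list)      # (subject, verb, object-or-None)


def find_groups(tokens, allowed, heads, taken=()):
    """Return spans of runs of `allowed` tags that end in one of the `heads` tags.

    Positions already covered by spans in `taken` are skipped, so a later
    stage respects the structures produced by an earlier one.
    """
    covered = {k for start, end in taken for k in range(start, end)}
    spans, i, n = [], 0, len(tokens)
    while i < n:
        j = i
        while j < n and tokens[j].tag in allowed and j not in covered:
            j += 1
        k = j
        while k > i and tokens[k - 1].tag not in heads:
            k -= 1  # trim trailing words that cannot head the group
        if k > i:
            spans.append((i, k))
            i = k
        else:
            i = max(j, i + 1)
    return spans


def analyze(tokens):
    analysis = Analysis(tokens)
    # Stage 1: noun groups, found from the tags alone.
    analysis.noun_groups = find_groups(tokens, {"DET", "ADJ", "NOUN"}, {"NOUN"})
    # Stage 2: verb groups, found on data already augmented by the noun groups.
    analysis.verb_groups = find_groups(
        tokens, {"AUX", "VERB"}, {"VERB"}, taken=analysis.noun_groups
    )
    # Stage 3: clauses, built from the noun and predicate structures.
    for v_start, v_end in analysis.verb_groups:
        subject = next((g for g in analysis.noun_groups if g[1] == v_start), None)
        obj = next((g for g in analysis.noun_groups if g[0] == v_end), None)
        if subject is not None:
            analysis.clauses.append((subject, (v_start, v_end), obj))
    return analysis


words = ["the", "old", "dog", "has", "chased", "a", "cat"]
tags = ["DET", "ADJ", "NOUN", "AUX", "VERB", "DET", "NOUN"]
result = analyze([Token(w, t) for w, t in zip(words, tags)])
```

In this sketch each stage only adds to the `Analysis` object, which loosely mirrors the idea of successively augmenting the encoded data rather than replacing it.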
188 Citations
28 Claims
-
1. Apparatus for the analysis of digitally encoded natural language, comprising
means for receiving encoded data representative of a body of natural language words such encoded data including for each word at least a provisionally-assigned syntactic tag of the word,
noun group identifying means operative on the encoded data for identifying noun groups and producing noun data structures representative thereof which augment the encoded data,
verb group identifying means operative on said encoded data augmented by the noun data structures for identifying verb groups and for producing predicate data structures representative thereof which further augment the encoded data, and
clausal analysis means operative on said encoded data augmented by said noun data structures and said predicate data structures for identifying well formed clauses of said body of words.
-
10. Apparatus for the grammatical analysis of digitally encoded natural language text, such apparatus comprising
code annotation means for annotating each word of a text with data codes, such code annotation means including a stored dictionary of words together with associated data codes, wherein the data codes include tag codes representing possible grammatical or syntactic uses of a word, and feature codes representing agreement properties of a word and such code annotation means also including means for looking up words of the text in the dictionary to identify said tag and feature codes for annotating the words, and
noun phrase means operative on the identified tag codes of successive words for identifying noun phrases from the ordering of the tag codes.
-
23. A method of analyzing a digitally encoded body of natural language words for grammatical well-formedness, such method comprising, in order, the steps of
(I) annotating the words with candidate syntactic tags to produce a data structure,
(II) inspecting the ordering of the candidate tags to identify noun groups and producing noun data representative thereof thereby augmenting the data structure,
(III) processing the annotated words and the noun data to identify verb groups and producing predicate data representative thereof further augmenting the data structure, and
(IV) processing the further augmented data structure to identify well-formed clauses of said body of words.
- 24. A method according to claim 23, further comprising the step of displaying an error message when one of said steps of inspecting or processing detects probable errors.
-
26. An automated parser for processing digitally encoded natural language text, such parser comprising
means for annotating words of text with possible grammatical tags,
noun means operative on the annotated words of text for identifying noun phrases and for further annotating the text with noun phrase data,
predicate means operative on said annotated words of text and noun phrase data for identifying verb groups and for further annotating the text with verb group data, and
error detection means included in said means for annotating, said noun means and said predicate means, for identifying text errors during operation of the aforesaid three means and displaying an indication thereof, whereby errors in text are identified for correction during processing stages of ascending structural complexity thereby permitting in-process correction of errors and parsing of unedited text without breakdown of the processing operations or requiring excessive processing time.
-
27. A method for grammatical analysis of digitally encoded natural language text, such method comprising the steps of
providing an automated means for annotating each word of a text with data codes, such means including a stored dictionary of words together with associated data codes, wherein the data codes include tag codes representing possible grammatical or syntactic uses of a word, and feature codes representing agreement properties of a word and such automated means being operative for looking up words of the text in the dictionary to identify said tag and feature codes thereby annotating the words, and
providing an automated noun phrase identifier which successively inspects the order of the identified tag codes of successive words to identify noun phrases of the text.
-
28. A method for an automated parsing of digitally encoded natural language text, such method comprising the steps of
annotating words of text with possible grammatical tags,
identifying noun phrases and further annotating the text with noun phrase data,
identifying verb groups and further annotating the text with verb group data,
identifying text errors during the aforesaid three steps and displaying an error indication thereof, and
correcting the displayed errors in text during processing stages of ascending structural complexity thereby permitting automated parsing of unedited text without breakdown of the processing operations or requiring excessive processing time.
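As a rough illustration of the dictionary-based annotation and tag-order noun phrase identification recited in claims 10 and 27, together with the error reporting of claims 24, 26 and 28, the sketch below may help. Everything in it is an assumption made for the example: the `LEXICON` contents, the tag codes (`DET`, `ADJ`, `NOUN`, `VERB`), the feature codes (`sg`, `pl`), and the `DET ADJ* NOUN` pattern are hypothetical and are not taken from the patent.

```python
# Hypothetical mini-dictionary: word -> (possible tag codes, agreement feature codes)
LEXICON = {
    "a":     ({"DET"}, {"sg"}),
    "these": ({"DET"}, {"pl"}),
    "the":   ({"DET"}, {"sg", "pl"}),
    "old":   ({"ADJ"}, {"sg", "pl"}),
    "book":  ({"NOUN", "VERB"}, {"sg"}),
    "books": ({"NOUN", "VERB"}, {"pl"}),
    "fell":  ({"VERB"}, {"sg", "pl"}),
}


def annotate(words, errors):
    """Look each word up and attach its tag codes and feature codes."""
    annotated = []
    for pos, word in enumerate(words):
        entry = LEXICON.get(word.lower())
        if entry is None:
            # Record the problem and keep going instead of halting the analysis.
            errors.append(f"word {pos}: '{word}' not in dictionary")
            entry = (set(), set())
        annotated.append((word, entry[0], entry[1]))
    return annotated


def noun_phrases(annotated, errors):
    """Find DET ADJ* NOUN sequences from the ordering of the tag codes."""
    phrases, i, n = [], 0, len(annotated)
    while i < n:
        if "DET" in annotated[i][1]:
            j = i + 1
            while j < n and "ADJ" in annotated[j][1]:
                j += 1
            if j < n and "NOUN" in annotated[j][1]:
                members = annotated[i:j + 1]
                # Agreement holds if some feature code is shared by every word.
                shared = set.intersection(*(m[2] for m in members))
                if not shared:
                    text = " ".join(m[0] for m in members)
                    errors.append(f"words {i}-{j}: agreement conflict in '{text}'")
                phrases.append((i, j + 1))  # end index is exclusive
                i = j + 1
                continue
        i += 1
    return phrases


errors = []
annotated = annotate(["A", "old", "books", "fell", "quickly"], errors)
phrases = noun_phrases(annotated, errors)
for message in errors:
    print(message)
```

Collecting messages in a shared `errors` list, rather than raising an exception, is one simple way to let every stage report probable errors while the later stages still run on unedited text.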
Specification