Method for parsing natural language text with simple links
Abstract
A parser for natural language text is provided. The parser is trained by accessing a corpus of labeled utterances. The parser extracts details of the syntactic tree structures and part of speech tags from the labeled utterances. The details extracted from the tree structures include Simple Links which are the key to the improved efficiency of this new approach. The parser creates a language model using the details that were extracted from the corpus. The parser then uses the language model to parse utterances.
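The claims define a simple link as a source word, a distinct target word in the same utterance, and a link action drawn from Append, Insert Below, Insert Above, and Insert Above and Below. A minimal sketch of that record follows. The names `LinkAction`, `SimpleLink`, and `describe`, and the choice to store words as integer positions, are assumptions made for illustration and do not come from the patent.

```python
from dataclasses import dataclass
from enum import Enum


class LinkAction(Enum):
    # The four link actions named in the claims.
    APPEND = "Append"
    INSERT_BELOW = "Insert Below"
    INSERT_ABOVE = "Insert Above"
    INSERT_ABOVE_AND_BELOW = "Insert Above and Below"


@dataclass(frozen=True)
class SimpleLink:
    source: int          # position of the source word in the utterance (assumed encoding)
    target: int          # position of the target word in the utterance
    action: LinkAction

    def __post_init__(self):
        # The claims require the target word to be distinct from the source word.
        if self.source == self.target:
            raise ValueError("source and target must be distinct words")


def describe(words, links):
    """Render each link as 'source -> target [action]' for inspection."""
    return [
        f"{words[link.source]} -> {words[link.target]} [{link.action.value}]"
        for link in links
    ]


words = ["the", "dog", "barks"]
links = [SimpleLink(source=1, target=0, action=LinkAction.APPEND)]
print(describe(words, links))
```

A parse of an utterance is then an array of such records, one per source word, which matches the output described in the final step of the claims.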
43 Citations
10 Claims
1. A method for improving a processor in communication with a memory storing a program which uses a parser to parse natural language text, said method comprising:
(a) training said parser by accessing a corpus of labeled utterances;
(b) using said parser to extract details from said corpus, where said details include at least two simple links, where the simple link consists of a source word in the utterance, a target word in the utterance that is distinct from said source word, and a link action, said link action is chosen from a set of link actions which includes at least 2 of Append, Insert Below, Insert Above, and Insert Above and Below;
(c) said parser selects the target word and link action for a simple link by performing determination steps and repeating the determination steps until each source word has an associated simple link, where the determining steps include;
i. finding a common ancestor of a source word and a previous word, finding a left-most descendent of the common ancestor, and assigning the left-most descendent of the common ancestor as the target word of the simple link,
ii. determining if a parent of the source word is also a parent of the previous word, and if so, the link action selected is Append;
iii. determining if the parent of the source word is a child of the parent of the previous word, and if so, the link action selected is Insert Below;
iv. finding a child node of the common ancestor that is a parent or ancestor to the source word, and determining a position of the child node, which position will be numbered sequentially, and if the position of the child is 3 or greater, the link action selected is Insert Below;
v. determining if the parent of the source word is the same as the common ancestor and the parent of the target word is not the common ancestor, then the link action is Insert Above; and
vi. determining if none of the previous conditions exist, the link action selected is Insert Above and Below;
(d) said parser uses at least one statistical classifier by training said statistical classifier on said details that were extracted from said corpus;
(e) using said parser to create a language model using said details;
(f) using said language model to generate at least one new simple link for at least one source word in at least one additional utterance by using said statistical classifier to choose a target word and link action for the new simple link; and
(g) outputting the results of said parsing of the additional utterance as an array of simple links with the additional utterance.
Dependent Claims: 2, 3, 4
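The determination steps i through vi of claim 1 can be read as a decision cascade over a labeled syntax tree. The sketch below is one possible reading, not the patented implementation. It assumes that words are leaves hanging directly under phrase nodes, that "a common ancestor" means the lowest one, that child positions are numbered from 1, and that the first word of an utterance (which has no previous word) receives no link. The names `Node`, `choose_link`, and `extract_links` are hypothetical.

```python
class Node:
    """Tree node; a node with no children is a word (leaf)."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def leaves(node):
    """Return the words under node in left-to-right order."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def ancestors(node):
    """Return the chain of ancestors from the parent up to the root."""
    chain = []
    while node.parent is not None:
        node = node.parent
        chain.append(node)
    return chain


def lowest_common_ancestor(a, b):
    seen = {id(n) for n in ancestors(a)}
    return next(n for n in ancestors(b) if id(n) in seen)


def branch_position(common, source):
    """Step iv: 1-based position of the child of `common` above `source`."""
    node = source.parent
    if node is common:
        return None  # source hangs directly off the common ancestor
    while node.parent is not common:
        node = node.parent
    return common.children.index(node) + 1


def choose_link(source, previous):
    common = lowest_common_ancestor(source, previous)        # step i
    target = leaves(common)[0]                               # step i
    position = branch_position(common, source)
    if source.parent is previous.parent:                     # step ii
        action = "Append"
    elif source.parent.parent is previous.parent:            # step iii
        action = "Insert Below"
    elif position is not None and position >= 3:             # step iv
        action = "Insert Below"
    elif source.parent is common and target.parent is not common:  # step v
        action = "Insert Above"
    else:                                                    # step vi
        action = "Insert Above and Below"
    return target, action


def extract_links(root):
    """Repeat the determination steps for every word that has a previous word."""
    words = leaves(root)
    links = []
    for previous, source in zip(words, words[1:]):
        target, action = choose_link(source, previous)
        links.append((source.label, target.label, action))
    return links


# (S (NP the dog) (VP chased (NP a cat)))
tree = Node("S", [
    Node("NP", [Node("the"), Node("dog")]),
    Node("VP", [Node("chased"), Node("NP", [Node("a"), Node("cat")])]),
])
for link in extract_links(tree):
    print(link)
```

Under these assumptions, "dog" and "cat" are appended to the word before them, "a" is linked to "chased" with Insert Below, and "chased" is linked to "the" with Insert Above and Below. Because the previous word always lies to the left of the source word, the left-most descendant chosen in step i can never be the source word itself, which satisfies the distinctness requirement in step (b).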
5. A non-transitory computer-readable storage medium having instructions that develop a parser for use in natural language processing, the instructions comprising:
(a) training said parser by accessing a corpus of labeled utterances;
(b) using said parser to extract details from said corpus, where said details include at least two simple links, where the simple link consists of a source word in the utterance, a target word in the utterance that is distinct from said source word, and a link action, said link action is chosen from a set of link actions which includes at least 2 of Append, Insert Below, Insert Above, and Insert Above and Below;
(c) said parser selects the target word and link action for a simple link by performing determination steps and repeating the determination steps until each source word has an associated simple link, where the determining steps include;
i. finding a common ancestor of a source word and a previous word, finding a left-most descendent of the common ancestor, and assigning the left-most descendent of the common ancestor as the target word of the simple link,
ii. determining if a parent of the source word is also a parent of the previous word, and if so, the link action selected is Append;
iii. determining if the parent of the source word is a child of the parent of the previous word, and if so, the link action selected is Insert Below;
iv. finding a child node of the common ancestor that is a parent or ancestor to the source word, and determining a position of the child node, which position will be numbered sequentially, and if the position of the child is 3 or greater, the link action selected is Insert Below;
v. determining if the parent of the source word is the same as the common ancestor and the parent of the target word is not the common ancestor, then the link action is Insert Above; and
vi. determining if none of the previous conditions exist, the link action selected is Insert Above and Below;
(d) said parser uses at least one statistical classifier by training said statistical classifier on said details that were extracted from said corpus;
(e) using said parser to create a language model using said details;
(f) using said language model to generate at least one new simple link for at least one additional utterance where the new link action is from a set which includes Append, Insert Below, Insert Above, Insert Above and Below; and
(g) outputting the results of said parsing of the additional utterance as an array of simple links with the additional utterance.
Dependent Claims: 6, 7, 8
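Steps (d) through (g) train a statistical classifier on the extracted details and then use it to choose a target word and link action for the words of a new utterance. The claims do not say which classifier or which features are used, so the sketch below swaps in a deliberately simple stand-in: a frequency table that returns the most common outcome seen for a pair of part-of-speech tags. The class `FrequencyClassifier`, the function `parse`, the feature choice (previous tag, source tag), and the label encoding (distance back to the target word, link action) are all assumptions for illustration.

```python
from collections import Counter, defaultdict


class FrequencyClassifier:
    """Stand-in classifier: picks the most frequent label seen for a feature."""

    def __init__(self):
        self.counts = defaultdict(Counter)
        self.overall = Counter()

    def train(self, examples):
        for features, label in examples:
            self.counts[features][label] += 1
            self.overall[label] += 1

    def predict(self, features):
        # Fall back to the overall label distribution for unseen features.
        table = self.counts.get(features) or self.overall
        return table.most_common(1)[0][0]


def parse(tagged_words, classifier):
    """Return an array of (source index, target index, action) links."""
    links = []
    for i in range(1, len(tagged_words)):
        features = (tagged_words[i - 1][1], tagged_words[i][1])
        offset, action = classifier.predict(features)
        target = max(0, i - offset)  # clamp so the target stays inside the utterance
        links.append((i, target, action))
    return links


# Details as they might be extracted from a labeled corpus (invented sample data):
# ((previous tag, source tag), (distance back to target, link action))
details = [
    (("DT", "NN"), (1, "Append")),
    (("DT", "NN"), (1, "Append")),
    (("NN", "VBD"), (2, "Insert Above and Below")),
    (("VBD", "DT"), (1, "Insert Below")),
]

classifier = FrequencyClassifier()
classifier.train(details)
print(parse([("a", "DT"), ("bird", "NN"), ("sang", "VBD")], classifier))
```

The classifier must be trained before `predict` is called, since an empty table has no most common label. A production system would presumably use richer features and a stronger model; the point here is only the data flow from extracted details to new links.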
9. A method for providing an improved natural language parser to a memory unit of a computer system having a system processor, the method comprising the steps of:
(a) training said parser by accessing a corpus of labeled utterances;
(b) using said parser to extract details from said corpus, where said details include at least two simple links, where the simple link consists of a source word in the utterance, a target word in the utterance that is distinct from said source word, and a link action, said link action is chosen from a set of link actions which includes at least 2 of Append, Insert Below, Insert Above, and Insert Above and Below;
(c) said parser selects the target word and link action for a simple link by performing determination steps and repeating the determination steps until each source word has an associated simple link, where the determining steps include;
i. finding a common ancestor of a source word and a previous word, finding a left-most descendent of the common ancestor, and assigning the left-most descendent of the common ancestor as the target word of the simple link,
ii. determining if a parent of the source word is also a parent of the previous word, and if so, the link action selected is Append;
iii. determining if the parent of the source word is a child of the parent of the previous word, and if so, the link action selected is Insert Below;
iv. finding a child node of the common ancestor that is a parent or ancestor to the source word, and determining a position of the child node, which position will be numbered sequentially, and if the position of the child is 3 or greater, the link action selected is Insert Below;
v. determining if the parent of the source word is the same as the common ancestor and the parent of the target word is not the common ancestor, then the link action is Insert Above;
vi. determining if none of the previous conditions exist, the link action selected is Insert Above and Below;
(d) responding to a request from a service to transfer and temporarily store in a memory location a copy of a user utterance configured for effective use of a system processor; and
(e) parsing the utterance into an array of simple links by creating at least one simple link between at least two words of the new utterance, where a link action is selected from a set which includes Append, Insert Below, Insert Above, Insert Above and Below; and
(f) making the array of simple links available for future requests from a service.
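Steps (d) through (f) of claim 9 describe a service flow: an utterance arrives in a request, it is parsed into an array of simple links, and the array is kept available for later requests. A minimal sketch of that flow follows, assuming an in-memory dictionary as the store and whitespace tokenization. The names `LinkStore`, `handle`, and `append_to_previous` are hypothetical, and `append_to_previous` is a placeholder parser rather than the claimed one.

```python
from typing import Callable, Dict, List, Tuple

Link = Tuple[int, int, str]  # (source index, target index, link action)


class LinkStore:
    """Parses each distinct utterance once and serves stored results afterwards."""

    def __init__(self, parse: Callable[[List[str]], List[Link]]):
        self._parse = parse
        self._results: Dict[str, List[Link]] = {}
        self.parse_calls = 0

    def handle(self, utterance: str) -> List[Link]:
        if utterance not in self._results:
            self.parse_calls += 1
            self._results[utterance] = self._parse(utterance.split())
        # Return a copy so callers cannot modify the stored array.
        return list(self._results[utterance])


def append_to_previous(words: List[str]) -> List[Link]:
    """Placeholder parser: links every word to the word before it."""
    return [(i, i - 1, "Append") for i in range(1, len(words))]


store = LinkStore(append_to_previous)
first = store.handle("the dog barks")
second = store.handle("the dog barks")  # served from the store, not re-parsed
print(first, store.parse_calls)
```

Passing the parser in as a callable keeps the storage logic independent of how links are chosen, so a classifier-driven parser could be substituted without changing the store.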
10. A method for accessing a language model in a data storage system of a computer system having means for reading and writing data from the data storage system, relaying information, and accepting input generated by a user, parsing the user generated input, the method comprising the steps of:
(a) initially creating the language model by accessing a corpus of labeled utterances and generating details, said details consisting of a plurality of simple links, each simple link defining a relationship between at least two words of an utterance, where the relationship consists of a source word of an utterance, a target word in the utterance, and a link action, said link action is chosen from a set of link actions which includes Append, Insert Below, Insert Above, and Insert Above and Below;
(b) said parser selects the target word and link action for a simple link by performing determination steps and repeating the determination steps until each source word has an associated simple link, where the determining steps include;
i. finding a common ancestor of a source word and the previous word, finding a left-most descendent of the common ancestor, and assigning the left-most descendent of the common ancestor as the target word of the simple link;
ii. determining if a parent of the source word is also a parent of the previous word, and if so, the link action selected is Append;
iii. determining if the parent of the source word is a child of the parent of the previous word, and if so, the link action selected is Insert Below;
iv. finding a child node of the common ancestor that is a parent or ancestor to the source word, and determining a position of the child node, which position will be numbered sequentially, and if the position of the child is 3 or greater, the link action selected is Insert Below;
v. determining if the parent of the source word is the same as the common ancestor and the parent of the target word is not the common ancestor, then the link action is Insert Above;
vi. determining if none of the previous conditions exist, the link action selected is Insert Above and Below; and
(c) relaying the resulting array of simple links to further modules which perform specific computer operations.
Specification