Deep analysis of natural language questions for question answering system
Abstract
Creating training data for a natural language processing system may comprise obtaining natural language input, the natural language input annotated with one or more important phrases; and generating training instances comprising a syntactic parse tree of nodes representing elements of the natural language input augmented with the annotated important phrases. In another aspect, a classifier may be trained based on the generated training instances. The classifier may be used to predict one or more potential important phrases in a query.
18 Claims
1. A method for creating training data for a natural language processing system comprising:
- obtaining natural language input, the natural language input annotated with one or more important phrases;
- generating, by a processor, training instances comprising a syntactic parse tree of nodes representing elements of the natural language input augmented with the annotated important phrases, the generating comprising at least creating a syntactic node, which augments the syntactic parse tree output by an information extraction pipeline, for a named mention annotated as an important phrase, responsive to determining that no existing non-terminal nodes in the syntactic parse tree span exactly the named mention; and
- training a classifier based on the generated training instances, wherein the natural language processing system executes the classifier in determining a semantic meaning of a given query.
Dependent claims: 2, 3, 4, 5.
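The node-creation condition in claim 1 (add a node only when no existing non-terminal node spans exactly the named mention) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `Node` structure, the half-open token spans, the `MENTION` label, and the choice to skip mentions that cross constituent boundaries.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical parse-tree node covering tokens [start, end)."""
    label: str
    start: int
    end: int
    children: List["Node"] = field(default_factory=list)
    important: bool = False


def covers(node: Node, start: int, end: int) -> bool:
    """True if the node's span contains [start, end)."""
    return node.start <= start and end <= node.end


def find_exact(node: Node, start: int, end: int) -> Optional[Node]:
    """Return a non-terminal node spanning exactly [start, end), or None."""
    if node.children and (node.start, node.end) == (start, end):
        return node
    for child in node.children:
        if covers(child, start, end):
            return find_exact(child, start, end)
    return None


def augment(root: Node, start: int, end: int,
            label: str = "MENTION") -> Optional[Node]:
    """Mark the mention [start, end) as important, adding a node if needed."""
    existing = find_exact(root, start, end)
    if existing is not None:
        existing.important = True  # reuse the node the parser already built
        return existing
    if not covers(root, start, end):
        return None
    # Descend to the lowest non-terminal that still covers the mention.
    parent = root
    while True:
        inner = next((c for c in parent.children
                      if c.children and covers(c, start, end)), None)
        if inner is None:
            break
        parent = inner
    # Children of that node lying fully inside the mention.
    run = [c for c in parent.children if start <= c.start and c.end <= end]
    if not run or run[0].start != start or run[-1].end != end:
        return None  # mention crosses constituents; not handled in this sketch
    new_node = Node(label, start, end, children=run, important=True)
    first = next(i for i, c in enumerate(parent.children) if c is run[0])
    parent.children[first:first + len(run)] = [new_node]
    return new_node


# Flat noun phrase "the New York City mayor"; the mention is tokens 1..3.
tags = ["DT", "NNP", "NNP", "NNP", "NN"]
tree = Node("NP", 0, 5, [Node(t, i, i + 1) for i, t in enumerate(tags)])
mention = augment(tree, 1, 4)
print([c.label for c in tree.children])  # ['DT', 'MENTION', 'NN']
```

Calling `augment` a second time with the same span returns the node created the first time instead of adding another, because that node now spans the mention exactly.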
6. A method for natural language processing comprising steps of:
- receiving a natural language query;
- creating, by one or more processors, a query syntactic tree for the query;
- using, by one or more of the processors, a trained model comprising a model syntactic tree to predict if a node in the query syntactic tree is important, wherein the model syntactic tree comprises a syntactic parse tree of nodes representing syntactic elements of natural language input augmented with annotated important phrases, the model syntactic tree generated by at least creating a syntactic node, which augments the syntactic parse tree created by an information extraction pipeline for a named mention annotated as an important phrase, responsive to determining that no existing non-terminal nodes of the syntactic parse tree span exactly the named mention, wherein a semantic meaning of the natural language query is determined based on the model syntactic tree.
Dependent claims: 7, 8, 9.
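As a rough illustration of predicting whether a tree node is important, the sketch below trains a simple perceptron on per-node features and then scores every node of a tree. The claims shown here do not name a classifier type or a feature set, so the perceptron, the features (node label, parent label, capped span length), and all function names are assumptions chosen only to keep the example small.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical parse-tree node covering tokens [start, end)."""
    label: str
    start: int
    end: int
    children: List["Node"] = field(default_factory=list)
    important: bool = False


def walk(node: Node,
         parent: Optional[Node] = None) -> Iterator[Tuple[Node, Optional[Node]]]:
    """Yield every (node, parent) pair in pre-order."""
    yield node, parent
    for child in node.children:
        yield from walk(child, node)


def features(node: Node, parent: Optional[Node]) -> List[str]:
    """Assumed feature set: label, parent label, capped span length, bias."""
    return [
        f"label={node.label}",
        f"parent={parent.label if parent else 'ROOT'}",
        f"len={min(node.end - node.start, 5)}",
        "bias",
    ]


def score(weights: Dict[str, float], feats: List[str]) -> float:
    return sum(weights.get(f, 0.0) for f in feats)


def train(trees: List[Node], epochs: int = 10) -> Dict[str, float]:
    """Perceptron updates: adjust weights only when a node is misclassified."""
    weights: Dict[str, float] = defaultdict(float)
    for _ in range(epochs):
        for tree in trees:
            for node, parent in walk(tree):
                feats = features(node, parent)
                predicted = score(weights, feats) > 0
                if predicted != node.important:
                    delta = 1.0 if node.important else -1.0
                    for f in feats:
                        weights[f] += delta
    return dict(weights)


def predict(weights: Dict[str, float], tree: Node) -> List[Node]:
    """Return the nodes the model scores as important."""
    return [n for n, p in walk(tree) if score(weights, features(n, p)) > 0]


# One annotated training tree: "the [New York City] mayor".
training_tree = Node("NP", 0, 5, [
    Node("DT", 0, 1),
    Node("MENTION", 1, 4,
         [Node("NNP", i, i + 1) for i in range(1, 4)], important=True),
    Node("NN", 4, 5),
])
model = train([training_tree])
print([n.label for n in predict(model, training_tree)])  # ['MENTION']
```

A single training tree is far too little data for a usable model; the example only shows how annotated nodes in augmented trees could serve as positive training instances and how the same feature function is reused at prediction time.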
10. A non-transitory computer readable storage medium storing a program of instructions executable by a machine to perform a method of creating training data for a natural language processing system comprising:
- obtaining natural language input, the natural language input annotated with one or more important phrases;
- generating training instances comprising a syntactic parse tree of nodes representing elements of the natural language input augmented with the annotated important phrases, the generating comprising at least creating a syntactic node, which augments the syntactic parse tree created by an information extraction pipeline for a named mention annotated as an important phrase, responsive to determining that no existing non-terminal nodes in the syntactic parse tree span exactly the named mention; and
- training a classifier based on the generated training instances, wherein the natural language processing system executes the classifier in determining a semantic meaning of a given query.
Dependent claims: 11, 12, 13, 14, 15.
16. A system for creating training data for a natural language processing system comprising:
- a processor;
- a training module operable to execute on the processor and further operable to obtain natural language input, the natural language input annotated with one or more important phrases, the training module further operable to generate training instances comprising a syntactic parse tree of nodes representing elements of the natural language input augmented with the annotated important phrases; and
- a classifier built based on the training instances with plurality of features computed for the training instances, wherein the training module is operable to generate training instances by at least creating a syntactic node, which augments the syntactic parse tree created by an information extraction pipeline, for a named mention annotated as an important phrase, responsive to determining that no existing non-terminal nodes in the syntactic parse tree span exactly the named mention, wherein a semantic meaning of a given natural language query is determined based on the augmented syntactic parse tree.
Dependent claims: 17, 18.
Specification