Fact recognition system
Abstract
In accordance with methods and systems consistent with the present invention, an improved fact recognition system is provided that automatically learns from syntactic language examples and semantic language examples, thus facilitating development of the system. The language examples are simple enough to be provided by a lay person with little training, which removes the need for knowledge engineers. Furthermore, the learning performed by the improved fact recognition system yields a collection of probabilities that the system uses to recognize facts, typically more accurately than conventional systems.
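As an illustration of how a "collection of probabilities" might be learned from syntactic examples, the sketch below estimates rule probabilities from a toy set of syntactic trees by relative frequency. The tuple encoding of trees, the toy sentences, and the relative-frequency estimator are all assumptions made for illustration; the text above does not say which estimator the system uses.

```python
from collections import Counter

# Hypothetical toy treebank. Each tree is (label, child, child, ...);
# leaves are plain strings (words).
TREES = [
    ("S", ("NP", "Mary"), ("VP", ("V", "runs"))),
    ("S", ("NP", "John"), ("VP", ("V", "sees"), ("NP", "Mary"))),
]


def rules(tree):
    """Yield (parent_label, child_labels) for every internal node."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield label, rhs
    for child in children:
        if not isinstance(child, str):
            yield from rules(child)


def estimate(trees):
    """Relative-frequency estimate of P(children | parent) (an assumption)."""
    counts = Counter(rule for tree in trees for rule in rules(tree))
    totals = Counter()
    for (parent, _), n in counts.items():
        totals[parent] += n
    return {rule: n / totals[rule[0]] for rule, n in counts.items()}


MODEL = estimate(TREES)
```

Here `MODEL` maps each observed expansion, such as `("VP", ("V", "NP"))`, to the fraction of times its parent label expanded that way in the examples. A practical system would also need smoothing for expansions never seen in training, which this sketch omits.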
5 Claims
-
1. A method in a data processing system, comprising the steps of:
-
receiving a set of syntactic trees reflecting a syntactic structure of a first plurality of sentences;
generating a syntactic model reflecting a likelihood of the syntactic structure;
receiving semantically annotated sentences reflecting semantic information for a second plurality of sentences;
identifying a most likely syntactic structure for each of the semantically annotated sentences by using the syntactic model;
augmenting the identified syntactic structures to include the semantic information for each of the second plurality of sentences;
generating a statistical model reflecting a likelihood of both the semantic information and the identified syntactic structure of the second plurality of sentences;
receiving a new sentence containing a fact; and
recognizing the fact in the new sentence by using the statistical model.
-
-
2. The method of claim 1, further comprising the step of:
-
storing the fact in a data store.
-
-
3. The method of claim 1, wherein the recognizing step includes:
-
generating a plurality of parse trees for the new sentence, each parse tree reflecting likely semantic information and likely syntactic structure for at least a portion of the new sentence;
selecting from among the plurality of parse trees the parse tree having the greatest likelihood of matching the semantic information and the syntactic structure of the new sentence; and
examining the selected parse tree to recognize the fact.
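The three steps of claim 3 can be sketched as follows: score each candidate parse tree, keep the most likely one, then walk it to read off the fact. This is a minimal sketch under stated assumptions. The rule probabilities in `RULE_PROB`, the `SYNTAX/semantic` label encoding, and the floor probability for unseen rules are hypothetical and not taken from the claims.

```python
import math

# Hypothetical model: probability of each (parent, children) expansion.
# Labels of the form "NP/person" carry a semantic tag after the slash.
RULE_PROB = {
    ("S", ("NP/person", "VP")): 0.6,
    ("S", ("NP", "VP")): 0.4,
    ("NP/person", ("Mary",)): 0.5,
    ("NP", ("Mary",)): 0.1,
    ("VP", ("runs",)): 0.3,
}


def log_prob(tree, floor=1e-6):
    """Sum of log rule probabilities; unseen rules get a small floor value."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    total = math.log(RULE_PROB.get((label, rhs), floor))
    for child in children:
        if not isinstance(child, str):
            total += log_prob(child, floor)
    return total


def words(tree):
    """Return the words covered by a tree, left to right."""
    out = []
    for child in tree[1:]:
        out.extend([child] if isinstance(child, str) else words(child))
    return out


def facts(tree):
    """Collect (semantic_tag, text) for every node whose label has a '/tag'."""
    label, *children = tree
    found = []
    if "/" in label:
        found.append((label.split("/", 1)[1], " ".join(words(tree))))
    for child in children:
        if not isinstance(child, str):
            found.extend(facts(child))
    return found


CANDIDATES = [
    ("S", ("NP/person", "Mary"), ("VP", "runs")),
    ("S", ("NP", "Mary"), ("VP", "runs")),
]
BEST = max(CANDIDATES, key=log_prob)  # select the most likely parse tree
RECOGNIZED = facts(BEST)              # examine it to recognize the fact
```

Scoring in log space avoids numeric underflow when many small probabilities are multiplied. In a real parser the candidates would be generated by a search procedure rather than listed by hand.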
-
-
4. A method in a data processing system, comprising the steps of:
-
receiving syntactic language examples, wherein receiving syntactic language examples includes receiving first syntactic trees reflecting syntactic structure for a first plurality of sentences and generating a syntactic model containing probabilities indicating a likelihood of the syntactic structure for each of the first plurality of sentences;
receiving semantic language examples, wherein receiving semantic language examples includes receiving a second plurality of sentences with semantic annotations and generating second syntactic trees reflecting syntactic structure of the second plurality of sentences by using the syntactic model;
creating a model from both the syntactic language examples and the semantic language examples, wherein creating a model includes augmenting the second syntactic trees for the second plurality of sentences to include semantic information derived from the semantic annotations, wherein the augmented syntactic trees have nodes and augmenting the second syntactic trees includes generating probabilities for each of the nodes of the augmented syntactic trees;
using the model to determine a meaning of a sequence of words, wherein determining the meaning of a sequence of words includes recognizing at least one fact in the sequence of words; and
storing the recognized fact in a data store.
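The augmentation step recited in claim 4 can be sketched as relabeling the nodes of a syntactic tree whose word span matches a semantic annotation. The span-based annotation format, the `LABEL/tag` naming, and the example sentence are assumptions for illustration only. Generating probabilities for the augmented nodes is not shown here.

```python
def augment(tree, annotations, start=0):
    """Return (augmented_tree, end_position).

    A node whose word span (start, end) exactly matches an annotated span
    gets '/tag' appended to its label. If several nodes share one span
    (a unary chain), this simple sketch tags all of them.
    """
    label, *children = tree
    pos = start
    new_children = []
    for child in children:
        if isinstance(child, str):
            new_children.append(child)  # a word occupies one position
            pos += 1
        else:
            sub, pos = augment(child, annotations, pos)
            new_children.append(sub)
    tag = annotations.get((start, pos))
    if tag is not None:
        label = f"{label}/{tag}"
    return (label, *new_children), pos


# Hypothetical syntactic tree for "Mary joined Acme Corp".
TREE = ("S", ("NP", "Mary"), ("VP", ("V", "joined"), ("NP", "Acme", "Corp")))
# Hypothetical semantic annotations keyed by half-open word spans.
ANNOTATIONS = {(0, 1): "person", (2, 4): "organization"}

AUGMENTED, LENGTH = augment(TREE, ANNOTATIONS)
```

After this step, `AUGMENTED` carries both kinds of information in a single tree, so one set of node probabilities can cover syntax and semantics together.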
-
-
5. A computer-readable medium containing instructions for controlling a data processing system to perform a method comprising the steps of:
-
receiving syntactic language examples, wherein receiving syntactic language examples includes receiving first syntactic trees reflecting syntactic structure for a first plurality of sentences and generating a syntactic model containing probabilities indicating a likelihood of the syntactic structure;
receiving semantic language examples, wherein receiving semantic language examples includes receiving a second plurality of sentences with semantic annotations and generating second syntactic trees reflecting syntactic structure of the second plurality of sentences by using the syntactic model;
creating a model from both the syntactic language examples and the semantic language examples, wherein creating a model includes augmenting the second syntactic trees for the second plurality of sentences to include semantic information derived from the semantic annotations, wherein the augmented syntactic trees have nodes and augmenting the second syntactic trees includes generating probabilities for each of the nodes of the augmented syntactic trees;
using the model to determine a meaning of a sequence of words, wherein determining the meaning of a sequence of words includes recognizing at least one fact in the sequence of words; and
storing the recognized fact into a data store.
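The final storing step could look like the sketch below, which writes a recognized fact to an in-memory SQLite database. The claims do not specify what kind of data store is used, so the choice of SQLite, the `facts` table, and its three-column layout are all hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open a data store with a hypothetical 'facts' table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS facts ("
        "relation TEXT NOT NULL, "
        "subject TEXT NOT NULL, "
        "object TEXT NOT NULL, "
        "UNIQUE (relation, subject, object))"
    )
    return conn


def store_fact(conn, relation, subject, obj):
    """Insert one fact; an identical fact stored earlier is left untouched."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO facts VALUES (?, ?, ?)",
            (relation, subject, obj),
        )


STORE = open_store()
store_fact(STORE, "employee-of", "Mary", "Acme Corp")
store_fact(STORE, "employee-of", "Mary", "Acme Corp")  # duplicate is ignored
ROWS = STORE.execute("SELECT relation, subject, object FROM facts").fetchall()
```

The uniqueness constraint keeps the store from accumulating duplicates when the same fact is recognized in more than one sentence.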
-
Specification