Lexical answer type confidence estimation and application
Abstract
A system, method and computer program product for automatically estimating the confidence of a detected lexical answer type (LAT) to provide a more accurate overall score for an obtained candidate answer. A confidence “score” or value of each detected LAT is obtained, and the system and method combine the confidence score with a degree of match between a LAT and an AnswerType of the candidate answer to provide an improved overall score for the candidate answer.
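The combination described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the confidence and the degree of match are combined, so the confidence-weighted average below is an assumption, as are the names `DetectedLAT`, `type_score` and `match_degree`.

```python
from dataclasses import dataclass


@dataclass
class DetectedLAT:
    word: str
    confidence: float  # estimated accuracy of this LAT detection, in [0, 1]


def type_score(lats, match_degree):
    """Confidence-weighted average of LAT/AnswerType match degrees.

    `match_degree` maps a LAT word to a value in [0, 1] describing how well
    the candidate answer's AnswerType matches that LAT. The weighting scheme
    is an illustrative assumption, not the formula from the patent.
    """
    total_confidence = sum(lat.confidence for lat in lats)
    if total_confidence == 0:
        return 0.0
    weighted = sum(lat.confidence * match_degree.get(lat.word, 0.0) for lat in lats)
    return weighted / total_confidence


# Hypothetical example: two detected LATs, one far more trustworthy than the other.
lats = [DetectedLAT("president", 0.9), DetectedLAT("man", 0.1)]
score = type_score(lats, {"president": 1.0, "man": 0.0})
```

With this weighting, a mismatch on a low-confidence LAT lowers the candidate's score far less than a mismatch on a high-confidence one.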
21 Claims
1. A method for extracting features from a query comprising a text string, said method comprising:
identifying a syntactic pattern rule associated with said query, said pattern rule identified from a set of pattern rules that define common lexical answer types (LATs), a first feature of said extracted features comprising an identified pattern rule;
checking for prior instances of a detected lexical answer type (LAT) and computing a LAT word frequency based on said prior instances, a second feature of said extracted features comprising a computed frequency of a query word being a candidate LAT;
obtaining a parse tree data structure associated with said query;
identifying grammatical relations amongst words associated with said candidate LAT in said parse tree structure, a third feature of said extracted features comprising a part of speech of the candidate LAT,
determining whether the candidate LAT word co-references some other word in said query recognized as a LAT, a fourth feature of said extracted features comprising a co-reference information, and
applying a model to said extracted first, second, third and fourth features to produce a confidence value representing an estimated accuracy of a detected query LAT,
wherein one or more programmed processor devices performs said identifying a syntactic pattern rule, checking for prior instances, obtaining the parse tree data structure, identifying grammatical relations, and determining LAT word co-references, and model applying.
Dependent claims: 2, 3, 4, 5, 6, 7
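The steps of claim 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the pattern rules, prior counts, the `Token` structure, the logistic model and its weights are all hypothetical, and the part of speech and co-reference flag are supplied by hand rather than read from a real parse tree.

```python
import math
import re
from dataclasses import dataclass

# Hypothetical pattern rules: rule name -> regex whose group 1 is the LAT word.
PATTERN_RULES = {
    "this_noun": re.compile(r"\bthis (\w+)", re.IGNORECASE),
    "what_noun": re.compile(r"\b(?:what|which) (\w+)", re.IGNORECASE),
}

# Hypothetical prior statistics: word -> (times detected as LAT, times seen in queries).
PRIOR_COUNTS = {"president": (40, 50), "year": (5, 100)}

# Hypothetical weights for a logistic model; the claim only says "a model".
WEIGHTS = {"bias": -2.0, "rule_fired": 1.5, "lat_frequency": 2.0,
           "pos_is_noun": 1.0, "coref": 0.5}


@dataclass
class Token:
    word: str
    pos: str                      # part-of-speech tag, assumed to come from the parse tree
    coref_with_lat: bool = False  # assumed output of a co-reference step


def extract_features(query, candidate):
    """Build the four features named in the claim for one candidate LAT word."""
    rule = None
    for name, regex in PATTERN_RULES.items():
        match = regex.search(query)
        if match and match.group(1).lower() == candidate.word.lower():
            rule = name
            break
    as_lat, total = PRIOR_COUNTS.get(candidate.word.lower(), (0, 0))
    return {
        "pattern_rule": rule,                                # first feature
        "lat_frequency": as_lat / total if total else 0.0,   # second feature
        "pos": candidate.pos,                                # third feature
        "coref": candidate.coref_with_lat,                   # fourth feature
    }


def lat_confidence(features):
    """Apply the (assumed) logistic model to the features; returns a value in (0, 1)."""
    z = WEIGHTS["bias"]
    z += WEIGHTS["rule_fired"] * (features["pattern_rule"] is not None)
    z += WEIGHTS["lat_frequency"] * features["lat_frequency"]
    z += WEIGHTS["pos_is_noun"] * features["pos"].startswith("NN")
    z += WEIGHTS["coref"] * features["coref"]
    return 1.0 / (1.0 + math.exp(-z))


query = "This president signed the act."
features = extract_features(query, Token("president", "NN"))
confidence = lat_confidence(features)
```

In practice the weights would be learned from queries with annotated LATs rather than set by hand; the fixed values here only make the example self-contained.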
8. A system for extracting features from a query comprising a text string, said system comprising:
a memory storage device;
one or more processor devices, each in communication to said memory device and configured to perform a method to;
identify a syntactic pattern rule associated with said query, said pattern rule identified from a set of pattern rules that define common lexical answer types (LATs), a first feature of said extracted features comprising an identified pattern rule;
check for prior instances of a detected lexical answer type (LAT) and computing a LAT word frequency based on said prior instances, a second feature of said extracted features comprising a computed frequency of a query word being a candidate LAT;
obtain a parse tree data structure associated with said query;
identify grammatical relations amongst words associated with said candidate LAT in said parse tree structure, a third feature of said extracted features comprising a part of speech of the candidate LAT,
determine whether the candidate LAT word co-references some other word in said query recognized as a LAT, a fourth feature of said extracted features comprising a co-reference information, and
apply a model to said extracted first, second, third and fourth features to produce a confidence value representing an estimated accuracy of a detected query LAT.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer program product for automatically extracting features from a query comprising a text string, the computer program device comprising a storage medium readable by a processing circuit and storing instructions run by the processing circuit for performing a method, the method comprising:
identifying a syntactic pattern rule associated with said query, said pattern rule identified from a set of pattern rules that define common lexical answer types (LATs), a first feature of said extracted features comprising an identified pattern rule;
checking for prior instances of a detected lexical answer type (LAT) and computing a LAT word frequency based on said prior instances, a second feature of said extracted features comprising a computed frequency of a query word being a candidate LAT;
obtaining a parse tree data structure associated with said query;
identifying grammatical relations amongst words associated with said candidate LAT in said parse tree structure, a third feature of said extracted features comprising a part of speech of the candidate LAT,
determining whether the candidate LAT word co-references some other word in said query recognized as a LAT, a fourth feature of said extracted features comprising a co-reference information, and
applying a model to said extracted first, second, third and fourth features to produce a confidence value representing an estimated accuracy of a detected query LAT.
Dependent claims: 16, 17, 18, 19, 20, 21
Specification