Scoring candidates using structural information in semi-structured documents for question answering systems
First Claim
1. A computer-implemented method for automatically scoring candidate answers to questions in a question and answer system comprising the steps of:
receiving an input query string;
performing a query analysis upon said input query string to obtain query terms;
obtaining a candidate answer from at least one document in a data corpus using said query terms;
identifying one or more entity structures embedded in said at least one document;
extracting said one or more entity structures embedded in said at least one document, said embedded entity structures comprising user embedded tags or links to other documents;
determining a number of said entity structures having terms in said embedded tags or links that match query terms in the received input text query; and
computing a confidence score for said candidate answer as a function of said number of entity structures having terms in said embedded tags or links that match query terms in the query string, wherein at least one of the steps of the method is performed by a processor device.
1 Assignment
0 Petitions
Abstract
A system, program product, and methodology automatically scores candidate answers to questions in a question and answer system. In the candidate answer scoring method, a processor device performs one or more of receiving one or more candidate answers associated with a query string, the candidates obtained from a data source having semi-structured content; identifying one or more documents with semi-structured content from the data source having a candidate answer; and for each identified document: extracting one or more entity structures embedded in the identified document; determining a number of the entity structures in the identified document that appear in the received input query; and, computing a score for a candidate answer in the document as a function of the number. Overall system efficiency is improved by giving the correct candidate answers higher scores through leveraging context-dependent structural information such as links to other documents and embedded tags.
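The per-document step described in the abstract (extract embedded entity structures, count those that match the query, score as a function of the count) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the document is assumed to be HTML, user-embedded tags are assumed to be marked as `<span class="tag">`, the stopword list is a placeholder, and `score` is one hypothetical monotone function, since the source only says the score is "a function of the number".

```python
import re
from html.parser import HTMLParser

# Assumption: a tiny stopword list standing in for real query analysis.
STOPWORDS = {"a", "an", "the", "of", "did", "do", "which", "who", "what"}


class EntityStructureExtractor(HTMLParser):
    """Collects the text of links and of user-embedded tags.

    Assumption: user tags appear as <span class="tag">...</span>;
    the source does not specify a markup convention.
    """

    def __init__(self):
        super().__init__()
        self.structures = []  # text of each extracted entity structure
        self._buffer = None   # None while outside a link or tag element
        self._open = None     # name of the element currently being captured

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        is_link = tag == "a" and "href" in attrs
        is_user_tag = tag == "span" and attrs.get("class") == "tag"
        if self._buffer is None and (is_link or is_user_tag):
            self._open = tag
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._buffer is not None and tag == self._open:
            self.structures.append("".join(self._buffer).strip())
            self._buffer = None
            self._open = None


def terms(text):
    """Lowercased word tokens with stopwords removed."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def count_matching_structures(document_html, query):
    """Number of entity structures sharing at least one term with the query."""
    parser = EntityStructureExtractor()
    parser.feed(document_html)
    parser.close()
    query_terms = terms(query)
    return sum(1 for s in parser.structures if terms(s) & query_terms)


def score(match_count):
    """Hypothetical scoring function: grows with the count, bounded below 1."""
    return match_count / (1.0 + match_count)


doc = (
    '<p><a href="/wiki/Alfred_Hitchcock">Alfred Hitchcock</a> directed '
    '<a href="/wiki/Vertigo_(film)">Vertigo</a>, released by '
    '<a href="/wiki/Paramount">Paramount Pictures</a>. '
    '<span class="tag">thriller film</span></p>'
)
query = "Which thriller film did Hitchcock direct?"

matches = count_matching_structures(doc, query)
confidence = score(matches)
```

Here two of the four extracted structures ("Alfred Hitchcock" and "thriller film") share a term with the query, so the count is 2. Exact token overlap is the simplest matching rule; a real system would likely add normalization such as stemming.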
81 Citations
11 Claims
1. A computer-implemented method for automatically scoring candidate answers to questions in a question and answer system comprising the steps of:
receiving an input query string;
performing a query analysis upon said input query string to obtain query terms;
obtaining a candidate answer from at least one document in a data corpus using said query terms;
identifying one or more entity structures embedded in said at least one document;
extracting said one or more entity structures embedded in said at least one document, said embedded entity structures comprising user embedded tags or links to other documents;
determining a number of said entity structures having terms in said embedded tags or links that match query terms in the received input text query; and
computing a confidence score for said candidate answer as a function of said number of entity structures having terms in said embedded tags or links that match query terms in the query string, wherein at least one of the steps of the method is performed by a processor device.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A computer-implemented method for automatically scoring candidate answers to questions in a question and answer system comprising the steps of:
receiving one or more candidate answers associated with a query string, said candidates obtained from a data source having semi-structured content;
identifying one or more documents with one or more entity structures embedded therein from said data source having a candidate answer; and
for each identified document:
extracting one or more entity structures embedded in said identified document, said embedded entity structures comprising user embedded tags or links to other documents;
determining a number of said entity structures having terms in said embedded tags or links that match query terms in the received input query; and
computing a score for a candidate answer in said document as a function of said number of entity structures having terms in said embedded tags or links that match query terms in the query string, wherein at least one of the steps of the method is performed by a processor device.
Dependent claims: 9, 10, 11
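The "for each identified document" loop of claim 8 can be sketched as a ranking over several candidates. This is an illustration only: the wiki-style `[[Target|label]]` link syntax, the `{{tag:...}}` user-tag syntax, the `Candidate` type, and the use of the raw match count as the score are all assumptions made for the example, none of them taken from the patent text.

```python
import re
from dataclasses import dataclass

# Assumed markup: [[Target]] or [[Target|label]] for links,
# {{tag:...}} for user-embedded tags (both hypothetical conventions).
LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")
TAG = re.compile(r"\{\{tag:([^}]+)\}\}")


@dataclass
class Candidate:
    answer: str    # candidate answer text
    document: str  # semi-structured document the candidate came from


def tokenize(text):
    """Lowercased word tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def extract_entity_structures(document):
    """Visible text of every link, plus the text of every user tag."""
    links = [label or target for target, label in LINK.findall(document)]
    tags = TAG.findall(document)
    return links + tags


def score_candidates(candidates, query_terms):
    """Score each candidate by the number of matching entity structures.

    The score here is the raw count, the simplest possible
    "function of said number"; other functions would fit equally well.
    """
    scored = []
    for candidate in candidates:
        structures = extract_entity_structures(candidate.document)
        count = sum(1 for s in structures if tokenize(s) & query_terms)
        scored.append((candidate.answer, count))
    # Highest-scoring candidates first; ties keep their input order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


candidates = [
    Candidate(
        "Sydney",
        "[[Sydney]] hosts the [[Sydney Opera House|Opera House]] "
        "and a large [[harbour]].",
    ),
    Candidate(
        "Canberra",
        "[[Canberra]] is the [[capital city|capital]] of [[Australia]]. "
        "{{tag:Australian cities}}",
    ),
]
query_terms = {"capital", "australia"}

ranked = score_candidates(candidates, query_terms)
```

In this example the Canberra document has two structures that share a term with the query (the "capital" link label and the "Australia" link), while the Sydney document has none, so Canberra ranks first. The tag "Australian cities" does not match because the sketch compares exact tokens without stemming.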
Specification