System and method for domain adaptation in question answering

  • US 9,240,128 B2
  • Filed: 09/24/2011
  • Issued: 01/19/2016
  • Est. Priority Date: 05/14/2008
  • Status: Expired due to Fees
First Claim

1. A method for providing adaptation to a question answering system, wherein the question answering system has associated therewith a first corpus of data and a question-answer set, the question-answer set being a collection of questions and correct answers to these questions, such that each question has one or more correct answers associated with it, the method comprising the steps of:

    submitting a set of questions to the question answering system;

    receiving back from the question answering system a set of answers generated in response to the set of questions, the set of answers that are received back being based upon at least one document in the first corpus of data;

    comparing the set of answers received back from the question answering system to answers in the question-answer set;

    identifying, based on the comparison of the set of answers received back from the question answering system to answers in the question-answer set, a plurality of answers from the set of answers received back from the question answering system that are not correct;

    generating a plurality of groups by performing automated grouping on at least one of:

    (a) a plurality of questions from the question-answer set that correspond to the identified answers that are not correct; and

    (b) a plurality of answers from the question-answer set that correspond to the identified answers that are not correct;

    creating a collection of related terms associated with the groups;

    obtaining, from a second corpus of data, textual information about each of the related terms, wherein the second corpus of data is external relative to the first corpus of data;

    creating a plurality of textual resources from the obtained information, each of the plurality of textual resources being associated with one of the related terms;

    scoring each of the plurality of textual resources based on whether each textual resource is informative with respect to the at least one document in the first corpus of data; and

    adding at least one of the created textual resources to the first corpus of data, wherein the at least one of the created textual resources that is added to the first corpus of data comprises a subset of all of the created plurality of textual resources and wherein the at least one of the created textual resources that is added to the first corpus of data had been scored as more informative with respect to the at least one document in the first corpus of data than at least one of the other created textual resources that is not added to the first corpus of data.
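
The steps of claim 1 can be read as a pipeline, and a minimal Python sketch may make the data flow easier to follow. It is illustrative only and is not the patented implementation. Every name in it (`find_incorrect`, `group_questions`, `related_terms`, `build_resources`, `informativeness`, `adapt`, `top_k`) is hypothetical. The claim does not specify a grouping algorithm or a scoring function, so the sketch assumes a toy grouping by the most widely shared content word and an informativeness score equal to the number of distinct words in a resource that do not yet occur in the first corpus.

```python
import re
from collections import Counter, defaultdict

STOPWORDS = {"a", "an", "the", "of", "is", "in", "to", "and", "what", "who", "which"}

def tokens(text):
    """Lowercased content words of a string (toy tokenizer)."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]

def find_incorrect(qa_system, qa_set):
    """Submit each question; keep those whose returned answer is not a correct one."""
    wrong = []
    for question, correct_answers in qa_set.items():
        answer = qa_system(question).strip().lower()
        if answer not in {a.lower() for a in correct_answers}:
            wrong.append(question)
    return wrong

def group_questions(questions):
    """Assumed grouping: bucket each question under its most widely shared word."""
    doc_freq = Counter(t for q in questions for t in set(tokens(q)))
    groups = defaultdict(list)
    for q in questions:
        toks = tokens(q)
        key = max(toks, key=lambda t: (doc_freq[t], t)) if toks else ""
        groups[key].append(q)
    return dict(groups)

def related_terms(groups):
    """Collect the content words of all grouped questions as related terms."""
    return sorted({t for qs in groups.values() for q in qs for t in tokens(q)})

def build_resources(terms, second_corpus):
    """One textual resource per term, built from external passages mentioning it."""
    resources = {}
    for term in terms:
        passages = [p for p in second_corpus if term in tokens(p)]
        if passages:
            resources[term] = " ".join(passages)
    return resources

def informativeness(resource, first_corpus):
    """Assumed score: distinct words in the resource that the first corpus lacks."""
    known = {t for doc in first_corpus for t in tokens(doc)}
    return len(set(tokens(resource)) - known)

def adapt(qa_system, qa_set, first_corpus, second_corpus, top_k=1):
    """Run the pipeline and append the best-scoring resources to first_corpus."""
    wrong = find_incorrect(qa_system, qa_set)
    groups = group_questions(wrong)
    resources = build_resources(related_terms(groups), second_corpus)
    ranked = sorted(
        resources,
        key=lambda t: (-informativeness(resources[t], first_corpus), t),
    )
    # The claim adds only a subset, so at least one resource is always left out.
    keep = min(top_k, max(len(ranked) - 1, 0))
    added = ranked[:keep]
    first_corpus.extend(resources[t] for t in added)
    return added
```

Here `qa_system` is assumed to be any callable that maps a question string to an answer string, `qa_set` maps each question to its list of correct answers, and both corpora are plain lists of strings. Calling `adapt(qa_system, qa_set, first_corpus, second_corpus)` mutates `first_corpus` in place and returns the related terms whose resources were added.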
