Justifying passage machine learning for question and answer systems

US 9,613,317 B2
Filed: 03/28/2014
Issued: 04/04/2017
Est. Priority Date: 03/29/2013
Status: Expired due to Fees

Abstract

Mechanisms are provided for generating an answer to an input question. An input question is received and a set of candidate answers is generated along with, for each candidate answer in the set of candidate answers, a corresponding selection of one or more selected evidence portions from a corpus of information providing evidence in support of the candidate answer being a correct answer for the input question. The candidate answers are ranked based on an application of a justifying passage model (JPM) to the selected evidence portions for each of the candidate answers in the set of candidate answers. The JPM identifies whether a candidate answer is justified by a selected evidence passage corresponding to the candidate answer. A candidate answer is output as the correct answer for the input question based on the ranking of the candidate answers.
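
The flow summarized in the abstract can be pictured with a minimal Python sketch. Everything named here is an assumption made for illustration: `Candidate`, `toy_jpm`, and `rank_candidates` do not appear in the patent, and the keyword check inside `toy_jpm` is only a hypothetical stand-in for a trained JPM, which would be a learned model. Taking the best justification score across a candidate's passages is likewise one possible choice, not something the abstract prescribes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Candidate:
    answer: str
    evidence: List[str] = field(default_factory=list)


def toy_jpm(question: str, answer: str, passage: str) -> float:
    """Hypothetical stand-in for a trained JPM (not the patented model).

    Returns 1.0 when the passage mentions the answer and every longer
    question keyword, else 0.0.
    """
    text = passage.lower()
    keywords = [w for w in question.lower().rstrip("?").split() if len(w) > 4]
    justified = answer.lower() in text and all(k in text for k in keywords)
    return 1.0 if justified else 0.0


def rank_candidates(question, candidates, jpm=toy_jpm):
    """Rank candidates by their best justification score over their evidence."""
    scored = [
        (max((jpm(question, c.answer, p) for p in c.evidence), default=0.0), c.answer)
        for c in candidates
    ]
    # Highest score first; sort() is stable, so ties keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


question = "What is the capital of France?"
candidates = [
    Candidate("Lyon", ["Lyon is a large city in France."]),
    Candidate("Paris", ["Paris is the capital of France."]),
]
ranking = rank_candidates(question, candidates)
best_answer = ranking[0][1]  # the candidate output as the answer
```

In this toy run the passage for "Paris" states the answer outright, so it moves ahead of "Lyon" even though "Lyon" was generated first.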

21 Claims

1. A method, in a data processing system comprising a processor and a memory configured to implement a question and answer system (QA), for generating answers to an input question, comprising:
- training a justifying passage model (JPM) based on a JPM ground truth data structure that comprises a justification indicator for each question-answer-evidence passage (QAP) triplet in a plurality of QAP triplets of the JPM ground truth data structure, wherein the justification indicator indicates whether or not an answer in the QAP triplet is justified by the evidence passage of the QAP triplet as being a correct answer for a question in the QAP triplet, and wherein an answer of the QAP triplet is justified by the evidence passage of the QAP triplet when content of the evidence passage explicitly states the answer to be a correct answer for the question of the QAP triplet;
  
  receiving, in the data processing system, the input question;
  
  generating, by the data processing system, a set of candidate answers for the input question and, for each candidate answer in the set of candidate answers, a corresponding selection of one or more selected evidence portions from a corpus of information providing evidence in support of the candidate answer being a correct answer for the input question;
  
  ranking, by the data processing system, the candidate answers based on an application of the trained JPM to the selected evidence portions for each of the candidate answers in the set of candidate answers, wherein the JPM identifies whether a candidate answer is justified by a selected evidence portion corresponding to the candidate answer, and wherein application of the trained JPM to the selected evidence portions causes the ranking of the candidate answers to be modified based on whether or not a selected evidence portion is justifying of a corresponding candidate answer; and
  
  outputting, by the data processing system, a candidate answer in the set of candidate answers as the correct answer for the input question based on the ranking of the candidate answers.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 20):
  - 2. The method of claim 1, wherein the justification indicator for each QAP triplet in the plurality of QAP triplets is set to a justification value by a subject matter expert that evaluates the QAP triplet.
  - 3. The method of claim 1, wherein application of the JPM to the selected evidence portions comprises applying one or more weights of the JPM to evidence scores associated with the selected evidence portions to thereby generate a confidence score associated with the candidate answer, wherein the one or more weights of the JPM indicate a degree of relevance of the selected evidence portions to the input question.
  - 4. The method of claim 1, wherein application of the JPM to the selected evidence portions comprises:
    - ranking evidence portions in the selected evidence portions according to evidence scores generated based on the application of the JPM to the selected evidence portions; and
      
      selecting a subset of the ranked evidence portions in the selected evidence portions, for use in ranking the candidate answers, based on one or more filter criteria.
  - 5. The method of claim 1, wherein application of the JPM to the selected evidence portions comprises, for each candidate answer in the set of candidate answers:
    - weighting an evidence score associated with an evidence portion associated with the candidate answer based on a weight value, in the JPM, associated with the evidence portion; and
      
      calculating a weighted confidence score for the candidate answer based on a combination of weighted evidence scores for evidence portions associated with the candidate answer.
  - 6. The method of claim 1, wherein application of the JPM to the selected evidence portions comprises, for each candidate answer:
    - performing first context dependent scoring of the evidence portions corresponding to the candidate answer, wherein the first context dependent scoring scores the candidate answer based on a correlation between the candidate answer and the evidence portions corresponding to the candidate answer; and
      
      performing context independent scoring of the candidate answer, wherein the context independent scoring of the candidate answer scores the candidate answer independently of the evidence portions corresponding to the candidate answer.
  - 7. The method of claim 6, wherein application of the JPM to the selected evidence portions comprises:
    - weighting, based on the JPM, context independent scores associated with the candidate answers to generate weighted context independent scores;
      
      filtering, based on the weighted context independent scores of the candidate answers, candidate answers from the set of candidate answers to generate a subset of candidate answers; and
      
      performing, for each candidate answer in the subset of candidate answers, additional evidence portion retrieval operations for retrieving additional evidence portions from the corpus that support the candidate answer as being a correct answer for the input question.
  - 8. The method of claim 7, wherein application of the JPM to the selected evidence portions further comprises:
    - performing second context dependent scoring on the additional evidence portions retrieved as a result of the additional evidence portion retrieval operations;
      
      combining results of the first context dependent scoring and the second context dependent scoring; and
      
      generating a final ranking of candidate answers based on the combined results of the first context dependent scoring and the second context dependent scoring.
  - 9. The method of claim 6, wherein performing the first context dependent scoring and performing the context independent scoring are performed in parallel by parallel QA system pipeline paths on candidate answers generated by the QA system pipeline.
  - 20. The method of claim 1, wherein ranking the candidate answers based on the application of the trained JPM to the selected evidence portions for each of the candidate answers comprises, for each candidate answer:
    - applying the justifying passage model (JPM) to each selected evidence portion associated with the candidate answer to generate an evidence score for each selected evidence portion, wherein the JPM identifies whether the candidate answer is justified by the selected evidence portion and modifies the evidence score of the selected evidence portion according to whether or not the candidate answer is justified by the selected evidence portion; and
      
      generating a ranked listing of the selected evidence portions for the candidate answer by ranking the selected evidence portions according to evidence scores associated with each of the selected evidence portions; and
      
      selecting a subset of the selected evidence portions for use in ranking the candidate answers, by applying one or more filter criteria to the ranked listing of selected evidence portions.
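
The evidence handling recited in claims 4, 5, and 20 (adjusting evidence scores by justification, ranking and filtering the evidence, then combining weighted scores into a confidence value) might be sketched as follows. The boost and penalty factors, the threshold and top-k filter criteria, and the weighted-average combination are all assumptions chosen for illustration; the claims do not fix any of them, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# (passage_id, evidence_score, justified); all values are illustrative.
Evidence = Tuple[str, float, bool]
Scored = Tuple[str, float]


def adjust_scores(evidence: List[Evidence], boost=2.0, penalty=0.5) -> List[Scored]:
    """Raise scores of justifying passages and lower the others (claim 20)."""
    return [(pid, s * (boost if justified else penalty)) for pid, s, justified in evidence]


def top_evidence(scored: List[Scored], min_score=0.25, top_k=2) -> List[Scored]:
    """Rank evidence by score, then keep a filtered subset (claims 4 and 20)."""
    ranked = sorted(scored, key=lambda e: e[1], reverse=True)
    return [e for e in ranked if e[1] >= min_score][:top_k]


def weighted_confidence(scored: List[Scored], weights: Dict[str, float]) -> float:
    """Combine weighted evidence scores into one confidence value (claim 5).

    A weighted average is assumed here; missing weights default to 1.0.
    """
    total_weight = sum(weights.get(pid, 1.0) for pid, _ in scored)
    if total_weight == 0:
        return 0.0
    return sum(weights.get(pid, 1.0) * s for pid, s in scored) / total_weight


evidence = [("p1", 0.5, True), ("p2", 0.75, False), ("p3", 0.25, False)]
adjusted = adjust_scores(evidence)   # p1 -> 1.0, p2 -> 0.375, p3 -> 0.125
kept = top_evidence(adjusted)        # p3 falls below the threshold
confidence = weighted_confidence(kept, {"p1": 3.0, "p2": 1.0})
```

Note how the justified passage `p1` starts with a lower raw score than `p2` but ends up ranked first once the justification adjustment is applied.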

10. A computer program product comprising a computer readable storage medium having a computer readable program stored therein, wherein the computer readable program, when executed on a data processing system implementing a question and answer system (QA), causes the data processing system to:
- train a justifying passage model (JPM) based on a JPM ground truth data structure that comprises a justification indicator for each question-answer-evidence passage (QAP) triplet in a plurality of QAP triplets of the JPM ground truth data structure, wherein the justification indicator indicates whether or not an answer in the QAP triplet is justified by the evidence passage of the QAP triplet as being a correct answer for a question in the QAP triplet, and wherein an answer of the QAP triplet is justified by the evidence passage of the QAP triplet when content of the evidence passage explicitly states the answer to be a correct answer for the question of the QAP triplet;
  
  receive the input question;
  
  generate a set of candidate answers for the input question and, for each candidate answer in the set of candidate answers, a corresponding selection of one or more selected evidence portions from a corpus of information providing evidence in support of the candidate answer being a correct answer for the input question;
  
  rank the candidate answers based on an application of the trained JPM to the selected evidence portions for each of the candidate answers in the set of candidate answers, wherein the JPM identifies whether a candidate answer is justified by a selected evidence portion corresponding to the candidate answer, and wherein application of the trained JPM to the selected evidence portions causes the ranking of the candidate answers to be modified based on whether or not a selected evidence portion is justifying of a corresponding candidate answer; and
  
  output a candidate answer in the set of candidate answers as the correct answer for the input question based on the ranking of the candidate answers.
- Dependent claims (11, 12, 13, 14, 15, 16, 17, 21):
  - 11. The computer program product of claim 10, wherein the justification indicator for each QAP triplet in the plurality of QAP triplets is set to a justification value by a subject matter expert that evaluates the QAP triplet.
  - 12. The computer program product of claim 10, wherein the computer readable program further causes the data processing system to apply the JPM to the selected evidence portions at least by applying one or more weights of the JPM to evidence scores associated with the selected evidence portions to thereby generate a confidence score associated with the candidate answer, wherein the one or more weights of the JPM indicate a degree of relevance of the selected evidence portions to the input question.
  - 13. The computer program product of claim 10, wherein the computer readable program further causes the data processing system to apply the JPM to the selected evidence portions at least by:
    - ranking evidence portions in the selected evidence portions according to evidence scores generated based on the application of the JPM to the selected evidence portions; and
      
      selecting a subset of the ranked evidence portions in the selected evidence portions, for use in ranking the candidate answers, based on one or more filter criteria.
  - 14. The computer program product of claim 10, wherein the computer readable program further causes the data processing system to apply the JPM to the selected evidence portions at least by, for each candidate answer in the set of candidate answers:
    - weighting an evidence score associated with an evidence portion associated with the candidate answer based on a weight value, in the JPM, associated with the evidence portion; and
      
      calculating a weighted confidence score for the candidate answer based on a combination of weighted evidence scores for evidence portions associated with the candidate answer.
  - 15. The computer program product of claim 10, wherein the computer readable program further causes the data processing system to apply the JPM to the selected evidence portions at least by, for each candidate answer:
    - performing first context dependent scoring of the evidence portions corresponding to the candidate answer, wherein the first context dependent scoring scores the candidate answer based on a correlation between the candidate answer and the evidence portions corresponding to the candidate answer; and
      
      performing context independent scoring of the candidate answer, wherein the context independent scoring of the candidate answer scores the candidate answer independently of the evidence portions corresponding to the candidate answer.
  - 16. The computer program product of claim 15, wherein the computer readable program further causes the data processing system to apply the JPM to the selected evidence portions at least by:
    - weighting, based on the JPM, context independent scores associated with the candidate answers to generate weighted context independent scores;
      
      filtering, based on the weighted context independent scores of the candidate answers, candidate answers from the set of candidate answers to generate a subset of candidate answers; and
      
      performing, for each candidate answer in the subset of candidate answers, additional evidence portion retrieval operations for retrieving additional evidence portions from the corpus that support the candidate answer as being a correct answer for the input question.
  - 17. The computer program product of claim 16, wherein the computer readable program further causes the data processing system to apply the JPM to the selected evidence portions further at least by:
    - performing second context dependent scoring on the additional evidence portions retrieved as a result of the additional evidence portion retrieval operations;
      
      combining results of the first context dependent scoring and the second context dependent scoring; and
      
      generating a final ranking of candidate answers based on the combined results of the first context dependent scoring and the second context dependent scoring.
  - 21. The computer program product of claim 10, wherein ranking the candidate answers based on the application of the trained JPM to the selected evidence portions for each of the candidate answers comprises, for each candidate answer:
    - applying the justifying passage model (JPM) to each selected evidence portion associated with the candidate answer to generate an evidence score for each selected evidence portion, wherein the JPM identifies whether the candidate answer is justified by the selected evidence portion and modifies the evidence score of the selected evidence portion according to whether or not the candidate answer is justified by the selected evidence portion; and
      
      generating a ranked listing of the selected evidence portions for the candidate answer by ranking the selected evidence portions according to evidence scores associated with each of the selected evidence portions; and
      
      selecting a subset of the selected evidence portions for use in ranking the candidate answers, by applying one or more filter criteria to the ranked listing of selected evidence portions.
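
Claims 15 to 17 (mirroring claims 6 to 8) describe a two-pass flow: score each candidate with and without its evidence, filter on the weighted context independent score, retrieve more evidence for the survivors, score that evidence, and combine both context dependent passes into a final ranking. A minimal sketch under stated assumptions: the scorers and the retrieval function are passed in as hypothetical callables, scores are combined by simple addition, and the filter is a fixed threshold. None of these choices come from the claims.

```python
from typing import Callable, Dict, List


def two_pass_rank(
    candidates: Dict[str, List[str]],
    context_dependent: Callable[[str, str], float],
    context_independent: Callable[[str], float],
    retrieve_more: Callable[[str], List[str]],
    ci_weight: float = 1.0,
    ci_threshold: float = 0.5,
) -> List[str]:
    # First context dependent pass over the initially selected evidence.
    first_pass = {
        a: sum(context_dependent(a, p) for p in passages)
        for a, passages in candidates.items()
    }
    # Weighted context independent scores decide which candidates survive.
    weighted_ci = {a: ci_weight * context_independent(a) for a in candidates}
    survivors = [a for a in candidates if weighted_ci[a] >= ci_threshold]

    combined = {}
    for a in survivors:
        extra = retrieve_more(a)  # additional evidence retrieval
        second_pass = sum(context_dependent(a, p) for p in extra)
        combined[a] = first_pass[a] + second_pass  # assumed combination rule
    return sorted(survivors, key=lambda a: combined[a], reverse=True)


# Toy inputs; every value below is made up for illustration.
initial = {
    "Paris": ["Paris is the capital of France."],
    "Lyon": ["Lyon is in France.", "Lyon hosts a festival."],
    "Blue": ["The sky is blue."],
}
prior = {"Paris": 0.9, "Lyon": 0.8, "Blue": 0.1}
more = {"Paris": ["The capital, Paris, lies on the Seine.", "Paris became the capital in 987."]}

final_ranking = two_pass_rank(
    initial,
    context_dependent=lambda a, p: 1.0 if a.lower() in p.lower() else 0.0,
    context_independent=lambda a: prior[a],
    retrieve_more=lambda a: more.get(a, []),
)
```

Here "Blue" is dropped by the context independent filter before any further retrieval is spent on it, and the extra evidence found for "Paris" lifts it above "Lyon".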

18. A data processing system configured to implement a question and answer system (QA), comprising:
- a processor; and
  
  a memory coupled to the processor, wherein the memory comprises instructions which, when executed by the processor, cause the processor to:
  
  train a justifying passage model (JPM) based on a JPM ground truth data structure that comprises a justification indicator for each question-answer-evidence passage (QAP) triplet in a plurality of QAP triplets of the JPM ground truth data structure, wherein the justification indicator indicates whether or not an answer in the QAP triplet is justified by the evidence passage of the QAP triplet as being a correct answer for a question in the QAP triplet, and wherein an answer of the QAP triplet is justified by the evidence passage of the QAP triplet when content of the evidence passage explicitly states the answer to be a correct answer for the question of the QAP triplet;
  
  receive the input question;
  
  generate a set of candidate answers for the input question and, for each candidate answer in the set of candidate answers, a corresponding selection of one or more selected evidence portions from a corpus of information providing evidence in support of the candidate answer being a correct answer for the input question;
  
  rank the candidate answers based on an application of the trained JPM to the selected evidence portions for each of the candidate answers in the set of candidate answers, wherein the JPM identifies whether a candidate answer is justified by a selected evidence portion corresponding to the candidate answer, and wherein application of the trained JPM to the selected evidence portions causes the ranking of the candidate answers to be modified based on whether or not a selected evidence portion is justifying of a corresponding candidate answer; and
  
  output a candidate answer in the set of candidate answers as the correct answer for the input question based on the ranking of the candidate answers.
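
The training step shared by the independent claims can be read as ordinary supervised learning over labeled QAP triplets. The sketch below is one possible reading, not the patented method: it assumes a tiny logistic regression trained by stochastic gradient descent, with two hand-picked features (whether the passage contains the answer, and how many longer question keywords it contains). The feature set, the learning rate, the epoch count, and the names `features`, `train_jpm`, and `justification_score` are all hypothetical.

```python
import math
from typing import List, Tuple

# (question, answer, evidence passage, justification indicator 0/1)
QAP = Tuple[str, str, str, int]


def features(question: str, answer: str, passage: str) -> List[float]:
    """Assumed features: bias term, answer present, question keyword overlap."""
    text = passage.lower()
    words = [w for w in question.lower().rstrip("?").split() if len(w) > 4]
    overlap = sum(w in text for w in words) / len(words) if words else 0.0
    return [1.0, float(answer.lower() in text), overlap]


def _sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_jpm(ground_truth: List[QAP], epochs: int = 500, lr: float = 0.5) -> List[float]:
    """Fit logistic regression weights to the justification indicators."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for q, a, p, label in ground_truth:
            x = features(q, a, p)
            pred = _sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            # Gradient step on the log-loss for this one triplet.
            weights = [w + lr * (label - pred) * xi for w, xi in zip(weights, x)]
    return weights


def justification_score(weights, question, answer, passage) -> float:
    x = features(question, answer, passage)
    return _sigmoid(sum(w * xi for w, xi in zip(weights, x)))


# Toy ground truth; labels play the role of the expert-set indicators.
ground_truth = [
    ("What is the capital of France?", "Paris", "Paris is the capital of France.", 1),
    ("What is the capital of France?", "Paris", "Paris hosted the 1900 Olympics.", 0),
    ("What is the capital of France?", "Lyon", "The capital of France is on the Seine.", 0),
    ("Who wrote Hamlet?", "Shakespeare", "Shakespeare wrote Hamlet.", 1),
    ("Who wrote Hamlet?", "Marlowe", "Marlowe was a playwright.", 0),
]
jpm_weights = train_jpm(ground_truth)
```

With these toy labels the model has to learn that a passage counts as justifying only when it mentions the answer and also covers the question, so a passage that merely mentions the answer should score low.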

19. A method, in a data processing system comprising a processor and a memory configured to implement a question and answer system (QA), for generating answers to an input question, comprising:
- receiving, in the data processing system, the input question;
  
  generating, by the data processing system, a set of candidate answers for the input question and, for each candidate answer in the set of candidate answers, a corresponding selection of one or more selected evidence portions from a corpus of information providing evidence in support of the candidate answer being a correct answer for the input question;
  
  for each candidate answer in the set of candidate answers:
  
  applying a justifying passage model (JPM) to selected evidence portions associated with the candidate answer to generate an evidence score for each selected evidence portion, wherein the JPM identifies whether a candidate answer is justified by a selected evidence passage corresponding to the candidate answer and modifies the evidence score according to whether or not the candidate answer is justified by the selected evidence passage, and wherein the candidate answer is justified by the selected evidence portion when content of the selected evidence portion explicitly states the candidate answer to be a correct answer for the input question;
  
  selecting a subset of the selected evidence portions for use in ranking the candidate answers, based on one or more filter criteria; and
  
  ranking the candidate answer relative to other candidate answers in the set of candidate answers based on the selected subset of the ranked evidence portions; and
  
  outputting, by the data processing system, a candidate answer in the set of candidate answers as the correct answer for the input question based on the ranking of the candidate answers.

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Beamon, Bridget B.; Whitley, Michael D.; Yates, Robert L.
Primary Examiner(s): IPAKCHI, MARYAM M
Application Number: US14/228,830
Publication Number: US 20140297571A1
Time in Patent Office: 1,103 Days
Field of Search: None

CPC Class Codes

G06F 16/313   Selection or weighting of t...

G06F 16/61   Indexing; Data structures t...

G06N 20/00   Machine learning

G06N 5/022   Knowledge engineering; Know...
