Retrieving Text from a Corpus of Documents in an Information Handling System
Abstract
A mechanism is provided for retrieving candidate answers from a corpus of documents. The mechanism receives an input question for which an answer is sought and extracts features of the input question based on natural language processing of the input question. The mechanism executes a first search of the corpus of documents, based on a first subset of the extracted features and an initial evaluation of the utility of that first subset, to generate a subset of documents. The mechanism then executes a second search of a set of passages extracted from the subset of documents, based on a second subset of the extracted features and a reevaluation of the utility of that second subset, thereby forming a subset of passages. Finally, the mechanism generates query results from the subset of passages, from which candidate answers are identified.
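The two-stage search described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the stopword-based feature extraction, the rarity-based utility score, the sentence-level passage splitting, and the names `extract_features`, `evaluate_utility`, `select_subset`, `search`, `retrieve`, `k1`, and `k2` are all assumptions made for illustration only.

```python
import re

# Hypothetical stopword list; a real system would use full NLP analysis.
STOPWORDS = {"a", "an", "the", "of", "is", "in", "on", "what", "who", "which", "was"}


def tokens(text):
    """Lowercased word tokens of a text, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def extract_features(question):
    """Stand-in for NLP feature extraction: the question's content words."""
    return sorted(tokens(question) - STOPWORDS)


def evaluate_utility(features, texts):
    """Assumed utility measure: a feature is more useful the fewer texts contain it.
    Features that occur in no text get utility 0."""
    n = len(texts)
    token_sets = [tokens(t) for t in texts]
    utility = {}
    for f in features:
        df = sum(1 for ts in token_sets if f in ts)
        utility[f] = (n - df + 1) / n if df else 0.0
    return utility


def select_subset(utility, k):
    """Pick up to k features with the highest non-zero utility."""
    ranked = sorted(utility, key=lambda f: (-utility[f], f))
    return [f for f in ranked[:k] if utility[f] > 0]


def search(texts, features):
    """Return the texts containing at least one feature, most matches first."""
    wanted = set(features)
    scored = [(len(tokens(t) & wanted), t) for t in texts]
    scored = [(hits, t) for hits, t in scored if hits]
    scored.sort(key=lambda pair: -pair[0])  # stable: ties keep corpus order
    return [t for _, t in scored]


def retrieve(question, corpus, k1=2, k2=2):
    features = extract_features(question)
    # First search: utility evaluated against whole documents.
    first_subset = select_subset(evaluate_utility(features, corpus), k1)
    documents = search(corpus, first_subset)
    # Passages are approximated here as sentences of the matching documents.
    passages = [p.strip() for d in documents
                for p in re.split(r"(?<=[.!?])\s+", d) if p.strip()]
    # Second search: utility reevaluated against the extracted passages.
    second_subset = select_subset(evaluate_utility(features, passages), k2)
    return search(passages, second_subset)


corpus = [
    "Paris is the capital of France. It lies on the Seine.",
    "Berlin is the capital of Germany. It lies on the Spree.",
    "France borders Spain. Madrid is the capital of Spain.",
]
results = retrieve("What is the capital of France?", corpus, k1=1, k2=2)
```

In this toy corpus the first search uses only the most selective feature, so the document about Berlin is dropped before any passages are extracted. The second search then re-scores the features against the remaining sentences, and the sentence matching both features ranks first.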
20 Claims
1. A method, in a question and answer (QA) system comprising a processor and a memory, for retrieving candidate answers from a corpus of documents, the method comprising:
receiving, by the QA system, an input question for which an answer is sought;
extracting, by the QA system, features of the input question based on a natural language processing of the input question;
executing, by the QA system, a first search of the corpus of documents based on a first subset of the extracted features of the input question and an initial evaluation of a utility of the first subset of extracted features to generate a subset of documents matching the first subset of extracted features;
executing, by the QA system, a second search of a set of passages extracted from the subset of documents based on a second subset of the extracted features of the input question and a reevaluation of the utility of the second subset of extracted features thereby forming a subset of passages; and
generating, by the QA system, query results from the subset of passages from which a set of candidate answers for the input question are identified.
Dependent claims: 2, 3, 4, 5, 6, 7.
8. A computer program product comprising a computer readable storage medium having a computer readable program stored therein, wherein the computer readable program, when executed on a computing device, causes the computing device to:
receive an input question for which an answer is sought;
extract features of the input question based on a natural language processing of the input question;
execute a first search of a corpus of documents based on a first subset of the extracted features of the input question and an initial evaluation of a utility of the first subset of extracted features to generate a subset of documents matching the first subset of extracted features;
execute a second search of a set of passages extracted from the subset of documents based on a second subset of the extracted features of the input question and a reevaluation of the utility of the second subset of extracted features thereby forming a subset of passages; and
generate query results from the subset of passages from which a set of candidate answers for the input question are identified.
Dependent claims: 9, 10, 11, 12, 13, 14.
15. An apparatus comprising:
a processor; and
a memory coupled to the processor, wherein the memory comprises instructions which, when executed by the processor, cause the processor to:
receive an input question for which an answer is sought;
extract features of the input question based on a natural language processing of the input question;
execute a first search of a corpus of documents based on a first subset of the extracted features of the input question and an initial evaluation of a utility of the first subset of extracted features to generate a subset of documents matching the first subset of extracted features;
execute a second search of a set of passages extracted from the subset of documents based on a second subset of the extracted features of the input question and a reevaluation of the utility of the second subset of extracted features forming a subset of passages; and
generate query results from the subset of passages from which a set of candidate answers for the input question are identified.
Dependent claims: 16, 17, 18, 19, 20.
Specification