QUESTIONS AND ANSWERS GENERATION
First Claim
1. A computer-implemented method for generating questions and answers pairs based on any corpus of data, said method comprising:
generating, from a corpus of text data and a set of criteria, one or more data structures;
generating, based on said set of criteria and one or more data structures, an initial set of questions;
retrieving a set of documents based on said initial set of questions;
generating from said documents, candidate question and answers;
conforming said set of candidate questions and answers to satisfy said set of criteria;
analyzing a quality of answers of said conformed set of questions and answers;
generating further one or more answers based on said analyzing; and
outputting, based on said further one or more answers and said criteria, a final list question-answer (QA) pairs, wherein a program using a processor unit executes one or more of said generating, retrieving, generating, conforming, analyzing, generating and outputting.
Abstract
A system, method and/or computer program product for automatically generating questions and answers based on any corpus of data. The computer system, given a collection of textual documents, automatically generates collections of questions about the documents together with answers to those questions. In particular, such a process can be applied to the so-called ‘open’ domain, where neither the type of the corpus nor its ontology is given in advance. The system improves the exploration of large bodies of textual information. Applications implementing the system and method include new types of tutoring systems, educational question-answering games, and national security and business analysis systems.
27 Claims
1. A computer-implemented method for generating questions and answers pairs based on any corpus of data, said method comprising:
generating, from a corpus of text data and a set of criteria, one or more data structures;
generating, based on said set of criteria and one or more data structures, an initial set of questions;
retrieving a set of documents based on said initial set of questions;
generating from said documents, candidate question and answers;
conforming said set of candidate questions and answers to satisfy said set of criteria;
analyzing a quality of answers of said conformed set of questions and answers;
generating further one or more answers based on said analyzing; and
outputting, based on said further one or more answers and said criteria, a final list question-answer (QA) pairs, wherein a program using a processor unit executes one or more of said generating, retrieving, generating, conforming, analyzing, generating and outputting.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A system for question-answer list generation comprising:
a memory device; and
a processor connected to the memory device, wherein the processor performs steps of:
generating, from a corpus of text data and a set of criteria, one or more data structures;
generating, based on said set of criteria and one or more data structures, an initial set of questions;
retrieving a set of documents based on said initial set of questions;
generating from said documents, candidate question and answers;
conforming said set of candidate questions and answers to satisfy said set of criteria;
analyzing a quality of answers of said conformed set of questions and answers;
generating further one or more answers based on said analyzing; and
outputting, based on said further one or more answers and said criteria, a final list question-answer (QA) pairs.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 22, 23, 24
20. A computer program product for question-answer list generation, the computer program product comprising:
a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code comprising:
computer readable program code configured to generate, from a corpus of text data and a set of criteria, one or more data structures;
computer readable program code configured to generate, based on said set of criteria and one or more data structures, an initial set of questions;
computer readable program code configured to retrieve a set of documents based on said initial set of questions;
computer readable program code configured to generate from said documents, candidate question and answers;
computer readable program code configured to conform said set of candidate questions and answers to satisfy said set of criteria;
computer readable program code configured to analyze a quality of answers of said conformed set of questions and answers;
computer readable program code configured to generate further one or more answers based on said analyzing; and
computer readable program code configured to output, based on said further one or more answers and said criteria, a final list question-answer (QA) pairs.
Dependent claims: 21
25. A question answering (QA) system comprising:
a memory device; and
a processor connected to the memory device, wherein the processor performs steps of:
automatically preparing a list of question/answer pairs, each consisting of a question and an answer, said preparing comprising:
providing a plurality of word or phrases based on a criteria;
selecting, an entity from among said plurality of entity word or phrases;
retrieving one or more documents including said entity;
automatically creating a question by selecting a predicate in a document within which the entity appears, and successively adding additional predicates to ensure that the entity is uniquely determined by the predicate and any additional predicates; and
setting the question to the predicate and the additional list of predicates retrieved, and setting the answer to the entity.
Dependent claims: 26, 27
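The question-creation step recited in claim 25 (select a predicate, then keep adding predicates until the entity is uniquely determined) can be illustrated with a minimal sketch. This is not the patented implementation. The `FACTS` table, the function name `build_qa_pair`, the alphabetical predicate order, and the rule of skipping predicates that exclude no remaining candidate are all assumptions made for illustration. A real system would presumably draw its predicates from the retrieved documents.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy data standing in for predicates extracted from documents.
FACTS: Dict[str, Set[str]] = {
    "Mercury": {"is a planet", "has no moons", "is smaller than Earth"},
    "Venus": {"is a planet", "has no moons", "is similar in size to Earth"},
    "Mars": {"is a planet", "has two moons", "is smaller than Earth"},
}


def build_qa_pair(entity: str, facts: Dict[str, Set[str]]) -> Tuple[List[str], str]:
    """Return (question predicates, answer) such that only `entity` satisfies them all."""
    chosen: List[str] = []
    candidates = set(facts)  # every known entity is a candidate at the start
    # Assumption: predicates are tried in alphabetical order for determinism.
    for predicate in sorted(facts[entity]):
        narrowed = {e for e in candidates if predicate in facts[e]}
        if len(narrowed) < len(candidates):
            # Assumption: keep a predicate only if it rules out some candidate.
            chosen.append(predicate)
            candidates = narrowed
        if candidates == {entity}:
            # The entity is now uniquely determined: question = predicates, answer = entity.
            return chosen, entity
    raise ValueError(f"no predicate set uniquely determines {entity!r}")


predicates, answer = build_qa_pair("Mercury", FACTS)
print("What " + " and ".join(predicates) + "?", "->", answer)
```

In this toy table a single predicate does not isolate Mercury, so a second one is added. Mars, by contrast, is determined by "has two moons" alone.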
Specification