COST-BENEFIT APPROACH TO AUTOMATICALLY COMPOSING ANSWERS TO QUESTIONS BY EXTRACTING INFORMATION FROM LARGE UNSTRUCTURED CORPORA
Abstract
The present invention relates to a system and methodology to facilitate extraction of information from large unstructured corpora, such as the World Wide Web and/or other unstructured sources. Information in the form of answers to questions can be automatically composed from such sources via probabilistic models and cost-benefit analyses that guide the resource-intensive information-extraction procedures employed by a knowledge-based question answering system. The analyses can leverage predictions, provided by Bayesian or other statistical models, of the ultimate quality of answers generated by the system. Such predictions, when coupled with a utility model, can provide the system with the ability to make decisions about the number of queries issued to a search engine (or engines), given the cost of queries and the expected value of query results in refining an ultimate answer. Given a preference model, the information-extraction actions with the highest expected utility can be taken. In this manner, the accuracy of answers to questions can be balanced against the cost of the information extraction and analysis needed to compose the answers.
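One way to read the decision described in the abstract is as an expected-utility maximization over the number of queries. The formula below is only an illustrative sketch, not taken from the patent: it assumes a linear utility in which a correct answer is worth a fixed value $V$, each query costs a fixed amount $c$, $N$ candidate queries are available, and $\hat{p}(\text{correct} \mid n, E)$ stands for the statistical model's predicted probability of a correct answer after issuing $n$ queries, given observed evidence $E$ about the question. All of these symbols are assumptions introduced here.

```latex
n^{*} \;=\; \arg\max_{0 \le n \le N}
  \Big[\, V \cdot \hat{p}(\text{correct} \mid n, E) \;-\; c \cdot n \,\Big]
```

Under this reading, issuing one more query is worthwhile only while the predicted gain in accuracy, scaled by $V$, exceeds the per-query cost $c$.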
20 Claims
1. A normalization system, comprising:
a processor;
a memory communicatively coupled to the processor, the memory having stored therein computer-executable instructions to implement the system, including:
an interface component that processes questions posed by users, the questions corresponding to a heterogeneous knowledge base;
a dialog component that requests users to reformulate questions based upon a cost-benefit analysis;
a normalization component that applies a utility model that predicts accuracy results to provide a regularized understanding of the knowledge base based on at least one of the processed questions or the reformulated questions; and
an answer composer component that employs the predicted accuracy of results to generate answers to the processed questions.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. A method to normalize a database, comprising:
employing a processor executing computer executable instructions embodied on a computer readable storage medium to perform the following acts:
receiving a question from a user;
automatically forming a set of queries from the question received from the user, each query is assigned a different weight;
performing a cost-benefit analysis on the set of queries to generate a query subset, wherein the cost-benefit analysis factors cost of a quantity of queries to include in the query subset versus the accuracy of the results returned from the quantity of queries included in the query subset;
executing the query subset on the database to provide a set of results; and
providing an answer to the user based upon the set of results.
Dependent claims: 16, 17, 18, 19
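The acts recited in claim 15 can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the toy `SNIPPETS` list stands in for the database, `form_queries` uses a deliberately naive rewrite rule, `predicted_accuracy` is a caller-supplied stand-in for a statistical accuracy model, and the linear utility (`answer_value` times accuracy, minus `cost_per_query` per query) is an assumption chosen for simplicity.

```python
from collections import Counter
from typing import Callable, Optional

STOPWORDS = {"where", "is", "the", "in", "to", "a", "of"}

# Hypothetical stand-in for the database named in the claim.
SNIPPETS = [
    "the louvre museum is in paris",
    "paris is home to the louvre museum",
    "the louvre pyramid opened in 1989",
]

Query = tuple[str, float]  # (query text, weight)


def form_queries(question: str) -> list[Query]:
    """Form a set of query rewrites, each assigned a different weight."""
    words = question.lower().rstrip("?").split()
    content = [w for w in words if w not in STOPWORDS]
    # Full content phrase first, then single words; drop duplicates and blanks.
    rewrites = [q for q in dict.fromkeys([" ".join(content), *content]) if q]
    # More specific rewrites come first and receive larger weights.
    return [(q, float(len(rewrites) - i)) for i, q in enumerate(rewrites)]


def select_subset(
    queries: list[Query],
    predicted_accuracy: Callable[[int], float],
    answer_value: float,
    cost_per_query: float,
) -> list[Query]:
    """Pick the query count k with the best net utility (assumed linear)."""
    best_k = max(
        range(len(queries) + 1),
        key=lambda k: answer_value * predicted_accuracy(k) - cost_per_query * k,
    )
    ranked = sorted(queries, key=lambda q: q[1], reverse=True)
    return ranked[:best_k]


def answer(
    question: str,
    predicted_accuracy: Callable[[int], float],
    answer_value: float = 10.0,
    cost_per_query: float = 2.0,
) -> Optional[str]:
    """Execute the selected queries and return the top weighted candidate."""
    subset = select_subset(
        form_queries(question), predicted_accuracy, answer_value, cost_per_query
    )
    votes: Counter = Counter()
    for text, weight in subset:
        query_words = set(text.split())
        for snippet in SNIPPETS:
            if text in snippet:  # naive substring match
                for word in snippet.split():
                    if word not in STOPWORDS and word not in query_words:
                        votes[word] += weight
    return votes.most_common(1)[0][0] if votes else None


def diminishing_returns(k: int) -> float:
    """Hypothetical accuracy curve: each extra query helps half as much."""
    return 1.0 - 0.5 ** k


print(answer("Where is the Louvre museum?", diminishing_returns))  # paris
```

With these illustrative numbers, three queries are formed but only the two highest-weighted ones are executed, because the third query's predicted accuracy gain is worth less than its cost.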
20. A system to facilitate database normalization, comprising:
a processor;
a memory communicatively coupled to the processor, the memory having stored therein computer-executable instructions to implement the system, including:
means for receiving a question from a user;
means for automatically forming a set of queries from the question received from the user;
means for generating query subset from the set of queries, wherein the means for generating the query subset dynamically determines a quantity of queries from the set of queries to include in a query subset based upon a cost-benefit analysis that factors cost of the quantity of queries to include in the query subset versus the accuracy of the results returned from the quantity of queries included in the query subset;
means for executing the query subset on the database to provide a set of results; and
means for providing an answer to the user based upon the set of results.
Specification