System and method for improving answer relevance in meta-search engines
Abstract
A user of a meta-search engine submits a query formulated with operators defining relationships between keywords. Information sources are selected for interrogation by the user or by the meta-search engine. If necessary, the query is translated for each selected source to adapt the operators of the query to a form accepted by that source. The query is submitted to each selected source and answers are retrieved from each source as a summary of each document found that satisfies the query. The answers are post-filtered from each source to determine if the answers satisfy the originally formulated query. Answers that satisfy the query are displayed as a list of selectable document summaries. The analysis includes computing a subsumption ratio of filtered answers to answers received that satisfy a translated query. The subsumption ratio is used to improve the accuracy of subsequent queries submitted by the user to the meta-search engine.
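The translate-then-post-filter flow summarized above can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the `NEAR` operator, its fallback to `AND`, the three-word proximity window, and the helper names `translate`, `satisfies_and`, `satisfies_near`, and `post_filter` are all hypothetical.

```python
# Hypothetical sketch: a source lacks NEAR, so the query is translated to AND
# and the returned summaries are post-filtered against the original query.

def translate(operator, supported):
    """Return the operator if the source supports it, else an assumed looser fallback."""
    fallback = {"NEAR": "AND"}  # assumed substitution table
    return operator if operator in supported else fallback[operator]

def satisfies_and(text, a, b):
    """True if both keywords occur anywhere in the summary."""
    words = text.lower().split()
    return a in words and b in words

def satisfies_near(text, a, b, window=3):
    """True if a and b occur within `window` words of each other (assumed NEAR semantics)."""
    words = text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)

def post_filter(answers, a, b):
    """Keep only the answers that satisfy the original (NEAR) query."""
    return [s for s in answers if satisfies_near(s, a, b)]

summaries = [
    "meta search engines merge results",
    "search tools exist but few are meta in any real sense of the word",
    "cooking with garlic",
]
op = translate("NEAR", supported={"AND", "OR"})
# Simulate what the source returns for the translated (AND) query.
received = [s for s in summaries if satisfies_and(s, "meta", "search")]
kept = post_filter(received, "meta", "search")
```

Here the source returns two summaries for the translated query, and post-filtering keeps only the one in which the keywords are actually close together.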
21 Claims
-
1. A method for improving search results from a meta-search engine that queries information sources containing document collections, comprising:
-
receiving an original query with user selected keywords and user selected operators;
the user selected operators defining relationships between the user selected keywords;
identifying a set of information sources to be interrogated using the original query by performing one of;
(a) receiving a set of user selected information sources, (b) automatically identifying a set of information sources, and (c) performing a combination of (a) and (b);
the set of information sources identifying two or more information sources;
translating at least one of the user selected operators of the original query that is not supported by one of the information sources in the set of information sources to an alternate operator that is supported by the one of the information sources in the set of information sources;
submitting a selected one of the translated queries and the original query to each information source in the set of information sources;
receiving answers from each information source for the query submitted thereto;
filtering each set of answers received from each information source that satisfy one of the translated queries by removing the answers that do not satisfy the original query;
for each filtered set of answers, computing a subsumption ratio of the number of filtered answers that satisfy the original query to the number of answers that satisfy the translated query; and
using each computed subsumption ratio to perform one of;
(d) reformulating a translated query;
(e) modifying information sources in the set of information sources automatically identified at (b); and
(f) performing a combination of (d) and (e).
-
-
2. The method according to claim 1, further comprising:
-
providing to the user the answers from the set of information sources as a list of selectable document summaries;
requesting the user to provide positive feedback and negative feedback to documents in the list of selectable document summaries;
receiving feedback for at least one document in the list of document summaries;
computing document relevance for documents in the list of document summaries by performing one of;
(g) decreasing the relevance of a document with terms forming part of documents given negative feedback by the user that do not appear in documents given positive feedback by the user;
(h) increasing the relevance of a document with terms that form part of documents given positive feedback by the user; and
(i) a combination of (g) and (h); and
providing to the user the answers ranked according to the computed document relevance.
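One way to read options (g) and (h) is as a simple term-overlap score. The sketch below assumes a weight of one per matching term and whitespace tokenization; the function names `terms` and `rank_by_feedback` and the sample summaries are hypothetical and are not taken from the claims.

```python
def terms(text):
    """Lower-cased set of whitespace-separated terms in a summary."""
    return set(text.lower().split())

def rank_by_feedback(summaries, positive, negative):
    """Order document ids by an assumed feedback score.

    +1 for each term shared with positively rated documents (option h),
    -1 for each term that appears only in negatively rated documents (option g).
    """
    pos_terms = set().union(*(terms(summaries[d]) for d in positive))
    neg_terms = set().union(*(terms(summaries[d]) for d in negative))
    neg_only = neg_terms - pos_terms

    def score(doc_id):
        t = terms(summaries[doc_id])
        return len(t & pos_terms) - len(t & neg_only)

    return sorted(summaries, key=score, reverse=True)

summaries = {
    "a": "python web framework",
    "b": "python snake habitat",
    "c": "snake venom research",
    "d": "web framework tutorial",
}
ranked = rank_by_feedback(summaries, positive=["a"], negative=["b"])
```

In this example "d" rises because it shares terms with the positively rated document, while "b" and "c" fall because they contain terms seen only in the negatively rated one.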
-
-
3. The method according to claim 2, further comprising:
-
refining at least one of the translated queries by performing one of;
(j) eliminating terms forming part of documents given negative feedback by the user that do not appear in documents given positive feedback by the user;
(k) adding terms not in the query that form part of documents given positive feedback by the user; and
(l) a combination of (j) and (k);
proposing to the user the refined query.
-
-
4. The method according to claim 1, further comprising:
-
providing to the user the answers from the set of information sources as a list of selectable document summaries;
requesting the user to provide positive feedback and negative feedback to documents in the list of selectable document summaries;
receiving feedback for at least one document in the list of document summaries;
refining at least one of the translated queries by performing one of;
(g) eliminating terms forming part of documents given negative feedback by the user that do not appear in documents given positive feedback by the user;
(h) adding terms not in the query that form part of documents given positive feedback by the user; and
(i) a combination of (g) and (h); and
proposing to the user the refined query.
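The refinement in options (g) and (h) of this claim can be illustrated with plain set operations. The sketch is an assumption-laden simplification: it treats documents as bags of whitespace-separated terms and adds every new positive term, whereas a practical system would presumably select only a few. The name `refine_query` is hypothetical.

```python
def refine_query(query_terms, positive_docs, negative_docs):
    """Return a refined term list from user feedback (illustrative only)."""
    pos = set().union(*(set(d.lower().split()) for d in positive_docs))
    neg = set().union(*(set(d.lower().split()) for d in negative_docs))
    # (g) drop query terms found in negative documents but not in positive ones
    kept = [t for t in query_terms if t not in (neg - pos)]
    # (h) add terms from positive documents that the query does not contain yet
    added = sorted(pos - set(query_terms))
    return kept + added

refined = refine_query(
    ["jaguar", "habitat"],
    positive_docs=["jaguar car engine"],
    negative_docs=["jaguar cat habitat"],
)
```

The term "habitat" is dropped because it occurs only in the negatively rated document, and "car" and "engine" are proposed as additions.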
-
-
5. The method according to claim 1, wherein said translating further comprises substituting an operator O that is not supported, by an information source in the set of information sources, with at least one supported operator O′ that provides substantially similar answers as the operator O.
-
6. The method according to claim 5, wherein said computing computes an operator subsumption ratio r(O,O′) of the number of filtered answers S and the number of answers S′ received that satisfy the translated query as follows;
r(O,O′)=|S|/|S′|.
-
7. The method according to claim 5, wherein said computing computes a query subsumption ratio r(Q) of the number of filtered answers S and the number of answers received S′ that satisfy the translated query as follows;
r(Q)=|S|/|S′|,
wherein the translated query is reformulated when the query subsumption ratio r(Q) is less than a predetermined threshold value, where Q is a query that contains a set of operators O1, O2, . . . On.
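The ratio r(Q)=|S|/|S′| and the threshold test can be written out directly. In the sketch below the threshold of 0.5 and the convention of returning 1.0 when a source returns no answers are assumptions; the claim only speaks of a "predetermined threshold value". The function names are hypothetical.

```python
THRESHOLD = 0.5  # assumed value; the claim only says "predetermined"

def subsumption_ratio(filtered, received):
    """r = |S| / |S'|, where S is the filtered set and S' the received set."""
    if not received:
        return 1.0  # assumed convention: nothing received, nothing lost
    return len(filtered) / len(received)

def needs_reformulation(filtered, received, threshold=THRESHOLD):
    """True when the translated query returned too many answers that fail the original query."""
    return subsumption_ratio(filtered, received) < threshold

received = ["d1", "d2", "d3", "d4"]  # answers satisfying the translated query
filtered = ["d1"]                    # answers that also satisfy the original query
r = subsumption_ratio(filtered, received)
```

A low ratio means the translated query is much broader than the original one, which is the signal used to trigger reformulation.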
-
8. The method according to claim 1, further comprising:
-
providing the user the answers as a list of selectable document summaries in response to receiving the original query;
analyzing input received from the user in response to the providing the list of selectable document summaries;
said analyzing detecting if the user selects a document, unselects a document, or does not select or unselect a document;
assigning a positive weight to terms in selected documents and a negative weight to terms in unselected documents;
ranking the result by ordering the most relevant documents with the highest positive weight to the least relevant documents with the highest negative weight; and
providing the user the ranked list of results.
-
-
9. The method according to claim 8, further comprising:
-
disregarding all documents from the list that have not been selected by the user;
refining a translated query by eliminating terms from unselected documents which do not appear in selected documents and by adding terms that do appear in selected documents; and
proposing the refined query to the user.
-
-
10. The method according to claim 9, wherein the method further comprises:
-
determining a term relevance list (L) from selected and unselected documents;
determining a source relevance ratio by comparing the documents provided by an interrogated source with the term relevance list (L); and
eliminating those sources with a relevance ratio below a predetermined threshold value.
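The claim does not say how the term relevance list (L) or the source relevance ratio is computed, so the following sketch fills both in with assumptions: L is taken to be the terms that appear in selected documents but not in unselected ones, and a source's ratio is the fraction of its documents that contain at least one term of L. All names and the 0.5 threshold are hypothetical.

```python
def term_relevance_list(selected, unselected):
    """Assumed definition of L: terms in selected documents minus terms in unselected ones."""
    sel = set().union(*(set(d.lower().split()) for d in selected))
    unsel = set().union(*(set(d.lower().split()) for d in unselected))
    return sel - unsel

def source_relevance_ratio(docs, relevant_terms):
    """Assumed ratio: share of a source's documents that mention any term of L."""
    if not docs:
        return 0.0
    hits = sum(1 for d in docs if set(d.lower().split()) & relevant_terms)
    return hits / len(docs)

def keep_sources(answers_by_source, relevant_terms, threshold=0.5):
    """Drop sources whose relevance ratio falls below the threshold."""
    return [s for s, docs in answers_by_source.items()
            if source_relevance_ratio(docs, relevant_terms) >= threshold]

relevant = term_relevance_list(["jaguar car engine"], ["jaguar cat habitat"])
answers_by_source = {
    "autos": ["car reviews", "engine repair", "jaguar cat"],
    "wildlife": ["cat habitat", "jaguar habitat", "car park near zoo"],
}
kept = keep_sources(answers_by_source, relevant)
```

With these inputs the "autos" source scores 2/3 and is kept, while "wildlife" scores 1/3 and is eliminated.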
-
-
11. The method according to claim 10, wherein said proposing the refined query is performed by proposing one of a meta-search query and a set of independent queries to each source.
-
12. The method according to claim 11, further comprising, for information sources that use a Vector-Space Model:
-
determining a first distance for each answer in the answer list with respect to the query;
the query being represented as a point in a multi-dimensional space;
eliminating those documents which have a distance that is greater than a predetermined distance to the query point; and
refining the query by considering terms of those documents which have a distance that is less than the predetermined distance to the query point.
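The distance-based pruning for Vector-Space Model sources might look like the sketch below. The claim does not name a distance measure, so cosine distance over raw term counts is used here purely as an assumption, as is the cut-off of 0.5; the helper names are hypothetical.

```python
import math
from collections import Counter

def vec(text):
    """Term-count vector for a text (assumed representation)."""
    return Counter(text.lower().split())

def cosine_distance(a, b):
    """1 - cosine similarity; 1.0 when either vector is empty."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return 1.0 if norm == 0 else 1.0 - dot / norm

def within_distance(query, docs, max_distance):
    """Keep documents whose distance to the query point does not exceed max_distance."""
    q = vec(query)
    return [d for d in docs if cosine_distance(q, vec(d)) <= max_distance]

query = "meta search"
docs = ["meta search engine", "search engine ranking", "garden tools"]
near = within_distance(query, docs, max_distance=0.5)
# Candidate refinement terms: terms of the nearby documents not already in the query.
candidate_terms = set().union(*(set(vec(d)) for d in near)) - set(vec(query))
```

Only the first document lies within the assumed distance, so "engine" is the one term offered for refining the query.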
-
-
13. The method according to claim 11, further comprising, for information sources that use an Enhanced Boolean Model:
adding at least one additional query term to the original query in order to make the query more selective;
the additional terms being identified using the input received from the user for documents in the list of selectable document summaries.
-
14. The method according to claim 8, further comprising:
-
analyzing user actions on the documents in the list of selectable document summaries;
recording information about the user and about the user's interaction with the list of selectable document summaries;
computing correlation information between a previous query and a refined query relative to the information recorded about the user; and
proposing refinements in a successive query using the correlation information.
-
-
15. A meta-search engine, comprising:
-
an automatic source selection module for receiving an original query with user selected keywords and user selected operators;
the user selected operators defining relationships between the user selected keywords;
the automatic source selection module identifying a set of information sources to be interrogated using the original query by performing one of;
(a) receiving a set of user selected information sources, (b) automatically identifying a set of information sources, and (c) performing a combination of (a) and (b);
the set of information sources identifying two or more information sources;
a query translation module for translating at least one of the user selected operators of the original query that is not supported by one of the information sources in the set of information sources to an alternate operator that is supported by the one of the information sources in the set of information sources;
the query translation module submitting a selected one of the translated queries and the original query to each information source in the set of information sources;
a query filtering module for receiving answers from each information source for the query submitted thereto by the query translation module;
the query filtering module filtering each set of answers received from each information source that satisfy one of the translated queries by removing the answers that do not satisfy the original query;
a silent query analyzer for computing a subsumption ratio for each set of answers filtered by the query filtering module;
the subsumption ratio being a ratio of the number of filtered answers that satisfy the original query to the number of answers that satisfy the translated query; and
a query reformulation module for using each subsumption ratio computed by the silent query analyzer to perform one of;
(d) reformulating a translated query;
(e) modifying information sources in the set of information sources automatically identified by the automatic source selection module; and
(f) performing a combination of (d) and (e).
-
-
16. The meta-search engine according to claim 15, further comprising:
-
a result ranking module for providing to the user the answers from the set of information sources as a list of selectable document summaries;
a feedback module for requesting the user to provide positive feedback and negative feedback to documents in the list of selectable document summaries provided by the result ranking module;
a schema evaluation module for receiving feedback for at least one document in the list of document summaries provided by the result ranking module;
the schema evaluation module computing document relevance for documents in the list of document summaries by performing one of;
(g) decreasing the relevance of a document with terms forming part of documents given negative feedback by the user that do not appear in documents given positive feedback by the user;
(h) increasing the relevance of a document with terms that form part of documents given positive feedback by the user; and
(i) a combination of (g) and (h); and
where the result ranking module provides to the user the answers ranked according to the computed document relevance.
-
-
17. The meta-search engine according to claim 16, further comprising a query history module for learning reformulations made to the query, wherein the reformulations made to the query are used by the query reformulation module to propose additional reformulations to the query.
-
18. The meta-search engine according to claim 15, wherein the query reformulation module substitutes an operator O that is not supported, by an information source in the set of information sources, with at least one supported operator O′ that provides substantially similar answers as the operator O.
-
19. The meta-search engine according to claim 18, wherein the silent query analyzer computes an operator subsumption ratio r(O,O′) of the number of filtered answers S and the number of answers received that satisfy the translated query S′ as follows;
r(O,O′)=|S|/|S′|.
-
20. The meta-search engine according to claim 18, wherein the silent query analyzer computes a query subsumption ratio r(Q) of the number of filtered answers S and the number of answers received that satisfy the translated query S′ as follows;
r(Q)=|S|/|S′|.
-
21. An article of manufacture for use in a machine comprising:
-
a) a memory;
b) instructions stored in the memory for a method for improving search results from a meta-search engine that queries information sources containing document collections, the instructions being machine readable, the method comprising;
receiving an original query with user selected keywords and user selected operators;
the user selected operators defining relationships between the user selected keywords;
identifying a set of information sources to be interrogated using the original query by performing one of;
(a) receiving a set of user selected information sources, (b) automatically identifying a set of information sources, and (c) performing a combination of (a) and (b);
the set of information sources identifying two or more information sources;
translating at least one of the user selected operators of the original query that is not supported by one of the information sources in the set of information sources to an alternate operator that is supported by the one of the information sources in the set of information sources;
submitting a selected one of the translated queries and the original query to each information source in the set of information sources;
receiving answers from each information source for the query submitted thereto;
filtering each set of answers received from each information source that satisfy one of the translated queries by removing the answers that do not satisfy the original query;
for each filtered set of answers, computing a subsumption ratio of the number of filtered answers that satisfy the original query to the number of answers that satisfy the translated query; and
using each computed subsumption ratio to perform one of;
(d) reformulating a translated query;
(e) modifying information sources in the set of information sources automatically identified at (b); and
(f) performing a combination of (d) and (e).
-
Specification