Utilizing information redundancy to improve text searches
Abstract
Architecture for improving text searches using information redundancy. A search component is coupled with an analysis component to rerank documents returned in a search according to redundancy values. Each returned document is used to develop a corresponding word probability distribution that is further used to rerank the returned documents according to the associated redundancy values. In another aspect thereof, the query component is coupled with a projection component to project answer redundancy from one document search to another. This includes obtaining the benefit of considerable answer redundancy from a second data source by projecting the success of the search of the second data source against a first data source.
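The projection aspect described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the toy `search` function, the `overlap` measure (shared unigram probability mass), and the choice to average over all second-dataset results. The sketch runs one query against both datasets and reorders the first-dataset results by how much they agree with the second-dataset results.

```python
from collections import Counter


def search(query, dataset):
    """Toy retrieval (hypothetical): documents containing at least one query term."""
    terms = set(query.lower().split())
    return [doc for doc in dataset if terms & set(doc.lower().split())]


def overlap(a, b):
    """Assumed redundancy measure: probability mass shared by two unigram distributions."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    na, nb = sum(ca.values()), sum(cb.values())
    return sum(min(ca[w] / na, cb[w] / nb) for w in ca if w in cb)


def project_and_rerank(query, first, second):
    """Rerank first-dataset hits by average redundancy with second-dataset hits."""
    first_hits = search(query, first)
    second_hits = search(query, second)
    if not second_hits:
        return first_hits  # nothing to project; keep the original order

    def score(doc):
        return sum(overlap(doc, ref) for ref in second_hits) / len(second_hits)

    return sorted(first_hits, key=score, reverse=True)


if __name__ == "__main__":
    first = ["everest facts and trivia", "everest is 8849 metres high"]
    second = [
        "mount everest is 8849 metres high",
        "everest height is 8849 metres",
        "the weather today",
    ]
    for doc in project_and_rerank("everest height", first, second):
        print(doc)
```

In this toy run, the first-dataset document that repeats the answer found across the larger second dataset moves to the top, which is the effect the abstract attributes to projecting answer redundancy.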
20 Claims
1. A system that facilitates data retrieval, comprising:
- a query component that executes a query to a first dataset; and
- a projection component that executes the query across a second dataset, and analyzes properties of results of the query on the first dataset and results of the second dataset to generate a ranked result set of the query to the first dataset.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method of facilitating data retrieval, comprising:
- receiving a query for processing by a search engine against a first dataset;
- executing the query against the first dataset and a second dataset;
- analyzing properties of results of the second dataset query against results of the first dataset query to determine information redundancy;
- reranking the results of the first dataset query according to the information redundancy.
Dependent claims: 12, 13, 14, 15, 16, 17
18. The method of claim 18, the subset of the results of the second dataset query is determined based upon at least one of selecting the first one hundred results, selecting results based upon the success of the returned document including more than one of the search terms, selecting results based upon the inclusion of at least two key search terms of multiple search terms, selecting results based upon including a string of search terms in the required sequence, selecting results based upon including the search terms within a required spatial parameter, selecting results based upon properties of at least one of image content and audio contained therein, and selecting results based upon at least one of the number and type of hyperlinks to other websites.
19. A method of facilitating data retrieval, comprising:
- processing a query against a plurality of documents;
- measuring information redundancy of a returned document of a return set by determining an average pairwise information redundancy value between the returned document and the remaining documents of the return set; and
- providing a ranked output of documents according to corresponding pairwise information redundancy values.
Dependent claims: 20
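The measurement step of claim 19 can be sketched as follows. The claim does not say how a pairwise information redundancy value is computed, so the measure used here (the probability mass shared by two unigram word distributions) is an assumption chosen for simplicity, and the function names are hypothetical. Each returned document is scored by its average pairwise value against the remaining documents of the return set, and the set is then ranked by that score.

```python
from collections import Counter


def word_distribution(text):
    """Unigram relative frequencies for one document."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}


def pairwise_redundancy(p, q):
    """Assumed measure: probability mass shared by two distributions, in [0, 1]."""
    return sum(min(prob, q[word]) for word, prob in p.items() if word in q)


def rank_by_redundancy(documents):
    """Return (index, score) pairs, highest average pairwise redundancy first."""
    dists = [word_distribution(d) for d in documents]
    scores = []
    for i, p in enumerate(dists):
        # Compare document i with every other document of the return set.
        others = [pairwise_redundancy(p, q) for j, q in enumerate(dists) if j != i]
        scores.append(sum(others) / len(others) if others else 0.0)
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    returned = [
        "paris is the capital of france",
        "the capital of france is paris",
        "bananas are yellow fruit",
    ]
    for index, score in rank_by_redundancy(returned):
        print(index, round(score, 3), returned[index])
```

A document that shares no vocabulary with the rest of the return set scores zero and falls to the bottom of the ranking. Any other similarity or divergence between word distributions could be substituted for `pairwise_redundancy` without changing the averaging and ranking steps.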
Specification