Comparative web search system and method
First Claim
1. A method performing a comparative web search comprising:
providing a meta-search engine in communication with a plurality of web-based search engines;
providing said meta-search engine with a query, a search mode, and a selected set of said web-based search engines, said meta-search engine using said query to search for documents on the selected set of said web-based search engines, wherein the search mode includes the following modes:
(a) a comparison of data sets collected from same or different generic web search engines in response to two or more different queries,
(b) a comparison of data sets collected from same or different generic web search engines in response to a common query performed at different points of time,
(c) a comparison of data sets collected from different generic web search engines in response to a common query performed simultaneously,
(d) a comparison of data sets collected from same or different generic web search engines in response to a common query performed in different languages,
(e) a comparison of result sets collected from same or different generic web search engines in response to a common query where intra-domain similarity between results of the same set is 100% while inter-geographic origin similarity between result sets is zero,
(f) a comparison of result sets collected from same or different generic web search engines in response to a common query where intra-geographic origin similarity between results of same set is 100% while inter-geographic origin similarity between result sets is zero, and
(g) a result set retrieved from a generic web search engine in response to a query is segmented into bins of equi-distant and equi-weighted segments and the segments are compared to generate comparative summaries;
retrieving automatically search results from each of the web-based search engines in the selected set in the form of at least web snippets or documents from each member of the selected set of said web-based search engines and using the search result as raw data;
providing automatically the raw data to a data pre-processing module which automatically removes stop words and HTML tags, and applies a stemming algorithm, resulting in pre-processed data;
providing automatically the pre-processed data to a comparison engine, said comparison engine performing an object level comparison or a thematic level comparison depending on which comparison is specified in the search mode, said comparison resulting in a plurality of result sets from the selected set of said web-based search engines;
determining automatically logical relationships between each of the plurality of result sets and providing a results comparison of the determined logical relationships;
organizing automatically the search results in ranked lists when the object level comparison is performed by the comparison engine and labeled hierarchical clusters when the thematic level comparison is performed by the comparison engine, said organizing resulting in organized search results; and
outputting the results comparison and the organized search results for viewing.
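For illustration only, the pre-processing step and an object-level comparison of two result sets might be sketched as follows. The stop-word list, the `naive_stem` suffix stripper (a stand-in for whatever stemming algorithm is actually used), and all function names are assumptions made for this sketch; the claim does not specify them.

```python
import re
from html import unescape

# Assumed, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "is", "for", "on"}


def naive_stem(word: str) -> str:
    """Crude suffix stripper standing in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(snippet: str) -> list[str]:
    """Remove HTML tags and stop words, then stem the remaining tokens."""
    text = unescape(re.sub(r"<[^>]+>", " ", snippet))
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


def compare_result_sets(a: list[str], b: list[str]) -> dict[str, list[str]]:
    """Object-level comparison: shared and unique items, kept in rank order."""
    set_a, set_b = set(a), set(b)
    return {
        "common": [u for u in a if u in set_b],
        "only_a": [u for u in a if u not in set_b],
        "only_b": [u for u in b if u not in set_a],
    }
```

Here the "logical relationships" between result sets are read as plain set intersection and difference over result identifiers such as URLs, which is one possible interpretation of the claim language.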
1 Assignment
0 Petitions
Abstract
A system and method for comparative web search engines, search result summarization, web snippet processing, comparison analysis, information visualization, meta-clustering, and quantitative evaluation of web snippet quality are disclosed. The present invention extends the capabilities of web searching and information retrieval by providing a succinct comparative summary of search results at either the object or thematic levels.
25 Citations
19 Claims
1. A method performing a comparative web search, recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
17. A method performing a comparative web search comprising:
providing a meta-search engine in communication with a plurality of web-based search engines;
providing said meta-search engine with a query, a search mode, and a selected set of said web-based search engines, said meta-search engine using said query to search for documents on the selected set of said web-based search engines;
retrieving automatically search results from each of the web-based search engines in the selected set in the form of at least web snippets or documents from each member of the selected set of said web-based search engines and using the search result as raw data;
providing automatically the raw data to a data pre-processing module which automatically removes stop words and HTML tags, and applies a stemming algorithm, resulting in pre-processed data;
providing automatically the pre-processed data to a comparison engine, said comparison engine performing an object level comparison or a thematic level comparison depending on which comparison is specified in the search mode, said comparison resulting in a plurality of result sets from the selected set of said web-based search engines, wherein when the thematic level comparison is specified in the search mode, said method further comprises generating automatically a set of base clusters from the raw data collected from each member of the selected set of the web-based search engines, performing homogeneous clustering to identify overlap in keywords between two base clusters belonging to the same result set, and performing heterogeneous clustering to identify overlap in keywords between two clusters belonging to different result sets;
determining automatically logical relationships between each of the plurality of result sets and providing a results comparison of the determined logical relationships;
organizing automatically the search results in ranked lists when the object level comparison is performed by the comparison engine and labeled hierarchical clusters when the thematic level comparison is performed by the comparison engine, said organizing resulting in organized search results; and
outputting the results comparison and the organized search results for viewing.
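The homogeneous and heterogeneous clustering steps recited in claim 17 can be pictured with a small sketch. The `BaseCluster` type, its `source` field, and the pairwise keyword intersection are assumptions chosen for illustration; the claim only requires that keyword overlap be identified within one result set and across different result sets.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class BaseCluster:
    source: str               # hypothetical: name of the result set it came from
    keywords: frozenset[str]  # keywords associated with the cluster


def keyword_overlaps(clusters: list[BaseCluster]):
    """Split overlapping cluster pairs into same-set and cross-set groups."""
    homogeneous, heterogeneous = [], []
    for c1, c2 in combinations(clusters, 2):
        shared = c1.keywords & c2.keywords
        if not shared:
            continue  # no keyword overlap, nothing to record
        target = homogeneous if c1.source == c2.source else heterogeneous
        target.append((c1, c2, shared))
    return homogeneous, heterogeneous


clusters = [
    BaseCluster("engineA", frozenset({"jaguar", "car"})),
    BaseCluster("engineA", frozenset({"car", "dealer"})),
    BaseCluster("engineB", frozenset({"jaguar", "cat"})),
]
homo, hetero = keyword_overlaps(clusters)
```

In this toy input the two `engineA` clusters overlap on "car" (homogeneous), while the first `engineA` cluster and the `engineB` cluster overlap on "jaguar" (heterogeneous).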
18. A method performing a comparative web search comprising:
providing a meta-search engine in communication with a plurality of web-based search engines;
providing said meta-search engine with a query, a search mode, and a selected set of said web-based search engines, said meta-search engine using said query to search for documents on the selected set of said web-based search engines;
retrieving automatically search results from each of the web-based search engines in the selected set in the form of at least web snippets or documents from each member of the selected set of said web-based search engines and using the search result as raw data;
providing automatically the raw data to a data pre-processing module which automatically removes stop words and HTML tags, and applies a stemming algorithm, resulting in pre-processed data;
providing automatically the pre-processed data to a comparison engine, said comparison engine performing an object level comparison or a thematic level comparison depending on which comparison is specified in the search mode, said comparison resulting in a plurality of result sets from the selected set of said web-based search engines;
determining automatically logical relationships between each of the plurality of result sets and providing a results comparison of the determined logical relationships;
organizing automatically the search results in ranked lists when the object level comparison is performed by the comparison engine and labeled hierarchical clusters when the thematic level comparison is performed by the comparison engine, said organizing resulting in organized search results;
finding at least two clusters with maximum overlap between their associated keywords and merging the keywords if one of the at least two clusters is a subset of ones of the at least two clusters, and wherein if there is a tie between the at least two clusters in regards to sharing the same number of overlapping keywords, then employing a tie breaking technique to break the tie therebetween for determining which keyword terms to use for cluster labeling of the at least two base clusters with maximum overlapping, said tie breaking technique comprises comparing frequency scores of each of the at least two base clusters, wherein a highest frequency score breaks the tie, and if the frequency scores are unable to break the tie, then checking if the keywords of each of the at least two clusters is a subset of at least one cluster's keywords and if still no resolution to the tie, skipping the merging step and labeling each of the at least two clusters with the associated keywords; and
outputting the results comparison and the organized search results for viewing.
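One possible reading of the tie-breaking cascade in claim 18 is sketched below. The `Cluster` type, the integer `frequency` score, and the choice to return either one label or two are assumptions for illustration; the claim language leaves room for other interpretations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    keywords: frozenset[str]
    frequency: int  # hypothetical frequency score of the cluster


def label_tied_pair(c1: Cluster, c2: Cluster) -> list[frozenset[str]]:
    """Return one label if the tie is resolved, otherwise one label per cluster."""
    # Step 1: the cluster with the highest frequency score wins the tie.
    if c1.frequency != c2.frequency:
        winner = c1 if c1.frequency > c2.frequency else c2
        return [winner.keywords]
    # Step 2: if one keyword set is a subset of the other, merge the keywords.
    if c1.keywords <= c2.keywords or c2.keywords <= c1.keywords:
        return [c1.keywords | c2.keywords]
    # Step 3: still unresolved, so skip merging and label each cluster separately.
    return [c1.keywords, c2.keywords]
```

Under this reading, two clusters with equal frequency scores and partially overlapping keywords keep their own labels, while a cluster whose keywords are contained in the other's is absorbed into a single merged label.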
19. A method performing a comparative search comprising:
providing a search engine in communication with a plurality of information sources;
providing said search engine with a query, a search mode, and a selected set of said information sources, said search engine using said query to search for documents on the selected set of said information sources, wherein the search mode includes the following modes:
(a) a comparison of data sets collected from same or different information sources in response to two or more different queries,
(b) a comparison of data sets collected from same or different information sources in response to a common query performed at different points of time,
(c) a comparison of data sets collected from different information sources in response to a common query performed simultaneously,
(d) a comparison of data sets collected from same or different information sources in response to a common query performed in different languages,
(e) a comparison of result sets collected from same or different information sources in response to a common query where intra-domain similarity between results of the same set is 100% while inter-geographic origin similarity between result sets is zero,
(f) a comparison of result sets collected from same or different information sources in response to a common query where intra-geographic origin similarity between results of same set is 100% while inter-geographic origin similarity between result sets is zero, and
(g) a result set retrieved from an information source in response to a query is segmented into bins of equi-distant and equi-weighted segments and the segments are compared to generate comparative summaries;
retrieving automatically search results from each of the information sources in the selected set in the form of at least snippets or documents from each member of the selected set of said information sources and using the search result as raw data;
providing automatically the raw data to a data pre-processing module which automatically removes stop words and HTML tags, and applies a stemming algorithm, resulting in pre-processed data;
providing automatically the pre-processed data to a comparison engine, said comparison engine performing an object level comparison or a thematic level comparison depending on which comparison is specified in the search mode, said comparison resulting in a plurality of result sets from the selected set of said information sources;
determining automatically logical relationships between each of the plurality of result sets and providing a results comparison of the determined logical relationships;
organizing automatically the search results in ranked lists when the object level comparison is performed by the comparison engine and labeled hierarchical clusters when the thematic level comparison is performed by the comparison engine, said organizing resulting in organized search results; and
outputting the results comparison and the organized search results for viewing.
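Search mode (g) segments a single ranked result set into bins and compares the segments. A minimal sketch follows, assuming that "equi-distant and equi-weighted" can be approximated by contiguous rank bins of equal size and that segments are compared by the terms they contain. Both assumptions, and the function names, are illustrative only.

```python
def segment_by_rank(results: list[str], bin_size: int) -> list[list[str]]:
    """Cut a ranked list into contiguous bins of bin_size items each."""
    return [results[i:i + bin_size] for i in range(0, len(results), bin_size)]


def compare_segments(segments: list[list[str]]):
    """Return terms shared by all segments and terms unique to each segment."""
    term_sets = [set(" ".join(seg).lower().split()) for seg in segments]
    shared = set.intersection(*term_sets) if term_sets else set()
    unique = []
    for i, terms in enumerate(term_sets):
        # Union of the terms found in every other segment.
        others = set().union(*(t for j, t in enumerate(term_sets) if j != i))
        unique.append(sorted(terms - others))
    return sorted(shared), unique


ranked = ["jaguar car", "jaguar dealer", "jaguar cat", "jaguar habitat"]
shared, unique = compare_segments(segment_by_rank(ranked, 2))
```

With this toy input, the top and bottom halves of the ranking share only the query term, and the unique terms of each half give a rough comparative summary of how the topic shifts with rank.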
Specification