Information search device and information search method using topic-centric query routing
Abstract
An information search device capable of selecting topic search engines (i.e., search engines that focus on specific topics) that are appropriate to a user's search keywords when searching the Web on the Internet. Terms having relevance to each topic search engine are collected from, for example, the Web, and a DB selection index for selecting search engines is produced in advance by an index generator. When a search keyword is supplied from a user, terms having relevance to the search keyword are acquired from a general-purpose Web search engine by means of a query expansion unit. The thus-acquired terms are matched with terms stored in the DB selection index, and topic search engines having a high incidence of matching are presented to the user.
160 Citations
9 Claims
1. An information search device for selecting a search engine, comprising:
a relevant term collector for collecting, from web pages having links pointing to topic search engines and from topic search engine pages, a relevant term that describes a topic and/or content that is handled by each search engine;
an index generator for producing a search engine selection index from said collected relevant terms;
a search engine selection index storage unit for storing said search engine selection index;
a query expansion unit for obtaining an expanded term that is relevant to a search keyword submitted by a user to a general purpose search engine;
an expanded term storage unit for storing the expanded term obtained by said query expansion unit; and
an engine selector for calculating goodness-of-fit between said search keyword and each search engine based on information that is stored in said search engine selection index storage unit and said expanded term storage unit, and for selecting, based on said goodness-of-fit, a search engine that is relevant to said search keyword, wherein said goodness-of-fit is calculated by [equation not reproduced in this text], where f(x,y) is a function that is 1 when character strings x and y are equal and 0 when character strings x and y are not equal, E_i are search engines, G_c are groups comprising search keywords associated with a topic, W_ik are degrees of importance, C_cg are frequencies of occurrence, s_ik are relevant terms, and x_cg are expanded terms.
a reference character string storage unit for storing a character string in a document obtained as a search result from a general-purpose Web search engine when the search keyword submitted by the user is sent to said general-purpose Web search engine; and
a phrase generator for generating a phrase that explains a topic that is relevant to said search keyword based on information stored in said reference character string storage unit and said expanded term storage unit.
4. An information search device according to claim 1, further comprising:
a reference character string storage unit for storing a character string in a document obtained as a search result from a general-purpose Web search engine when the search keyword submitted by the user is sent to said general-purpose Web search engine; and
a phrase generator for generating a phrase that explains a topic that is relevant to said search keyword based on information stored in said reference character string storage unit and said expanded term storage unit.
5. An information search device according to claim 1, wherein said search engine selection index contains a relevant term for each search engine and a degree of importance for each relevant term, the degree of importance of each relevant term being determined according to the frequency of that relevant term.
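The index structure recited in claim 5 can be illustrated with a minimal sketch. The claim says only that the degree of importance is determined according to term frequency; the relative-frequency weighting below, the function name `build_selection_index`, and the sample engine names are assumptions made for illustration and are not taken from the patent.

```python
from collections import Counter


def build_selection_index(collected_terms):
    """Map each engine to {relevant term: degree of importance}.

    collected_terms: dict of engine name -> list of terms gathered for it.
    ASSUMPTION: importance is the term's relative frequency among the
    terms collected for that engine; the claim does not fix a formula.
    """
    index = {}
    for engine, terms in collected_terms.items():
        counts = Counter(term.lower() for term in terms)
        total = sum(counts.values())
        index[engine] = {term: n / total for term, n in counts.items()}
    return index


# Hypothetical collected terms for two hypothetical topic search engines.
collected = {
    "recipes.example": ["recipe", "cooking", "recipe", "dessert"],
    "cars.example": ["car", "engine", "car", "car"],
}
index = build_selection_index(collected)
print(index["recipes.example"]["recipe"])  # 0.5
print(index["cars.example"]["car"])        # 0.75
```

Other frequency-based weightings (for example, ones that discount terms shared by many engines) would fit the claim language equally well.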
6. An information search device according to claim 1, wherein said query expansion unit extracts a relevant term having a high degree of importance from relevant terms that are stored in said search engine selection index storage unit and acquires a term that is relevant to said search keyword by preferentially checking relevance between extracted terms and the search keyword that has been submitted by the user.
7. An information search method that presents to a user a topic search engine that is relevant to a search keyword that has been submitted by said user, comprising the steps of:
acquiring, for each topic search engine that is present on the Web, a term that is relevant to content of the topic search engine from a Web page of the topic search engine itself;
matching the term that is relevant to each topic search engine with the term that has been acquired by query expansion;
matching said acquired term with said search keyword; and
presenting to the user a topic search engine corresponding to the term determined by matching to have high goodness-of-fit, wherein said goodness-of-fit is calculated by [equation not reproduced in this text], where f(x,y) is a function that is 1 when character strings x and y are equal and 0 when character strings x and y are not equal, E_i are search engines, G_c are groups comprising search keywords associated with a topic, W_ik are degrees of importance, C_cg are frequencies of occurrence, s_ik are relevant terms, and x_cg are expanded terms.
8. An information search method that presents to a user a topic search engine that is relevant to a search keyword that has been submitted by said user, comprising the steps of:
acquiring a term that is relevant to content of a topic search engine that is present on the Web from another Web page that has a hyperlink pointing to that topic search engine;
acquiring a term that is relevant to said search keyword by means of query expansion;
matching said acquired term with said search keyword; and
presenting to the user a topic search engine that corresponds to the term determined by matching to have high goodness-of-fit, wherein said goodness-of-fit is calculated by [equation not reproduced in this text], where f(x,y) is a function that is 1 when character strings x and y are equal and 0 when character strings x and y are not equal, E_i are search engines, G_c are groups comprising search keywords associated with a topic, W_ik are degrees of importance, C_cg are frequencies of occurrence, s_ik are relevant terms, and x_cg are expanded terms.
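The first step of claim 8, acquiring terms from another Web page that links to a topic search engine, might be sketched as follows. This is one possible reading only: it assumes that the anchor text of the hyperlink is the source of the terms, that the link target is matched by exact string comparison, and that terms are split on whitespace. The class name `AnchorTextCollector` and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class AnchorTextCollector(HTMLParser):
    """Collect the words of anchor text on links pointing to one engine."""

    def __init__(self, engine_url):
        super().__init__()
        self.engine_url = engine_url
        self._inside_link = False
        self.terms = []

    def handle_starttag(self, tag, attrs):
        # ASSUMPTION: a link "points to" the engine only on an exact href match.
        if tag == "a" and dict(attrs).get("href") == self.engine_url:
            self._inside_link = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._inside_link = False

    def handle_data(self, data):
        if self._inside_link:
            self.terms.extend(data.lower().split())


# Hypothetical referring page with one link to the engine of interest.
page = (
    '<p>Try the <a href="http://recipes.example/">Dessert Recipe Search</a> '
    'or <a href="http://other.example/">news</a>.</p>'
)
collector = AnchorTextCollector("http://recipes.example/")
collector.feed(page)
collector.close()
print(collector.terms)  # ['dessert', 'recipe', 'search']
```

A fuller collector could also take text surrounding the link or the page title; the claim does not say which parts of the referring page are used.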
9. An information search method, comprising the steps of:
sending a search keyword that has been submitted by a user to a general-purpose search engine;
extracting a phrase from information returned by the general-purpose search engine;
generating an index of search engines;
calculating a degree of importance of said extracted phrase in relation to the search keyword;
selecting a phrase having a highest degree of importance as a phrase that explains said search keyword;
matching a search engine from the index of search engines with the selected phrase; and
presenting to said user the selected phrase together with the search engine, wherein in the step of calculating a degree of importance, the degree of importance is based on the goodness-of-fit between the keyword and the search engine, said goodness-of-fit calculated by [equation not reproduced in this text], where f(x,y) is a function that is 1 when character strings x and y are equal and 0 when character strings x and y are not equal, E_i are search engines, G_c are groups comprising search keywords associated with a topic, W_ik are degrees of importance, C_cg are frequencies of occurrence, s_ik are relevant terms, and x_cg are expanded terms.
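The quantities named in the goodness-of-fit clause can be combined in a short sketch. The claimed equation itself does not appear in this text, so the aggregation below (a plain sum of W_ik * C_cg * f(s_ik, x_cg) over all groups and terms) is an assumption chosen for illustration and should not be read as the patented formula. The function names, data layout, and sample values are likewise hypothetical.

```python
def f(x, y):
    """Indicator from the claims: 1 if the strings are equal, else 0."""
    return 1 if x == y else 0


def goodness_of_fit(engine_terms, groups):
    """Score one engine E_i against the expanded terms.

    engine_terms: dict mapping relevant term s_ik -> importance W_ik.
    groups: list of groups G_c, each a dict mapping expanded term
            x_cg -> frequency of occurrence C_cg.
    ASSUMPTION: contributions are simply summed; the real equation
    is not available in this text.
    """
    score = 0.0
    for group in groups:
        for x_cg, c_cg in group.items():
            for s_ik, w_ik in engine_terms.items():
                score += w_ik * c_cg * f(s_ik, x_cg)
    return score


def select_engines(index, groups, top_n=1):
    """Return up to top_n engines with a positive score, best first."""
    scored = [(goodness_of_fit(terms, groups), engine)
              for engine, terms in index.items()]
    scored = [(s, e) for s, e in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [engine for _, engine in scored[:top_n]]


# Hypothetical selection index and expanded-term groups.
index = {
    "recipes.example": {"recipe": 0.5, "cooking": 0.25, "dessert": 0.25},
    "cars.example": {"car": 0.75, "engine": 0.25},
}
groups = [{"recipe": 3, "dessert": 1}, {"engine": 1}]
print(goodness_of_fit(index["recipes.example"], groups))  # 1.75
print(select_engines(index, groups))  # ['recipes.example']
```

Because f is an exact string match, only expanded terms that literally appear in an engine's index entry contribute to its score.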
Specification