Domain specific knowledge-based metasearch system and methods of using
Abstract
A system and method for performing domain-specific knowledge-based metasearches. A metasearch engine is provided for accessing and searching text-based documents using generic search engines while simultaneously being able to access publication-based databases and sequence databases as well as in-house proprietary databases and any database capable of being interfaced with a web interface so as to produce search results in text format. A data mining module is also provided for organizing the raw data obtained, by unsupervised clustering, simple relevance ranking, and categorization, all of which are done independently of one another. The system is capable of storing previous search data for use in query refinement or subsequent searches based upon the stored data. A search results collection browser may be provided for analyzing current browsing patterns of the user for developing weighting factors to be used in ordering the results of future searches.
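The flow summarized in the abstract (query a selected set of engines, collect the raw results, then prepare a single list without unreachable documents) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `Document` record, the engine-as-callable interface, and the `is_reachable` predicate are assumptions made for illustration. Dropping repeated URLs is likewise an added assumption; the claims only recite eliminating documents not reachable via the web.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Document:
    url: str
    title: str
    text: str
    source: str  # name of the engine that returned the document


# Hypothetical interface: an engine is any callable mapping a query to documents.
Engine = Callable[[str], Iterable[Document]]


def fetch_raw_results(query: str, engines: Dict[str, Engine],
                      selected: List[str]) -> List[Document]:
    """Query only the user-selected engines and concatenate their raw results."""
    raw: List[Document] = []
    for name in selected:
        raw.extend(engines[name](query))
    return raw


def prepare_single_list(raw: Iterable[Document],
                        is_reachable: Callable[[str], bool]) -> List[Document]:
    """Merge raw results into one list, skipping unreachable documents.

    Skipping repeated URLs is an assumption of this sketch, not a claimed step.
    """
    seen = set()
    merged: List[Document] = []
    for doc in raw:
        if doc.url in seen or not is_reachable(doc.url):
            continue
        seen.add(doc.url)
        merged.append(doc)
    return merged
```

In a real deployment `is_reachable` would probably issue an HTTP request per URL; passing it in as a function keeps the sketch free of network access.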
52 Claims
-
1. A method of performing a domain-specific metasearch and obtaining search results therefrom, said method comprising the steps of:
-
providing a metasearch engine capable of accessing generic, web-based search engines and domain-relevant search engines;
receiving a query inputted by a user to the metasearch engine and searching for documents on a selected set of said generic, web-based search engines and domain-relevant search engines which are relevant to the query;
fetching raw data search results in the form of text documents from each member of the selected set;
displaying the raw data on a user interface;
supplying the raw data to a data mining module, wherein the data mining module forms clusters of related documents according to an unsupervised clustering procedure and wherein the data mining module, upon receiving the raw data, processes the raw data, independently of the unsupervised clustering procedure, and prepares a single list of all the documents, after eliminating documents not reachable via the web; and
displaying the clusters of related documents on the user interface. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28)
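Claim 1 recites an unsupervised clustering procedure without fixing a particular algorithm in this section. As one possible illustration only, the sketch below groups documents with a greedy single-pass scheme over bag-of-words cosine similarity. The algorithm choice, the function names, and the `threshold` value are assumptions, not the patented procedure.

```python
import math
import re
from collections import Counter
from typing import List


def vectorize(text: str) -> Counter:
    """Bag-of-words term counts over lowercase alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def cluster(texts: List[str], threshold: float = 0.3) -> List[List[int]]:
    """Greedy single-pass clustering (an assumed, illustrative procedure).

    Each document joins the first cluster whose seed document is at least
    `threshold` similar; otherwise it starts a new cluster. No labels or
    predefined categories are used, so the grouping is unsupervised.
    """
    vectors = [vectorize(t) for t in texts]
    clusters: List[List[int]] = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

The function returns clusters as lists of document indices, so the same raw data can still be listed or categorized independently of the clustering step, as the claim requires.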
-
-
29. A method of performing a life science-specific domain-specific metasearch and obtaining search results therefrom, the method comprising the steps of:
-
providing a metasearch engine capable of accessing generic, web-based search engines, publication sites, sequences sites, protein structure databases and pathway information databases;
receiving a query inputted by a user to the metasearch engine and searching for documents on a selected set of the generic, web-based search engines, publications sites, sequences sites, protein structure databases and pathway information databases which are relevant to the query;
fetching raw data search results in the form of text documents from each member of the selected set;
displaying the raw data search results on a user interface;
supplying the raw data to a data mining module specifically for the life sciences, wherein the data mining module prepares a single list of all of the documents, after eliminating documents not reachable via the web, and assigns simple relevance scores to the documents prepared in the single list;
forms clusters of related documents according to an unsupervised clustering procedure; and
categorizes the documents so that each document is assigned to one of a predefined number of categories; and
displaying the documents in a format defined by the single list, in a format defined by the clusters, and in a format defined by the categories on the user interface so that a user can choose to browse the documents according to the list format, cluster format or categories format.
-
-
30. A method of performing a domain-specific metasearch and obtaining search results therefrom, said method comprising the steps of:
-
providing a metasearch engine capable of accessing generic, web-based search engines and domain-relevant search engines;
receiving a query inputted by a user to the metasearch engine and searching for documents on a selected set of said generic, web-based search engines and domain-relevant search engines which are relevant to the query;
fetching raw data search results in the form of text documents from each member of the selected set;
supplying the raw data to a data mining module, wherein the data mining module forms clusters of related documents according to an unsupervised clustering procedure, wherein the data mining module, upon receiving the raw data, processes the raw data, independently of the unsupervised clustering procedure, and prepares a single list of all of the documents, after eliminating documents not reachable via the web and wherein the data mining module categorizes the documents so that each document is assigned to one of a predefined number of categories; and
displaying the documents in a format defined by the clusters, and in a format defined by the categories on a user interface so that a user can choose to browse the documents according to the cluster format or the categories format. - View Dependent Claims (31)
-
-
32. A computer system for searching both general and domain-specific information resources simultaneously pursuant to a user query and for obtaining organized search results therefrom, the system comprising:
-
a metasearch engine capable of accessing a plurality of sites including generic, web-based search engines and domain-relevant search engines, for receiving documents from said plurality of sites in response to the user query;
means for selecting particular search engines from a plurality of generic, web-based search engines and domain-relevant search engines that are presented to a user;
means for displaying the received documents to the user;
means for assembling the received documents from the plurality of sites searched by the selected particular search engines into a single list after eliminating documents not reachable via the web;
means for assigning relevance ranks to the received documents in the single list and for organizing the documents in the single list according to said relevance ranks;
means for clustering the received documents into clusters according to an unsupervised clustering procedure;
and means for displaying said single list and said clusters to the user. - View Dependent Claims (33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45)
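Claim 32 calls for assigning relevance ranks to the documents in the single list and ordering the list by those ranks. The scoring rule is not defined in this section, so the sketch below assumes a very simple one: count how often the query terms occur in each document. The function names and the scoring rule are hypothetical.

```python
import re
from typing import List, Tuple


def relevance_score(query: str, text: str) -> int:
    """Count occurrences of query terms in the text (assumed scoring rule)."""
    terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return sum(1 for tok in tokens if tok in terms)


def rank(query: str, docs: List[str]) -> List[Tuple[int, str]]:
    """Return (score, document) pairs ordered from most to least relevant.

    Python's sort is stable, so documents with equal scores keep the order
    in which they appeared in the single list.
    """
    scored = [(relevance_score(query, d), d) for d in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```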
-
-
46. A computer system for searching both general and domain-specific information resources simultaneously pursuant to a user query and for obtaining organized search results therefrom, the system comprising:
-
a metasearch engine capable of accessing a plurality of sites including generic, web-based search engines and domain-relevant search engines for receiving documents from said plurality of sites in response to the user query;
means for selecting particular search engines from a plurality of generic, web-based search engines and domain-relevant search engines that are presented to a user;
means for clustering the received documents into clusters according to an unsupervised clustering procedure;
means for preparing a single list of all the documents, independently of said forming clusters, after eliminating documents not reachable via the web;
means for categorizing the received documents, so that each document is assigned to one of a predefined number of categories; and
means for displaying said clusters, said categories and said documents assigned thereto to the user. - View Dependent Claims (47, 48)
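Claim 46 requires that each document be assigned to one of a predefined number of categories. A keyword-matching categorizer is one simple way to picture this. The category names, keyword sets, and the fallback category "other" below are all hypothetical, chosen only to echo the life-science sources mentioned in claim 29.

```python
import re
from typing import Dict, Set

# Hypothetical predefined categories with illustrative keyword sets.
CATEGORIES: Dict[str, Set[str]] = {
    "publications": {"journal", "abstract", "article"},
    "sequences": {"sequence", "nucleotide", "genbank"},
    "structures": {"structure", "crystal", "pdb"},
}
DEFAULT_CATEGORY = "other"  # assumed fallback so every document gets one category


def categorize(text: str, categories: Dict[str, Set[str]] = CATEGORIES) -> str:
    """Assign the text to exactly one category: the one with the most keyword hits.

    Ties go to the category listed first; texts with no hits fall back to
    DEFAULT_CATEGORY.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    best, best_hits = DEFAULT_CATEGORY, 0
    for name, keywords in categories.items():
        hits = sum(1 for tok in tokens if tok in keywords)
        if hits > best_hits:
            best, best_hits = name, hits
    return best
```

Because the categories are fixed in advance, this step differs from the unsupervised clustering recited in the same claim, where the groups emerge from the documents themselves.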
-
-
49. A computer readable medium carrying one or more sequences of instructions from a user of a computer system for searching both general and domain-specific information resources simultaneously to obtain organized search results therefrom, wherein execution of the one or more sequences of instructions by one or more processors causes the one or more processors to perform the steps of:
-
receiving a query inputted by the user and receiving instructions as to which databases to access;
accessing selected sites using generic, web-based search engines and domain-relevant search engines, based upon said instructions received from the user, and searching for documents on the selected sites, which are relevant to said query;
fetching raw data search results in the form of text documents from each of the selected sites;
displaying said raw data on a user interface;
forming clusters of related documents from said raw data, according to an unsupervised clustering procedure and processing said raw data, independently of said unsupervised clustering procedure, and categorizing said documents so that each document is assigned to one of a predefined number of categories; and
displaying said clusters of related documents on the user interface. - View Dependent Claims (50, 51, 52)
-
Specification