Building and using subwebs for focused search
Abstract
A system that facilitates performance of a focused search over a collection of sites comprises a subweb that corresponds to a topic and/or user characteristic(s) of interest to the user. The subweb includes a plurality of domains and/or paths (e.g., sites) that are related to the topic and/or the user characteristic(s). Each of the sites within the subweb is assigned a weight that indicates the relevance of the site to the desired topic and/or user characteristic(s). A search engine employs the subweb to focus a search over a collection of sites. The search engine receives a query and utilizes the subweb to focus the search on the sites corresponding to the topic and/or user characteristic(s) represented by the subweb. The results of the search are returned to the user based at least in part upon the relevance weights assigned to the sites within the subweb.
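The abstract says results are returned based at least in part on the relevance weights, and the claims speak of a combined rank built from the search engine's rank and the relevance weight. The following is a minimal sketch of one way such a blend could look. The linear formula, the `alpha` mixing factor, and the function names are assumptions for illustration; the source does not give a formula.

```python
from urllib.parse import urlparse


def site_of(url):
    """Return the domain part of a URL, used here as the subweb key."""
    return urlparse(url).netloc.lower()


def combined_rank(results, subweb_weights, alpha=0.5):
    """Blend engine scores with subweb relevance weights.

    results: list of (url, engine_score) pairs, scores assumed in [0, 1].
    subweb_weights: dict mapping domain -> relevance weight in [0, 1].
    alpha: hypothetical mixing factor (not taken from the source).
    Sites outside the subweb get a relevance weight of 0.
    """
    scored = []
    for url, engine_score in results:
        weight = subweb_weights.get(site_of(url), 0.0)
        scored.append((url, (1 - alpha) * engine_score + alpha * weight))
    # Highest combined score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


results = [("http://a.example/x", 0.9), ("http://b.example/y", 0.6)]
ranked = combined_rank(results, {"b.example": 1.0})
```

In this sketch a site with a lower engine score can still be ordered first when it carries a high relevance weight in the selected subweb.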
40 Claims
1. A system that facilitates searching, comprising:
a display component that presents a plurality of subwebs to search on over a collection of items, the plurality of subwebs respectively are associated with a plurality of subsets of the items, wherein each of the items is weighted by relevance and displayed in an order based in part on distribution paths associated with each of the items, and wherein each item of a respective subweb of the plurality of subwebs is given a specified priority based on an amount of usage and an item of a specified priority level is crawled at a higher frequency than items of a priority level that is lower than the specified priority level or items that are not associated with a subweb;
a search component that searches the collection of items based in part on a received search-word query and ranks each item of the collection of items;
a subweb selector component that selects at least one subweb of the plurality of subwebs based in part on the search-word query received by the searching component, wherein each item returned by the search-word query is assigned a combined rank and correspondingly ordered among other returned items based in part on the rank assigned by the search component and the relevance weight assigned to a respective item that is based in part on distribution paths associated with the respective item, the distribution paths determined based in part on a number of instances the respective item was returned as a result of disparate search-word queries, and inlinks and outlinks related to the respective item; and
an input component that receives the search-word query over at least one of the subwebs.
Dependent claims: 2, 3, 4, 5, 6, 7
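Claim 1 ties crawl frequency to a usage-based priority: higher-priority subweb items are crawled more often than lower-priority items or items outside any subweb. A rough sketch of that ordering follows. The usage thresholds, the weekly base interval, and the halving rule are all hypothetical values chosen for illustration, not details from the claim.

```python
def priority_from_usage(usage_count, thresholds=(10, 100, 1000)):
    """Map a usage count to a priority level (0 = lowest).

    The thresholds are hypothetical; the claim only says priority is
    based on an amount of usage.
    """
    return sum(usage_count >= t for t in thresholds)


def crawl_interval_hours(priority, in_subweb, base_hours=168.0):
    """Return hours between crawls; a shorter interval means more frequent crawls.

    Items outside any subweb keep the base interval. Subweb items halve
    the interval once for membership and once more per priority level
    (an assumed rule).
    """
    if not in_subweb:
        return base_hours
    return base_hours / (2 ** (priority + 1))


low = crawl_interval_hours(priority_from_usage(5), in_subweb=True)
high = crawl_interval_hours(priority_from_usage(5000), in_subweb=True)
outside = crawl_interval_hours(0, in_subweb=False)
```

Any monotonic mapping from priority to interval would satisfy the same ordering; halving is just a simple choice.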
8. A system that facilitates performance of a focused search over a collection of sites, comprising:
a subweb, the subweb corresponding to at least one of a particular topic or a user characteristic, the subweb comprising a plurality of sites related to the at least one of the topic or the user characteristic, or a combination thereof, each site of the plurality of sites is assigned a relevance weight that indicates relevance of the site to the at least one of the topic or the user characteristic, the relevance weight determined based in part on distribution paths associated with the site, the distribution paths determined based in part on a number of instances the site was returned as a result of disparate search-word queries, and inlinks and outlinks related to the site;
a crawler component that crawls the plurality of sites of the subweb on a more frequent basis than other sites of the collection of sites that are not contained in the subweb;
a search component that receives a search-word query, the search component employing the subweb to focus a search over the collection of sites based upon the search-word query, the search component returns a subset of sites results based at least in part upon the subweb and the search-word query and respective ranks each returned site, wherein each returned site is assigned a combined rank and correspondingly ordered among other sites based in part on the respective rank assigned to each returned site by the search component and the respective relevance weight assigned to each returned site; and
a subweb selector component that selects the subweb based in part on the search-word query.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26
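Claim 8 recites a subweb selector component that picks a subweb based in part on the search-word query, without saying how. One simple possibility, sketched below, is keyword overlap between the query and a keyword set attached to each subweb. The keyword sets, the subweb names, and the overlap rule are assumptions made for this example.

```python
def select_subweb(query, subweb_keywords):
    """Pick the subweb whose keyword set overlaps the query the most.

    query: the search-word query as a string.
    subweb_keywords: dict mapping subweb name -> set of lowercase keywords
    (a hypothetical representation of a subweb's topic).
    Returns None when no subweb shares a word with the query.
    """
    tokens = set(query.lower().split())
    best_name, best_overlap = None, 0
    for name, keywords in subweb_keywords.items():
        overlap = len(tokens & keywords)
        if overlap > best_overlap:
            best_name, best_overlap = name, overlap
    return best_name


subwebs = {
    "cooking": {"recipe", "bake", "oven"},
    "cycling": {"bike", "gear", "tire"},
}
chosen = select_subweb("bike tire repair", subwebs)
```

Returning `None` leaves room for a fallback to an unfocused search when no subweb matches.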
27. A method for performing a focused search, comprising:
providing a subweb that is defined by a plurality of sites with relevance to at least one of a topic or user characteristic represented by the subweb, each site of the plurality of sites assigned a relevance weight that indicates each site's relevance to the at least one of the topic or the user characteristic, the relevance weight determined based in part on distribution paths respectively associated with each site, the distribution paths determined based in part on a number of instances a respective site was returned as a result of disparate search-word queries, and inlinks and outlinks related to the respective site;
selecting the subweb based in part on the search-word query;
relaying the search-word query related to the at least one of the topic or the user characteristic represented by the subweb to a search engine;
searching a collection of sites for information based upon the search-word query;
obtaining search results comprising a subset of sites based at least in part upon the search-word query;
assigning a combined rank to each site obtained by the search-word query and correspondingly ordering each obtained site among other obtained sites based in part on the respective ranks assigned by the search engine and the respective relevance weights assigned to each obtained site; and
crawling the plurality of sites of the subweb on a more frequent basis than other sites of the collection of sites that are not contained in the subweb.
Dependent claims: 28, 29, 30, 31, 32, 33, 34
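The claims derive each relevance weight in part from distribution paths, which in turn depend on how many disparate search-word queries returned the site and on its inlinks and outlinks. The sketch below turns those three signals into a normalized weight. The log scaling, the 0.5 factor on outlinks, and the normalization by the top score are assumptions; the claims name the inputs but not the arithmetic.

```python
import math


def path_score(distinct_query_hits, inlinks, outlinks):
    """Combine the three signals named in the claims into one raw score.

    Log scaling and the 0.5 outlink factor are hypothetical choices.
    """
    return (
        math.log1p(distinct_query_hits)
        + math.log1p(inlinks)
        + 0.5 * math.log1p(outlinks)
    )


def relevance_weights(site_stats):
    """Normalize raw scores to [0, 1] so the best site gets weight 1.0.

    site_stats: dict mapping site -> (distinct_query_hits, inlinks, outlinks).
    """
    raw = {site: path_score(*stats) for site, stats in site_stats.items()}
    top = max(raw.values(), default=0.0)
    if top == 0.0:
        return {site: 0.0 for site in raw}
    return {site: score / top for site, score in raw.items()}


weights = relevance_weights({
    "a.example": (50, 200, 30),
    "b.example": (2, 5, 1),
})
```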
35. A system for searching a collection of sites, comprising:
means for generating a topic-specific subweb, the topic-specific subweb comprising a plurality of sites related to the topic, each site of the plurality of sites assigned a relevance weight according to relevance of the site to the topic, the relevance weight based in part on a distribution paths respectively associated with each site, the distribution path determined based in part on a number of instances each site was returned as a result of disparate search-word queries, and inlinks and outlinks related to each site;
means for employing the subweb in connection with a search engine to search the collection of sites;
means for selecting the subweb based in part on a search-word query;
means for ranking each site returned as a result of the search-word query, wherein each site returned in response to the search-word query is assigned a combined rank and correspondingly ordered among other returned sites based in part on the respective rank assigned by the search engine and the relevance weight assigned to a respective returned site;
means for crawling the collection of sites, the means for crawling crawls sites in the topic-specific subweb at a higher frequency than sites that are not contained in the topic-specific subweb; and
means for determining a probability of a change to at least one site, the means for determining a probability of a change employs probabilistic-based analysis to determine the probability that a change has been made to the at least one site, the change is at least one of an alteration, a deletion, an addition of an inlink or outlinks, or a combination thereof, with regard to the at least one site.
Dependent claims: 36, 37, 38, 39, 40
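Claim 35 adds a means for determining the probability that a site has changed, using probabilistic-based analysis that the claim does not specify further. As an illustration only, the sketch below assumes changes arrive as a Poisson process, a common modeling choice for crawl scheduling, so the probability of at least one change within t days is 1 - exp(-rate * t). The model and the function names are assumptions, not the claimed method.

```python
import math


def estimate_change_rate(changes_observed, days_observed):
    """Estimate changes per day from past crawl observations."""
    if days_observed <= 0:
        raise ValueError("days_observed must be positive")
    return changes_observed / days_observed


def change_probability(rate_per_day, days_since_crawl):
    """Probability of at least one change since the last crawl.

    Assumes a Poisson change process (an assumption of this sketch).
    """
    return 1.0 - math.exp(-rate_per_day * days_since_crawl)


rate = estimate_change_rate(changes_observed=3, days_observed=30)
p_changed = change_probability(rate, days_since_crawl=10)
```

A crawler could recrawl a site once this probability passes a chosen threshold, which fits the claim's pairing of change estimation with crawl frequency.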
Specification