Distributing content indices
Abstract
A query-centric system and process for distributing reverse indices for a distributed content system. Relevance ranking techniques are used in organizing distributed system indices. Query-centric configuration subprocesses (1) analyze query data, partitioning terms for reverse index server(s) (RIS); (2) distribute each partitioned data set by generally localizing search terms for the RIS that have some query-centric correlation; and (3) generate and maintain a map for the partitioned reverse index system terms by mapping the terms for the reverse index to a plurality of different index server nodes. An indexing subprocess builds distributed reverse indices from content host indices. Query execution routines use the map derived in the configuration to return more relevant search results to the searcher more efficiently.
27 Claims
1. A method for distributing content indices for distributed content corpora, the method comprising:
analyzing search queries for partitioning individual terms thereof into sets based on a similarity characteristic;
mapping said sets to segregated locations; and
at said segregated locations, storing linking indices for each content corpus of said distributed content corpora when said content corpus has said individual terms therein.
14. A distributed system having interconnected nodes performing functions for storing retrievable content, nodes performing functions for searching for said retrievable content, and nodes performing functions for facilitating searching, wherein each of said nodes may perform one or more of said functions, the system comprising:
at said nodes performing functions for facilitating searching, a plurality of distributed, query-centric indices, wherein each index of said indices is identified by a plurality of search terms having a similarity characteristic and wherein each distributed index thereof contains a listing of links to retrievable content having tokens matching at least one of the search terms identifying said indices,
a distribution map directing search terms to associated ones of said indices matching said search terms,
a facility for resolving matches of a current search term from one of said nodes for searching to a plurality of said distributed, query-centric indices and returning results of said resolving to said one of said nodes for searching.
20. A method of doing business of facilitating Internet searching, the method comprising:
storing in segregated form and segregated locations, correlated query-centric indices listing links to substantially each source of distributed content related thereto; and
maintaining and providing a map directing content tokens and new search queries to associated said segregated locations.
23. A reverse index system comprising:
means for query-centric partitioning of Internet-search query terms for a reverse index system;
means for distributing query-centric partitioned sets of the Internet-search query terms derived by said means for query-centric partitioning to a plurality of index servers of said reverse index system, including mapping of the Internet-search query terms to the plurality of index servers as a distribution map;
means for using said distribution map in conjunction with Internet content nodes to index links to said content nodes to said query-centric partitioned sets of the Internet-search query terms; and
means for using said distribution map in conjunction with Internet browser nodes for processing queries received therefrom.
24. Computer memory comprising:
computer code means for calculating a factor indicative of similarity of a plurality of received query search terms;
computer code means for partitioning said received query search terms into sets of query search terms based on said factor indicative of similarity;
computer code means for deriving a map for distributing said sets of query search terms to a plurality of separate files;
computer code means for receiving content tokens related to distributed system content at an associated content site link, for matching said content tokens to said sets of query search terms via said map, and for storing each said link with a matched said set of query search terms;
computer code means for directing all subsequent query search terms of an individual query according to said map and for returning links resolving every term of said subsequent query search terms.
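Claim 24 does not specify how the "factor indicative of similarity" is calculated. As one possible illustration only, the sketch below uses the Jaccard overlap of the sets of logged queries that contain each term. The function name `similarity` and the choice of Jaccard overlap are assumptions made for this example, not something the claims state.

```python
def similarity(term_a, term_b, query_log):
    """Hypothetical similarity factor: Jaccard overlap of the sets of
    logged queries that contain each term (an assumption, not the
    patent's definition)."""
    # Indices of the queries in which each term occurs as a whole word.
    queries_a = {i for i, q in enumerate(query_log) if term_a in q.lower().split()}
    queries_b = {i for i, q in enumerate(query_log) if term_b in q.lower().split()}
    union = queries_a | queries_b
    # Terms that never occur in the log are treated as having no similarity.
    return len(queries_a & queries_b) / len(union) if union else 0.0
```

Terms that tend to be searched together score close to 1, terms that never share a query score 0, and such scores could then drive the partitioning into sets.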
25. A method for configuring a query-centric reverse index system, the method comprising:
maintaining a log of received queries;
dividing said query terms into individual query terms;
determining similarities between said query terms;
building a construct having nodes with edges connecting nodes such that said edges provide a frequency weight for co-occurrence of nodes connected;
partitioning said construct into said sets for minimizing total frequency weight of query terms having edges crossing partition boundaries;
distributing said sets to reverse index system servers; and
maintaining a map of said distributing said sets.
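The configuration steps of claim 25 can be sketched roughly as follows: build a graph whose edge weights count how often two terms co-occur in a logged query, then group terms so that little edge weight crosses partition boundaries. The names `build_graph`, `partition`, and `cut_weight`, the `capacity` parameter, and the greedy merge heuristic are all assumptions for illustration. The claim does not name a partitioning algorithm, and this heuristic does not guarantee a minimum cut.

```python
from collections import Counter
from itertools import combinations


def build_graph(query_log):
    """Return (terms, edges); edges[(a, b)] counts queries containing both a and b."""
    terms, edges = set(), Counter()
    for query in query_log:
        words = sorted(set(query.lower().split()))
        terms.update(words)
        for a, b in combinations(words, 2):  # pairs come out in sorted order
            edges[(a, b)] += 1
    return terms, edges


def partition(terms, edges, num_servers, capacity):
    """Greedy heuristic (assumed, not from the claim): merge clusters along
    the heaviest edges first, then place each cluster on the least-loaded
    server. `capacity` bounds cluster size only, not total server load."""
    cluster = {t: {t} for t in terms}  # term -> the shared set it belongs to
    for (a, b), _weight in sorted(edges.items(), key=lambda kv: (-kv[1], kv[0])):
        ca, cb = cluster[a], cluster[b]
        if ca is not cb and len(ca) + len(cb) <= capacity:
            ca |= cb
            for t in cb:
                cluster[t] = ca
    unique = {id(c): c for c in cluster.values()}.values()
    loads = [0] * num_servers
    term_map = {}  # the distribution map: term -> server number
    for c in sorted(unique, key=lambda c: (-len(c), min(c))):
        server = loads.index(min(loads))
        loads[server] += len(c)
        for t in c:
            term_map[t] = server
    return term_map


def cut_weight(edges, term_map):
    """Total co-occurrence weight of edges whose endpoints sit on different servers."""
    return sum(w for (a, b), w in edges.items() if term_map[a] != term_map[b])
```

Here `cut_weight` is the quantity the partitioning step tries to keep small: the lower it is, the more often all terms of a query land on the same reverse index server.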
26. A method for sharing content comprising:
maintaining a log of received queries;
dividing said query terms into individual query terms;
determining similarities between said query terms;
representing said terms as a construct having nodes with edges connecting nodes such that said edges provide a frequency weight for co-occurrence of nodes connected;
partitioning said construct into said sets for minimizing total frequency weight for query being connected nodes having edges crossing partition boundaries;
distributing said sets to reverse index system servers;
maintaining a map of said distributing said sets;
comparing tokens representative of local content to said map; and
combining links to said local content to said reverse index system servers having said sets having a match to said tokens.
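The last two steps of claim 26 (comparing local content tokens to the map and posting links to the matching servers) might look like the sketch below. The in-memory `servers` structure, the helper `new_servers`, and the function `index_content` are hypothetical stand-ins for real reverse index servers, and skipping tokens that are absent from the map is an assumed policy.

```python
from collections import defaultdict


def new_servers():
    """Hypothetical in-memory stand-in: server id -> term -> set of links."""
    return defaultdict(lambda: defaultdict(set))


def index_content(link, tokens, term_map, servers):
    """Post `link` under every token found in the distribution map and
    return the ids of the servers that received it."""
    touched = set()
    for token in set(tokens):
        server_id = term_map.get(token)
        if server_id is None:
            continue  # assumed policy: tokens not in the map are not indexed
        servers[server_id][token].add(link)
        touched.add(server_id)
    return touched
```

A content host would call `index_content` once per document, so only servers holding a matching term set ever receive that document's link.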
27. A method for providing distributed content links in response to a search request, the method comprising:
maintaining a log of received queries;
dividing said query terms into individual query terms;
determining similarities between said query terms;
representing said terms as a construct having nodes with edges connecting nodes such that said edges provide a frequency weight for co-occurrence of nodes connected;
partitioning said construct into said sets for minimizing total frequency weight for query being connected nodes having edges crossing partition boundaries;
distributing said sets to reverse index system servers;
maintaining a map of said distributing said sets;
comparing tokens representative of local content to said map;
combining links to said local content to said reverse index system servers having said sets having a match to said tokens such that said links are associated with matched sets;
receiving a current query;
comparing terms of said current query to said map;
based on matches in said comparing, routing terms of said current query to appropriate reverse index system servers;
retrieving matches of terms to said links;
resolving matches of terms to said links such that only links matching said current query are returned.
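The query-execution steps at the end of claim 27 can be sketched as routing each term of the current query through the map, retrieving its links from the responsible server, and intersecting the results. Reading "only links matching said current query" as "links matching every term" is an assumption, as are the function name `execute_query`, the plain-dictionary `servers` structure, and the choice to return nothing when a term is missing from the map.

```python
from collections import defaultdict


def execute_query(query, term_map, servers):
    """Route query terms via the map and return the links that match every
    term. `servers` is a hypothetical dict: server id -> term -> set of links."""
    # Group terms by the server the map assigns them to.
    by_server = defaultdict(list)
    for term in set(query.lower().split()):
        server_id = term_map.get(term)
        if server_id is None:
            return set()  # assumed policy: an unmapped term matches nothing
        by_server[server_id].append(term)

    # Retrieve each term's links and resolve them by intersection.
    result = None
    for server_id, terms in by_server.items():
        for term in terms:
            links = servers.get(server_id, {}).get(term, set())
            result = set(links) if result is None else result & links
    return result or set()
```

When the partitioning has placed co-occurring terms together, `by_server` usually holds a single entry, so most queries are resolved by contacting one server.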
Specification