Dynamic repartitioning for changing a number of nodes or partitions in a distributed search system
Abstract
A distributed search system can include a group of nodes assigned to different partitions. Each partition can store indexes for a group of documents. Nodes in the same partition can independently process document-based records to construct the indexes. One of the nodes can process a stored checkpoint to produce a repartitioned checkpoint. The group of nodes can respond to search requests during the construction of the repartitioned checkpoint. The repartitioned checkpoint can be loaded into the group of nodes to repartition the group of nodes.
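The repartitioning flow summarized in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the patent: the `Checkpoint` class, the hash-based `partition_for` routing rule, and the `repartition` and `build_index` helpers are hypothetical names. The sketch only shows the general idea of producing a new checkpoint from a stored one without modifying the stored one, so the original partitions could keep answering requests while the new checkpoint is built.

```python
from dataclasses import dataclass, field
from zlib import crc32


@dataclass
class Checkpoint:
    """Hypothetical checkpoint: one {doc_id: text} map per partition."""
    partitions: list[dict[str, str]] = field(default_factory=list)


def partition_for(doc_id: str, num_partitions: int) -> int:
    # Assumption: documents are routed by a stable hash of their id.
    return crc32(doc_id.encode("utf-8")) % num_partitions


def repartition(stored: Checkpoint, new_count: int) -> Checkpoint:
    """Build a checkpoint with new_count partitions; `stored` is left untouched."""
    if new_count < 1:
        raise ValueError("new_count must be at least 1")
    result = Checkpoint([{} for _ in range(new_count)])
    for part in stored.partitions:
        for doc_id, text in part.items():
            result.partitions[partition_for(doc_id, new_count)][doc_id] = text
    return result


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Inverted index (term -> doc ids) that a node could rebuild after loading."""
    index: dict[str, set[str]] = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index
```

In this sketch, a node would call `repartition(stored, new_count)` in the background, and each node would then load its slice of the result and call `build_index` on it. How the real system routes documents and swaps checkpoints is not specified in this section.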
22 Claims
1. A distributed search system executed by a processor comprising:
a group of storage nodes assigned to a plurality of different partitions, each partition comprising one or more nodes, each partition storing a different subset of document-based records from a set of documents distributed across the plurality of different partitions, each node of each partition storing an index for the subset of documents stored in that partition, wherein:
each of the storage nodes in the same partition independently processes the document-based records to construct the indexes,
each of the storage nodes is adapted to perform a repartition to change a number of partitions in the plurality of partitions or a number of storage nodes in one or more partitions of the plurality of partitions by processing a stored checkpoint in a central queue of document-based records to be processed by the nodes of one or more of the partitions to produce a repartitioned checkpoint,
the stored checkpoint and the repartitioned checkpoint each include index and document data for documents stored in the storage nodes and a synchronized lexicon of decoding information for each of the partitions, wherein each node maintains the decoding information of the synchronized lexicon for each partition, wherein each node decodes combined query results based on the decoding information of the synchronized lexicon of the partitions in which the combined query results are stored, and
the group of storage nodes responds to search and index update requests during the construction of the repartitioned checkpoint, and
the repartitioned checkpoint is loaded into the group of storage nodes to dynamically repartition the group of storage nodes.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A non-transitory computer readable medium having stored thereon a series of instructions which, when executed by a processor, cause the processor to perform a repartition in a distributed search system by:
assigning a group of storage nodes to a plurality of different partitions, each partition comprising one or more nodes, each partition storing a different subset of document-based records from a set of documents distributed across the plurality of different partitions, each node of each partition storing an index for the subset of documents stored in that partition, wherein each of the storage nodes in the same partition independently processes the document-based records to construct the indexes, and wherein each of the storage nodes is adapted to perform a repartition to change a number of partitions in the plurality of partitions or a number of storage nodes in one or more partitions of the plurality of partitions;
performing a repartition by one of the storage nodes by processing a stored checkpoint in a central queue of document-based records to be processed by the nodes of one or more of the partitions to produce a repartitioned checkpoint, wherein the stored checkpoint and the repartitioned checkpoint each include index and document data for documents stored in the storage nodes and a synchronized lexicon of decoding information for each of the partitions, wherein each node maintains the decoding information of the synchronized lexicon for each partition, wherein each node decodes combined query results based on the decoding information of the synchronized lexicon of the partitions in which the combined query results are stored, and wherein the group of storage nodes responds to search and index update requests during the construction of the repartitioned checkpoint, and wherein the repartitioned checkpoint is loaded into the group of storage nodes to dynamically repartition the group of storage nodes; and
loading the repartitioned checkpoint into a group of storage nodes to dynamically repartition the group of storage nodes.
Dependent claims: 10, 11, 12, 13, 14, 15
16. A method for performing a repartition of a distributed search system, the method comprising:
assigning a group of storage nodes to a plurality of different partitions, each partition comprising one or more nodes, each partition storing a different subset of document-based records from a set of documents distributed across the plurality of different partitions, each node of each partition storing an index for the subset of documents stored in that partition, wherein each of the storage nodes in the same partition independently processes the document-based records to construct the indexes, and wherein each of the storage nodes is adapted to perform a repartition to change a number of partitions in the plurality of partitions or a number of storage nodes in one or more partitions of the plurality of partitions;
performing a repartition by one of the storage nodes by processing a stored checkpoint in a central queue of document-based records to be processed by the nodes of one or more of the partitions to produce a repartitioned checkpoint, wherein the stored checkpoint and the repartitioned checkpoint each include index and document data for documents stored in the storage nodes and a synchronized lexicon of decoding information for each of the partitions, wherein each node maintains the decoding information of the synchronized lexicon for each partition, wherein each node decodes combined query results based on the decoding information of the synchronized lexicon of the partitions in which the combined query results are stored, and wherein the group of storage nodes responds to search and index update requests during the construction of the repartitioned checkpoint, and wherein the repartitioned checkpoint is loaded into the group of storage nodes to dynamically repartition the group of storage nodes; and
loading the repartitioned checkpoint into a group of storage nodes to dynamically repartition the group of storage nodes.
Dependent claims: 17, 18, 19, 20, 21, 22
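The independent claims each recite a synchronized lexicon of decoding information that every node maintains for every partition and uses to decode combined query results. A minimal Python sketch of one way such decoding could work follows. It rests on assumptions that are not stated in the claims: that a lexicon maps compact integer codes back to original values, that codes are assigned per partition, and that each result carries the id of the partition it came from. The names `Lexicon` and `decode_combined` are hypothetical.

```python
class Lexicon:
    """Hypothetical per-partition lexicon: compact integer code <-> term."""

    def __init__(self) -> None:
        self._code_of: dict[str, int] = {}
        self._term_of: list[str] = []

    def encode(self, term: str) -> int:
        # Assign the next free code the first time a term is seen.
        if term not in self._code_of:
            self._code_of[term] = len(self._term_of)
            self._term_of.append(term)
        return self._code_of[term]

    def decode(self, code: int) -> str:
        return self._term_of[code]


def decode_combined(
    results: list[tuple[int, int]], lexicons: dict[int, Lexicon]
) -> list[str]:
    """Decode (partition_id, code) pairs using the lexicon of the source partition."""
    return [lexicons[pid].decode(code) for pid, code in results]
```

Because the same code can mean different values in different partitions under these assumptions, a node that merges results from several partitions needs the lexicon of each one, which is why the sketch keys `lexicons` by partition id.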
Specification