Shard splitting
First Claim
1. A computer-implemented method for splitting a shard comprising:
marking a source index as read only, the source index comprising a source shard, the source shard comprising a source reference;
creating a target index, the target index comprising target shards, each target shard of the target shards comprising a target reference of target references;
copying the source reference, the copying producing the target references;
hashing identifiers in the source reference, each identifier being associated with a document of a plurality of documents of the source shard, the hashing assigning each document of the plurality of documents to a target shard of the target shards, the plurality of documents being stored in a file associated with the source reference;
deleting at least some documents of the plurality of documents in the target references, the at least some documents belonging in a different target shard of the target shards;
assigning, for each identifier in the target shard, a status value in a status column of the target index based on the associated document of the target shard being alive or deleted;
hard linking the file into the target references, the hard linking comprising referencing the status column in the target index, wherein the hard linking enables the file to be selectively referenced from at least a first target reference and a second target reference of the target references;
when no error is reported by at least one of a file system and an application:
marking the target index as read-write, such that the target index is used in place of the source index; and
deleting the source index; and
when an error is reported by at least one of the file system and the application, marking the source index as read-write.
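The hashing and status-assignment steps of the claim can be illustrated with a minimal sketch. It assumes SHA-256 as the hash and a plain dictionary as the status column; the claim names neither, and the function names `target_shard_for` and `build_status_columns` are hypothetical.

```python
import hashlib


def target_shard_for(doc_id: str, num_targets: int) -> int:
    """Map a document identifier to one target shard.

    Assumption: SHA-256 modulo the shard count. The claim only requires
    that hashing assigns each document to a target shard.
    """
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_targets


def build_status_columns(doc_ids, num_targets):
    """Build one status column (hypothetical dict) per target shard.

    Every target reference starts as a copy of the source reference, so
    each column lists every identifier. A document is "alive" only in
    the shard its hash selects and "deleted" in all the others.
    """
    columns = [dict() for _ in range(num_targets)]
    for doc_id in doc_ids:
        owner = target_shard_for(doc_id, num_targets)
        for shard, column in enumerate(columns):
            column[doc_id] = "alive" if shard == owner else "deleted"
    return columns
```

Because every column references the same identifiers, the underlying file can be shared by all target shards, and only the status values differ from one target shard to the next.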
Abstract
Methods and systems for shard splitting are provided. Exemplary methods include: marking a source index as read only, the source index comprising a source shard, the source shard comprising a source reference; creating a target index, the target index comprising target shards, each target shard of the target shards comprising a target reference of target references; copying the source reference, the copying producing the target references; hashing identifiers in the source reference, each identifier being associated with a document of a plurality of documents of the source shard, the hashing assigning each document of the plurality of documents to a target shard of the target shards, the plurality of documents being stored in a file associated with the source reference; hard linking the file into the target references; marking the target index as read-write; and deleting the source index.
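The hard-linking step, together with the success and error branches recited in the claims, can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: each shard is modeled as a directory of files, index states are modeled as strings in a dictionary, and the function name `split_shard_files` is hypothetical. Only the standard-library calls `os.link`, `os.listdir`, `os.makedirs`, `os.remove`, and `os.rmdir` are used.

```python
import os


def split_shard_files(source_dir, target_dirs):
    """Hard link every file of a source shard into each target shard.

    Returns a dict of (hypothetical) index states. The source and the
    targets must live on the same file system for hard links to work.
    """
    # The source is marked read only before any work begins.
    state = {"source": "read-only", "target": "read-only"}
    linked = []
    try:
        for target_dir in target_dirs:
            os.makedirs(target_dir, exist_ok=True)
            for name in sorted(os.listdir(source_dir)):
                dest = os.path.join(target_dir, name)
                # A hard link adds a second name for the same data,
                # so no document bytes are copied.
                os.link(os.path.join(source_dir, name), dest)
                linked.append(dest)
    except OSError:
        # Error branch: undo partial links, reopen the source for writes.
        for path in linked:
            os.remove(path)
        state["source"] = "read-write"
        return state
    # Success branch: the target replaces the source.
    state["target"] = "read-write"
    for name in os.listdir(source_dir):
        os.remove(os.path.join(source_dir, name))
    os.rmdir(source_dir)
    state["source"] = "deleted"
    return state
```

Deleting the source directory does not destroy the data, because the file system keeps a file's contents until its last hard link is removed.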
65 Citations
18 Claims
1. A computer-implemented method for splitting a shard comprising:
marking a source index as read only, the source index comprising a source shard, the source shard comprising a source reference;
creating a target index, the target index comprising target shards, each target shard of the target shards comprising a target reference of target references;
copying the source reference, the copying producing the target references;
hashing identifiers in the source reference, each identifier being associated with a document of a plurality of documents of the source shard, the hashing assigning each document of the plurality of documents to a target shard of the target shards, the plurality of documents being stored in a file associated with the source reference;
deleting at least some documents of the plurality of documents in the target references, the at least some documents belonging in a different target shard of the target shards;
assigning, for each identifier in the target shard, a status value in a status column of the target index based on the associated document of the target shard being alive or deleted;
hard linking the file into the target references, the hard linking comprising referencing the status column in the target index, wherein the hard linking enables the file to be selectively referenced from at least a first target reference and a second target reference of the target references;
when no error is reported by at least one of a file system and an application:
marking the target index as read-write, such that the target index is used in place of the source index; and
deleting the source index; and
when an error is reported by at least one of the file system and the application, marking the source index as read-write.
(Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9)
10. A system for splitting a shard comprising:
a processor; and
a memory coupled to the processor, the memory storing instructions executable by the processor to:
mark a source index as read only, the source index comprising a source shard, the source shard comprising a source reference;
create a target index, the target index comprising target shards, each target shard of the target shards comprising a target reference of target references;
copy the source reference, the copying producing the target references;
hash identifiers in the source reference, each identifier being associated with a document of a plurality of documents of the source shard, the hashing assigning each document of the plurality of documents to a target shard of the target shards, the plurality of documents being stored in a file associated with the source reference;
delete at least some documents of the plurality of documents in the target references, the at least some documents belonging in a different target shard of the target shards;
assign, for each identifier in the target shard, a status value in a status column of the target index based on the associated document of the target shard being alive or deleted;
hard link the file into the target references, the hard link process comprising referencing the status column in the target index, wherein the hard link enables the file to be selectively referenced from at least a first target reference and a second target reference of the target references;
when no error is reported by at least one of a file system and an application:
mark the target index as read-write, such that the target index is used in place of the source index; and
delete the source index; and
when an error is reported by at least one of the file system and the application, mark the source index as read-write.
(Dependent claims: 11, 12, 13, 14, 15, 16, 17)
18. A non-transitory computer-readable medium having embodied thereon a program, the program being executable by a processor to perform a method for splitting a shard, the method comprising:
marking a source index as read only, the source index comprising a source shard, the source shard comprising a source reference;
creating a target index, the target index comprising target shards, each target shard of the target shards comprising a target reference of target references;
copying the source reference, the copying producing the target references;
hashing identifiers in the source reference, each identifier being associated with a document of a plurality of documents of the source shard, the hashing assigning each document of the plurality of documents to a target shard of the target shards, the plurality of documents being stored in a file associated with the source reference;
deleting at least some documents of the plurality of documents in the target references, the at least some documents belonging in a different target shard of the target shards;
assigning, for each identifier in the target shard, a status value in a status column of the target index based on the associated document of the target shard being alive or deleted;
hard linking the file into the target references, the hard linking comprising referencing the status column in the target index, wherein the hard linking enables the file to be selectively referenced from at least a first target reference and a second target reference of the target references;
when no error is reported by at least one of a file system and an application:
marking the target index as read-write, such that the target index is used in place of the source index; and
deleting the source index; and
when an error is reported by at least one of the file system and the application, marking the source index as read-write.
Specification