Systems and methods for maintaining distributed data
First Claim
Patent Images
1. A distributed storage system storing a distributed tree structure, comprising:
- a plurality of storage devices configured to communicate with each other;
a tree structure stored as nodes across two or more of the plurality of storage devices, the tree structure comprising;
a first leaf node comprising a first record identified by a first key, the first leaf node stored on one of the plurality of storage devices, the first record stored on one of the plurality of storage devices;
at least one copy of the first leaf node comprising a copy of the first record identified by the first key, the at least one copy of the first leaf node stored on one of the plurality of storage devices that is different from the storage device storing the first leaf node;
a second leaf node comprising a second record identified by a second key, the second leaf node stored on one of the plurality of storage devices, the second record stored on one of the plurality of storage devices;
one or more copies of the second leaf node, the one or more copies of the second leaf node comprising a different number of copies than the at least one copy of the first leaf node, each of the one or more copies of the second leaf node comprising a copy of the second record identified by the second key, each of the one or more copies of the second leaf node stored on one of the plurality of storage devices that is different from the storage device storing the second leaf node;
a parent node referencing the first leaf node, the at least one copy of the first leaf node, the second leaf node, and the one or more copies of the second leaf node, the parent node stored on one of the plurality of storage devices;
at least one copy of the parent node referencing the first leaf node, the at least one copy of the first leaf node, the second leaf node, and the one or more copies of the second leaf node, the at least one copy of the parent node stored on one of the plurality of storage devices that is different from the storage device storing the parent node;
a grandparent node referencing the parent node and the at least one copy of the parent node, the grandparent node stored on one of the plurality of storage devices;
at least one copy of the grandparent node referencing the parent node and the at least one copy of the parent node, the at least one copy of the grandparent node stored on one of the plurality of storage devices that is different from the storage device storing the grandparent node; and
the grandparent node, the at least one copy of the grandparent node, the parent node, and the at least one copy of the parent node configured to index the first leaf node and the at least one copy of the first leaf node based on the first key and the second leaf node and the one or more copies of the second leaf node based on the second key; and
an index management module in communication with the plurality of storage devices, the index management module configured to run on a processor and to;
maintain at least as many copies of the parent node as the maximum number of copies of any of its children nodes, its children nodes including the first leaf node and the second leaf node;
maintain at least as many copies of the grandparent node as the maximum number of copies of any of its children nodes, its children nodes including the parent node; and
traverse the tree structure, using a copy of the first leaf node if the first leaf node is unavailable, using a copy of the parent node if the parent node is unavailable, and using a copy of the grandparent node if the grandparent node is unavailable.
14 Assignments
0 Petitions
Accused Products
Abstract
Systems and methods are disclosed that provide an indexing data structure. In one embodiment, the indexing data structure is mirrored index tree where the copies of the nodes of the tree are stored across devices in a distributed system. In one embodiment, nodes that are stored on an offline device are restored, and an offline device that comes back online is merged into the distributed system and given access to the current indexing data structure. In one embodiment, the indexing data structure is traversed to locate and restore nodes that are stored on offline devices of the distributed system.
-
Citations
15 Claims
-
1. A distributed storage system storing a distributed tree structure, comprising:
-
a plurality of storage devices configured to communicate with each other; a tree structure stored as nodes across two or more of the plurality of storage devices, the tree structure comprising; a first leaf node comprising a first record identified by a first key, the first leaf node stored on one of the plurality of storage devices, the first record stored on one of the plurality of storage devices; at least one copy of the first leaf node comprising a copy of the first record identified by the first key, the at least one copy of the first leaf node stored on one of the plurality of storage devices that is different from the storage device storing the first leaf node; a second leaf node comprising a second record identified by a second key, the second leaf node stored on one of the plurality of storage devices, the second record stored on one of the plurality of storage devices; one or more copies of the second leaf node, the one or more copies of the second leaf node comprising a different number of copies than the at least one copy of the first leaf node, each of the one or more copies of the second leaf node comprising a copy of the second record identified by the second key, each of the one or more copies of the second leaf node stored on one of the plurality of storage devices that is different from the storage device storing the second leaf node; a parent node referencing the first leaf node, the at least one copy of the first leaf node, the second leaf node, and the one or more copies of the second leaf node, the parent node stored on one of the plurality of storage devices; at least one copy of the parent node referencing the first leaf node, the at least one copy of the first leaf node, the second leaf node, and the one or more copies of the second leaf node, the at least one copy of the parent node stored on one of the plurality of storage devices that is different from the storage device storing the parent node; a grandparent node referencing the parent node and the at least one copy of the parent node, the grandparent node stored on one of the plurality of storage devices; at least one copy of the grandparent node referencing the parent node and the at least one copy of the parent node, the at least one copy of the grandparent node stored on one of the plurality of storage devices that is different from the storage device storing the grandparent node; and the grandparent node, the at least one copy of the grandparent node, the parent node, and the at least one copy of the parent node configured to index the first leaf node and the at least one copy of the first leaf node based on the first key and the second leaf node and the one or more copies of the second leaf node based on the second key; and an index management module in communication with the plurality of storage devices, the index management module configured to run on a processor and to; maintain at least as many copies of the parent node as the maximum number of copies of any of its children nodes, its children nodes including the first leaf node and the second leaf node; maintain at least as many copies of the grandparent node as the maximum number of copies of any of its children nodes, its children nodes including the parent node; and traverse the tree structure, using a copy of the first leaf node if the first leaf node is unavailable, using a copy of the parent node if the parent node is unavailable, and using a copy of the grandparent node if the grandparent node is unavailable. - View Dependent Claims (2, 3, 4, 5)
-
-
6. A computer-implemented method for storing a distributed tree structure, comprising:
-
providing, by a processor, a tree structure stored as nodes across two or more of a plurality of storage devices, the tree structure comprising; a first leaf node comprising a first record identified by a first key, the first leaf node stored on one of the plurality of storage devices, the first record stored on one of the plurality of storage devices; at least one copy of the first leaf node comprising a copy of the first record identified by the first key, the at least one copy of the first leaf node stored on one of the plurality of storage devices that is different from the storage device storing the first leaf node; a second leaf node comprising a second record identified by a second key, the second leaf node stored on one of the plurality of storage devices, the second record stored on one of the plurality of storage devices; one or more copies of the second leaf node, the one or more copies of the second leaf node comprising a different number of copies than the at least one copy of the first leaf node, each of the one or more copies of the second leaf node comprising a copy of the second record identified by the second key, each of the one or more copies of the second leaf node stored on one of the plurality of storage devices that is different from the storage device storing the second leaf node; a parent node referencing the first leaf node, the at least one copy of the first leaf node, the second leaf node, and the one or more copies of the second leaf node, the parent node stored on one of the plurality of storage devices; at least one copy of the parent node referencing the first leaf node, the at least one copy of the first leaf node, the second leaf node, and the one or more copies of the second leaf node, the at least one copy of the parent node stored on one of the plurality of storage devices that is different from the storage device storing the parent node; a grandparent node referencing the parent node and the at least one copy of the parent node, the grandparent node stored on one of the plurality of storage devices; at least one copy of the grandparent node referencing the parent node and the at least one copy of the parent node, the at least one copy of the grandparent node stored on one of the plurality of storage devices that is different from the storage device storing the grandparent node; and the grandparent node, the at least one copy of the grandparent node, the parent node, and the at least one copy of the parent node configured to index the first leaf node and the at least one copy of the first leaf node based on the first key and the second leaf node and the one or more copies of the second leaf node based on the second key; and maintaining, by a processor, at least as many copies of the parent node as the maximum number of copies of any of its children nodes, its children nodes including the first leaf node and the second leaf node; maintaining, by a processor, at least as many copies of the grandparent node as the maximum number of copies of any of its children nodes, its children nodes including the parent node; and traversing, by a processor, the tree structure, using a copy of the first leaf node if the first leaf node is unavailable, using a copy of the parent node if the parent node is unavailable, and using a copy of the grandparent node if the grandparent node is unavailable. - View Dependent Claims (7, 8, 9, 10)
-
-
11. A computer-readable storage medium having instructions stored thereon for implementing, when the instructions are executed, a distributed tree structure, the instructions comprising:
-
providing a tree structure stored as nodes across two or more of a plurality of storage devices, the tree structure comprising; a first leaf node comprising a first record identified by a first key, the first leaf node stored on one of the plurality of storage devices, the first record stored on one of the plurality of storage devices; at least one copy of the first leaf node comprising a copy of the first record identified by the first key, the at least one copy of the first leaf node stored on one of the plurality of storage devices that is different from the storage device storing the first leaf node; a second leaf node comprising a second record identified by a second key, the second leaf node stored on one of the plurality of storage devices, the second record stored on one of the plurality of storage devices; one or more copies of the second leaf node, the one or more copies of the second leaf node comprising a different number of copies than the at least one copy of the first leaf node, each of the one or more copies of the second leaf node comprising a copy of the second record identified by the second key, each of the one or more copies of the second leaf node stored on one of the plurality of storage devices that is different from the storage device storing the second leaf node; a parent node referencing the first leaf node, the at least one copy of the first leaf node, the second leaf node, and the one or more copies of the second leaf node, the parent node stored on one of the plurality of storage devices; at least one copy of the parent node referencing the first leaf node, the at least one copy of the first leaf node, the second leaf node, and the one or more copies of the second leaf node, the at least one copy of the parent node stored on one of the plurality of storage devices that is different from the storage device storing the parent node; a grandparent node referencing the parent node and the at least one copy of the parent node, the grandparent node stored on one of the plurality of storage devices; at least one copy of the grandparent node referencing the parent node and the at least one copy of the parent node, the at least one copy of the grandparent node stored on one of the plurality of storage devices that is different from the storage device storing the grandparent node; and the grandparent node, the at least one copy of the grandparent node, the parent node, and the at least one copy of the parent node configured to index the first leaf node and the at least one copy of the first leaf node based on the first key and the second leaf node and the one or more copies of the second leaf node based on the second key; and maintaining at least as many copies of the parent node as the maximum number of copies of any of its children nodes, its children nodes including the first leaf node and the second leaf node; maintaining at least as many copies of the grandparent node as the maximum number of copies of any of its children nodes, its children nodes including the parent node; and traversing the tree structure, using a copy of the first leaf node if the first leaf node is unavailable, using a copy of the parent node if the parent node is unavailable, and using a copy of the grandparent node if the grandparent node is unavailable. - View Dependent Claims (12, 13, 14, 15)
-
Specification