Method of creating hierarchical indices for a distributed object system
First Claim
1. A data structuring method operative in a data management system organized into one two or more physically-dispersed regions, with each region comprising one or more clusters, and wherein a given cluster includes one or more nodes and a shared storage, and wherein the nodes receive data streams continuously and store such data streams in an object-oriented data store, comprising:
- for a defined object property, generating maintaining an index tree for use in locating determining where a given object in the data management system is located, the index tree comprising a root, one or more levels of joins, and a plurality of leaves, wherein each leaf is associated with a sorted structure, a join above one or more leaves aggregates leaves that are in a given cluster, a join on a next level up in the index tree aggregates the joins of multiple clusters that belong to a given region, and a join on a next level up in the index tree aggregates the joins of multiple regions that belong to a given universe at the root of the index tree;
associating a key and a key value with each sorted structure in each leaf and with each join in the index tree;
wherein the key is a hash key that is generated by applying a Bloom Filter to at least a current key of a given sorted structure;
responsive to the modification of the given sorted structure;
re-computing the key associated with the given sorted structure;
re-computing the keys of one or more joins in the index tree;
propagating a given cluster key value from a first cluster to a second cluster; and
at the second cluster, updating the index tree based on the given cluster key value propagated from the first cluster, wherein the updating includes re-computing at least one key value;
responsive to a given occurrence, updating the index tree by re-computing at least one key value; and
responsive to a search request for the given object;
performing a membership test on at least one key in the index tree to identify which of the multiple clusters may have the given object; and
using a sorted structure to locate the given object within a given cluster.
24 Assignments
0 Petitions
Accused Products
Abstract
A data management system or “DMS” provides data services to data sources associated with a set of application host servers. The data management system typically comprises one or more regions, with each region having one or more clusters. A given cluster has one or more nodes that share storage. When providing continuous data protection and data distribution, the DMS nodes create distributed object storage to provide the necessary real-time data management services. The objects created by the DMS nodes are so-called active objects. The distributed object store can be built above raw storage devices, a traditional file system, a special purpose file system, a clustered file system, a database, and so on. According to the present invention, the DMS active object store provides an indexing service to the active objects. In an illustrative embodiment, any object property that has a given attribute is indexed and, as a result, the attribute becomes searchable. The DMS provides hierarchical distributed indexing using index trees to facilitate searching in a highly efficient manner.
291 Citations
11 Claims
-
1. A data structuring method operative in a data management system organized into one two or more physically-dispersed regions, with each region comprising one or more clusters, and wherein a given cluster includes one or more nodes and a shared storage, and wherein the nodes receive data streams continuously and store such data streams in an object-oriented data store, comprising:
-
for a defined object property, generating maintaining an index tree for use in locating determining where a given object in the data management system is located, the index tree comprising a root, one or more levels of joins, and a plurality of leaves, wherein each leaf is associated with a sorted structure, a join above one or more leaves aggregates leaves that are in a given cluster, a join on a next level up in the index tree aggregates the joins of multiple clusters that belong to a given region, and a join on a next level up in the index tree aggregates the joins of multiple regions that belong to a given universe at the root of the index tree; associating a key and a key value with each sorted structure in each leaf and with each join in the index tree; wherein the key is a hash key that is generated by applying a Bloom Filter to at least a current key of a given sorted structure; responsive to the modification of the given sorted structure; re-computing the key associated with the given sorted structure; re-computing the keys of one or more joins in the index tree; propagating a given cluster key value from a first cluster to a second cluster; and at the second cluster, updating the index tree based on the given cluster key value propagated from the first cluster, wherein the updating includes re-computing at least one key value; responsive to a given occurrence, updating the index tree by re-computing at least one key value; and responsive to a search request for the given object; performing a membership test on at least one key in the index tree to identify which of the multiple clusters may have the given object; and using a sorted structure to locate the given object within a given cluster. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
-
Specification