Hashing objects into multiple directories for better concurrency and manageability
First Claim
1. A method of providing increased concurrency among information transfer operations performed by one or more of a plurality of executable applications operating in an object-based data storage system, said method comprising:
- forming an index object that points to a plurality of component objects, wherein said index object is concurrently accessed by one or more of said plurality of executable applications, wherein each component object contains a portion of information managed by said index object, and wherein forming said index object includes;
identifying a directory object in said data storage system that requires said increased concurrency among said information transfer operations performed thereon, wherein a content of said directory object constitutes a first plurality of entries,
dividing said content of said directory object into said plurality of component objects, wherein said first plurality of entries is divided among said plurality of component objects with each component object storing a respective non-overlapping portion of said first plurality of entries, and
creating said index object containing a second plurality of entries, wherein each of said second plurality of entries points to a different one of said plurality of component objects and identifies said component object pointed to;
using a mapping function per-access basis to determine which of said plurality of component objects is to be accessed by a corresponding one of said plurality of executable applications; and
configuring each of said plurality of executable applications to access on per-access basis only that component object which is determined using said mapping function for respective information transfer operations;
wherein each file manager in said data storage system manages a portion of said information transfer operations on a corresponding component object without coordination with other file managers in said data storage system during each access to said index object.
Abstract
A data storage methodology wherein a hashing algorithm is applied to break a directory object experiencing frequent concurrent accesses from a number of client or manager applications into a predetermined number of hash component objects and a hash master object that manages the component objects. The hash master object and the hash components, together, constitute a hash directory, which replaces the original non-hashed directory object. Each hash component object contains a portion of the entries contained in the original directory object. Each hash component is managed by only one file manager. The entries in the original directory object are distributed among the hash component objects using a predefined hashing algorithm. The creation of hash components and the hash master allows more than one client application or file manager to concurrently write corresponding hash components without the need for access coordination on each access.
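The arrangement described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class names `HashComponent` and `HashDirectory`, the use of `zlib.crc32` as the hashing algorithm, and the per-component lock standing in for "managed by only one file manager" are all assumptions made for illustration.

```python
import threading
import zlib


class HashComponent:
    """One hash component: holds a disjoint portion of the directory entries."""

    def __init__(self, number):
        self.number = number
        self.entries = {}
        # Assumption: a private lock models the single file manager that
        # owns this component; no lock is shared between components.
        self.lock = threading.Lock()


class HashDirectory:
    """Hash master plus its components, replacing a non-hashed directory."""

    def __init__(self, entries, num_components):
        # The number of components is fixed up front ("predetermined").
        self.components = [HashComponent(i) for i in range(num_components)]
        for name, obj_id in entries.items():
            self._component_for(name).entries[name] = obj_id

    def _component_for(self, name):
        # Assumption: CRC-32 of the entry name, reduced modulo the
        # component count, plays the role of the predefined hashing algorithm.
        index = zlib.crc32(name.encode("utf-8")) % len(self.components)
        return self.components[index]

    def add(self, name, obj_id):
        comp = self._component_for(name)
        with comp.lock:  # only the target component is locked
            comp.entries[name] = obj_id

    def lookup(self, name):
        comp = self._component_for(name)
        with comp.lock:
            return comp.entries.get(name)
```

Because each writer locks only the component that its entry name hashes to, writers whose names land in different components do not wait on one another, which is the effect the abstract attributes to the hash directory.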
131 Citations
12 Claims
1. A method of providing increased concurrency among information transfer operations performed by one or more of a plurality of executable applications operating in an object-based data storage system, said method comprising:
forming an index object that points to a plurality of component objects, wherein said index object is concurrently accessed by one or more of said plurality of executable applications, wherein each component object contains a portion of information managed by said index object, and wherein forming said index object includes; identifying a directory object in said data storage system that requires said increased concurrency among said information transfer operations performed thereon, wherein a content of said directory object constitutes a first plurality of entries, dividing said content of said directory object into said plurality of component objects, wherein said first plurality of entries is divided among said plurality of component objects with each component object storing a respective non-overlapping portion of said first plurality of entries, and creating said index object containing a second plurality of entries, wherein each of said second plurality of entries points to a different one of said plurality of component objects and identifies said component object pointed to; using a mapping function per-access basis to determine which of said plurality of component objects is to be accessed by a corresponding one of said plurality of executable applications; and configuring each of said plurality of executable applications to access on per-access basis only that component object which is determined using said mapping function for respective information transfer operations; wherein each file manager in said data storage system manages a portion of said information transfer operations on a corresponding component object without coordination with other file managers in said data storage system during each access to said index object. - View Dependent Claims (2, 3, 4, 5, 6)
7. A method of providing increased concurrency among information transfer operations performed by one or more of a plurality of executable applications operating in an object-based data storage system, said method comprising:
forming an index object that points to a plurality of component objects, wherein said index object is concurrently accessed by one or more of said plurality of executable applications, wherein each component object contains a portion of information managed by said index object, and wherein forming said index object includes; identifying a directory object in said data storage system that requires said increased concurrency among said information transfer operations performed thereon, wherein a content of said directory object constitutes a first plurality of entries, dividing said content of said directory object into said plurality of component objects, wherein said first plurality of entries is divided among said plurality of component objects with each component object storing a respective non-overlapping portion of said first plurality of entries, and creating said index object containing a second plurality of entries, wherein each of said second plurality of entries points to a different one of said plurality of component objects and identifies said component object pointed to; using a mapping function per-access basis to determine which of said plurality of component objects is to be accessed by a corresponding one of said plurality of executable applications; and configuring each of said plurality of executable applications to access on per-access basis only that component object which is determined using said mapping function for respective information transfer operations; wherein an object corresponding to one of said first plurality of entries contains the following attribute values; a first attribute value identifying an entity in said data storage system responsible for storing said object; a second attribute value identifying an object group containing said object; and a third attribute value identifying said object.
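The three attribute values recited at the end of claim 7 can be pictured as a small record. The sketch below is illustrative only; the type name `ObjectAttributes`, the field names, and the sample values are assumptions and do not come from the patent.

```python
from typing import NamedTuple


class ObjectAttributes(NamedTuple):
    """Hypothetical record for the three attribute values in claim 7."""

    device_id: int   # entity responsible for storing the object
    group_id: int    # object group containing the object
    object_id: int   # the object itself


# A directory entry could then map a name to such a record (sample values).
directory_entries = {
    "report.txt": ObjectAttributes(device_id=3, group_id=17, object_id=90210),
}
```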
8. A method of providing increased concurrency among information transfer operations performed by one or more of a plurality of executable applications operating in an object-based data storage system, said method comprising:
forming an index object that points to a plurality of component objects, wherein said index object is concurrently accessed by one or more of said plurality of executable applications, wherein each component object contains a portion of information managed by said index object, and wherein forming said index object includes; identifying a directory object in said data storage system that requires said increased concurrency among said information transfer operations performed thereon, wherein a content of said directory object constitutes a first plurality of entries, dividing said content of said directory object into said plurality of component objects, wherein said first plurality of entries is divided among said plurality of component objects with each component object storing a respective non-overlapping portion of said first plurality of entries, and creating said index object containing a second plurality of entries, wherein each of said second plurality of entries points to a different one of said plurality of component objects and identifies said component object pointed to; using a mapping function per-access basis to determine which of said plurality of component objects is to be accessed by a corresponding one of said plurality of executable applications; and configuring each of said plurality of executable applications to access on per-access basis only that component object which is determined using said mapping function for respective information transfer operations; wherein dividing said content of said directory object includes; assigning a numerical identifier to each of said plurality of component objects; identifying a unique portion of a file path name for each corresponding object in said first plurality of entries; applying said mapping function to each said unique portion, thereby generating a corresponding integer value for each said unique portion; computing a modulus of each said corresponding integer value over the total number of said component objects in said plurality of component objects, thereby generating a corresponding storage integer whose value is less than or equal to said total number; and storing each of said first plurality of entries into that one of said plurality of component objects whose numerical identifier is equal to said corresponding storage integer for said entry to be stored.
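The dividing steps recited in claim 8 can be traced in a short sketch. It makes several assumptions that the claim does not state: the mapping function is taken to be CRC-32 (`zlib.crc32`), the "unique portion" of a path name is taken to be its final component, and the numerical identifiers are zero-based, so the storage integer falls in the range 0 to total minus 1. The function names are hypothetical.

```python
import zlib
from posixpath import basename


def mapping_function(unique_portion: str) -> int:
    # Assumption: CRC-32 stands in for the claimed mapping function.
    return zlib.crc32(unique_portion.encode("utf-8"))


def divide_directory(entries: dict, total: int) -> list:
    """Distribute directory entries among `total` component objects."""
    # Step 1: assign numerical identifiers 0..total-1 (the list positions).
    components = [dict() for _ in range(total)]
    for path, obj in entries.items():
        # Step 2: identify the unique portion of the path name
        # (assumed here to be the last path component).
        unique_portion = basename(path)
        # Step 3: apply the mapping function to get an integer value.
        integer_value = mapping_function(unique_portion)
        # Step 4: take the modulus over the total number of components.
        storage_integer = integer_value % total
        # Step 5: store the entry in the component whose identifier
        # equals the storage integer.
        components[storage_integer][path] = obj
    return components
```

Because the same mapping function and modulus are applied on every access, a later lookup of a given name is directed to the same component that received the entry when the directory was divided.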
9. A method of providing increased concurrency among information transfer operations performed by one or more of a plurality of executable applications operating in an object-based data storage system, said method comprising:
forming an index object that points to a plurality of component objects, wherein said index object is concurrently accessed by one or more of said plurality of executable applications, wherein each component object contains a portion of information managed by said index object, and wherein forming said index object includes; identifying a directory object in said data storage system that requires said increased concurrency among said information transfer operations performed thereon, wherein a content of said directory object constitutes a first plurality of entries, dividing said content of said directory object into said plurality of component objects, wherein said first plurality of entries is divided among said plurality of component objects with each component object storing a respective non-overlapping portion of said first plurality of entries, and creating said index object containing a second plurality of entries, wherein each of said second plurality of entries points to a different one of said plurality of component objects and identifies said component object pointed to; using a mapping function per-access basis to determine which of said plurality of component objects is to be accessed by a corresponding one of said plurality of executable applications; and configuring each of said plurality of executable applications to access on per-access basis only that component object which is determined using said mapping function for respective information transfer operations; wherein each of said plurality of component objects includes at least one of the following; an indication distinguishing said component object from said index object; a first information identifying said mapping function; a second information identifying a number assigned to said component object; and a third information identifying the total number of component objects in said plurality of component objects.
10. A method of providing increased concurrency among information transfer operations performed by one or more of a plurality of executable applications operating in an object-based data storage system, said method comprising:
forming an index object that points to a plurality of component objects, wherein said index object is concurrently accessed by one or more of said plurality of executable applications, wherein each component object contains a portion of information managed by said index object, and wherein forming said index object includes; identifying a directory object in said data storage system that requires said increased concurrency among said information transfer operations performed thereon, wherein a content of said directory object constitutes a first plurality of entries, dividing said content of said directory object into said plurality of component objects, wherein said first plurality of entries is divided among said plurality of component objects with each component object storing a respective non-overlapping portion of said first plurality of entries, and creating said index object containing a second plurality of entries, wherein each of said second plurality of entries points to a different one of said plurality of component objects and identifies said component object pointed to; using a mapping function per-access basis to determine which of said plurality of component objects is to be accessed by a corresponding one of said plurality of executable applications; and configuring each of said plurality of executable applications to access on per-access basis only that component object which is determined using said mapping function for respective information transfer operations; wherein said index object includes at least one of the following; an indication distinguishing said index object from each of said plurality of component objects; a first information identifying the total number of component objects in said plurality of component objects; a second information identifying an encoding scheme for said mapping function; and an access control list (ACL) identifying a group of principals in said data storage system authorized to access said plurality of component objects through said index object.
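The information items that claims 9 and 10 list for the component objects and the index object can be laid out as two record types. This is only a sketch under stated assumptions: the class names, field names, the string `"crc32"` as the identifier of the mapping function and its encoding scheme, and the helper `make_hash_directory` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentObject:
    is_component: bool        # indication distinguishing it from the index object
    mapping_function: str     # identifies the mapping function
    number: int               # number assigned to this component
    total_components: int     # total number of component objects
    entries: dict = field(default_factory=dict)


@dataclass
class IndexObject:
    is_index: bool            # indication distinguishing it from the components
    total_components: int     # total number of component objects
    encoding_scheme: str      # encoding scheme for the mapping function
    acl: frozenset            # principals allowed to reach the components
    component_numbers: list   # one entry per component pointed to

    def may_access(self, principal: str) -> bool:
        # Access to the components is checked against the index object's ACL.
        return principal in self.acl


def make_hash_directory(total: int, acl):
    """Build an index object and its components with consistent metadata."""
    components = [ComponentObject(True, "crc32", n, total) for n in range(total)]
    index = IndexObject(True, total, "crc32", frozenset(acl),
                        [c.number for c in components])
    return index, components
```

Keeping the total count and the mapping-function identifier in every component, as well as in the index object, means either kind of object carries enough information to recompute which component a given name belongs to.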
11. A computer-readable storage medium containing a program code, which, upon execution by a processor in an object-based distributed data storage system, causes said processor to perform the following:
form an index object that points to a plurality of component objects, wherein said index object is configured to be concurrently read by one or more of a plurality of executable applications operating in said data storage system, and wherein each component object contains a portion of information managed by said index object; use a mapping function to determine which of said plurality of component objects is to be accessed by a corresponding one of said plurality of executable applications; and configure each of said plurality of executable applications to access only that component object which is determined using said mapping function; configure each file manager in said data storage system to manage a portion of information transfer operations on a corresponding component object without coordination with other file managers in said data storage system during each access to said index object.
12. An object-based data storage system providing increased concurrency among information transfer operations performed by one or more of a plurality of executable applications operating in said data storage system, said data storage system comprising:
means for forming an index object that points to a plurality of component objects, wherein said index object is concurrently read by one or more of said plurality of executable applications, and wherein each component object contains a portion of information managed by said index object; means for using a mapping function to determine which of said plurality of component objects is to be accessed by a corresponding one of said plurality of executable applications; and means for configuring each of said plurality of executable applications to access only that component object which is determined using said mapping function for respective information transfer operations; means for configuring each file manager in said data storage system to manage a portion of information transfer operations on a corresponding component object without coordination with other file managers in said data storage system during each access to said index object.
Specification