Fine-grained shared multi-tenant de-duplication system
First Claim
1. A method, comprising:
detecting a data stream that is generated in part by one or more applications and is specific to a first user and that includes a first user object having one or more data chunks;
salting data associated with the first user, wherein salting data comprises:
associating a user-specific salt with the data chunks of the first user object to form user-specific combinations of the form ((user-specific salt)+(first user data chunk)); and
hashing the ((user-specific salt)+(first user data chunk)) combinations to form first user-specific chunk hashes;
storing the user-specific chunk hashes in a container;
reducing an amount of user-specific data to be stored by performing user-level deduplication in a shared multi-tenant deduplication system to eliminate duplicate user-specific data, wherein performing user-level deduplication comprises de-duplicating only data associated with the first user by performing the following:
comparing a first user-specific chunk hash with a second user-specific chunk hash that is associated with the same user as the first user-specific chunk hash;
when the second user-specific chunk hash is the same as the first user-specific chunk hash, discarding the second user-specific chunk hash;
based on results of the deduplication process, determining a storage capacity consumed by objects of the first user; and
storing the user-specific data remaining after deduplication has been performed.
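The steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the fixed-size chunking, the SHA-256 hash, the dictionary used as the container, and all names (`CHUNK_SIZE`, `salted_hash`, `dedup_user_object`, `alice_salt`) are assumptions made for illustration, since the claim does not specify a chunking scheme or hash function.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical fixed chunk size in bytes, kept tiny for illustration


def chunk(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split a user object into fixed-size chunks (an assumed chunking scheme)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def salted_hash(salt: bytes, piece: bytes) -> str:
    """Hash the (user-specific salt)+(data chunk) combination; SHA-256 is an assumption."""
    return hashlib.sha256(salt + piece).hexdigest()


def dedup_user_object(salt: bytes, obj: bytes, container: dict[str, bytes]) -> int:
    """Keep only chunks whose salted hash is not already in the container.

    Returns the number of bytes newly stored for this user.
    """
    stored = 0
    for piece in chunk(obj):
        digest = salted_hash(salt, piece)
        if digest in container:
            continue  # same user, same chunk hash: discard the duplicate
        container[digest] = piece
        stored += len(piece)
    return stored


container: dict[str, bytes] = {}
alice_salt = b"alice-salt"  # hypothetical user-specific salt
consumed = dedup_user_object(alice_salt, b"AAAABBBBAAAA", container)
```

In this run the object splits into three chunks, two of which are identical, so only two chunks are kept and `consumed` reflects the bytes that remain after deduplication.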
Abstract
In one example, a method for managing data includes detecting a data stream that is specific to a first user and that includes one or more user objects each having one or more data chunks. Next, the data associated with the first user is salted by associating a user-specific salt with the data chunks of the one or more user objects to form user-specific combinations of the form ((user-specific salt)+(user data chunk)). Finally, an amount of storage capacity consumed by the one or more user objects is determined.
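One way to see why salting supports per-user capacity accounting is the sketch below. It assumes a single shared container keyed by salted hash and a per-user byte counter; the names `ingest`, `usage`, and the salts are hypothetical and not taken from the patent. Because each user's salt changes the hash, an identical chunk uploaded by two users produces two distinct entries, so every stored chunk is attributable to exactly one user.

```python
import hashlib
from collections import defaultdict

container: dict[str, bytes] = {}            # shared store: salted hash -> chunk
usage: defaultdict[str, int] = defaultdict(int)  # bytes attributed to each user


def salted_hash(salt: bytes, piece: bytes) -> str:
    """Hash of (user-specific salt)+(user data chunk); SHA-256 is an assumption."""
    return hashlib.sha256(salt + piece).hexdigest()


def ingest(user: str, salt: bytes, chunks: list[bytes]) -> None:
    """Store a user's chunks in the shared container and track bytes consumed."""
    for piece in chunks:
        digest = salted_hash(salt, piece)
        if digest not in container:
            container[digest] = piece
            usage[user] += len(piece)


ingest("alice", b"salt-a", [b"report", b"report", b"notes"])
ingest("bob", b"salt-b", [b"report"])
```

Here the repeated `report` chunk is stored once for the first user, and stored again for the second user under a different hash, so the two users' consumption figures stay independent.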
17 Claims
1. A method, comprising:
detecting a data stream that is generated in part by one or more applications and is specific to a first user and that includes a first user object having one or more data chunks;
salting data associated with the first user, wherein salting data comprises:
associating a user-specific salt with the data chunks of the first user object to form user-specific combinations of the form ((user-specific salt)+(first user data chunk)); and
hashing the ((user-specific salt)+(first user data chunk)) combinations to form first user-specific chunk hashes;
storing the user-specific chunk hashes in a container;
reducing an amount of user-specific data to be stored by performing user-level deduplication in a shared multi-tenant deduplication system to eliminate duplicate user-specific data, wherein performing user-level deduplication comprises de-duplicating only data associated with the first user by performing the following:
comparing a first user-specific chunk hash with a second user-specific chunk hash that is associated with the same user as the first user-specific chunk hash;
when the second user-specific chunk hash is the same as the first user-specific chunk hash, discarding the second user-specific chunk hash;
based on results of the deduplication process, determining a storage capacity consumed by objects of the first user; and
storing the user-specific data remaining after deduplication has been performed.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9

10. A non-transitory storage medium having stored therein computer-executable instructions which, when executed by one or more hardware processors, perform the following operations:
detecting a data stream that is generated by one or more applications and is specific to a first user and that includes a first user object having one or more data chunks;
salting data associated with the first user, wherein salting data comprises:
associating a user-specific salt with the data chunks of the first user object to form user-specific combinations of the form ((user-specific salt)+(first user data chunk)); and
hashing the ((user-specific salt)+(first user data chunk)) combinations to form first user-specific chunk hashes;
storing the user-specific chunk hashes in a container;
reducing an amount of user-specific data to be stored by performing user-level deduplication in a shared multi-tenant deduplication system to eliminate duplicate user-specific data, wherein performing user-level deduplication comprises de-duplicating only data associated with the first user by performing the following:
comparing a first user-specific chunk hash with a second user-specific chunk hash that is associated with the same user as the first user-specific chunk hash;
when the second user-specific chunk hash is the same as the first user-specific chunk hash, discarding the second user-specific chunk hash;
based on results of the deduplication process, determining a storage capacity consumed by objects of the first user; and
storing the user-specific data remaining after deduplication has been performed.

11. A non-transitory storage medium having stored therein computer-executable instructions which, when executed by one or more hardware processors, perform the following operations:
detecting a data stream that is generated in part by one or more applications and is specific to a first user and that includes one or more user objects each having one or more data chunks;
salting data associated with the first user, wherein salting data comprises associating a user-specific salt with the data chunks of the one or more user objects to form user-specific combinations of the form ((user-specific salt)+(user data chunk));
using the salted data as a basis for accurately determining, on a user basis, an amount of storage capacity consumed by the one or more user objects; and
using information about the consumed storage capacity to facilitate performance of another process relating to the consumption of data storage.
Dependent claims: 12, 13, 14, 15, 16, 17
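The claim leaves open which further process uses the consumed-capacity information. As one assumed example only, the sketch below applies a per-user quota check to capacity figures that are taken as already computed; the function name `over_quota`, the quota value, and the sample figures are hypothetical and do not come from the patent.

```python
def over_quota(usage: dict[str, int], quota_bytes: int) -> list[str]:
    """Return the users whose post-deduplication consumption exceeds the quota."""
    return sorted(user for user, used in usage.items() if used > quota_bytes)


# Hypothetical per-user consumption in bytes, as measured after deduplication.
usage = {"alice": 11, "bob": 6}
flagged = over_quota(usage, quota_bytes=10)
```

Other downstream processes, such as billing or capacity planning, could consume the same per-user figures in a similar way.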
Specification