Systems and methods for providing increased scalability in deduplication storage systems
Abstract
A computer-implemented method for providing increased scalability in deduplication storage systems may include (1) identifying a database that stores a plurality of reference objects, (2) determining that at least one size-related characteristic of the database has reached a predetermined threshold, (3) partitioning the database into a plurality of sub-databases capable of being updated independent of one another, (4) identifying a request to perform an update operation that updates one or more reference objects stored within at least one sub-database, and then (5) performing the update operation on less than all of the sub-databases to avoid processing costs associated with performing the update operation on all of the sub-databases. Various other systems, methods, and computer-readable media are also disclosed.
20 Claims
1. A computer-implemented method for providing increased scalability in deduplication storage systems, at least a portion of the method being performed by a computing device comprising at least one processor, the method comprising:
- identifying a database that stores a plurality of reference objects, wherein each reference object within the database comprises a reference list that identifies one or more backed-up files that currently reference a particular unique file segment stored in a deduplication storage system;
- determining that the size of the entire database as a whole has reached a predetermined threshold;
- in response to determining that the size of the entire database as a whole has reached the predetermined threshold:
- partitioning the database into a plurality of sub-databases capable of being updated independent of one another, the plurality of sub-databases comprising an inactive sub-database that is empty after the time of the partition;
- designating the inactive sub-database within the plurality of sub-databases as an active sub-database for storing reference objects created after the time of the designation;
- identifying a request to perform an update operation that updates one or more reference objects stored within the active sub-database;
- performing the update operation only on the active sub-database to avoid processing costs associated with performing the update operation on all of the sub-databases within the plurality of sub-databases, wherein the update operation comprises adding a reference that identifies a particular backed-up file to one or more reference lists stored within the active sub-database.
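By way of illustration only, the steps recited in claim 1 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the names `SubDatabase`, `ReferenceDatabase`, `threshold_step`, and `add_reference` are hypothetical, database size is assumed to be the total number of stored references, partitioning is simplified to freezing the existing contents and appending one empty sub-database, and the threshold is assumed to advance by a fixed step after each partition.

```python
from dataclasses import dataclass, field


@dataclass
class SubDatabase:
    """One independently updatable partition (hypothetical structure)."""
    # Maps a unique file segment to the set of backed-up files referencing it.
    reference_lists: dict[str, set[str]] = field(default_factory=dict)
    # Counts update operations applied to this partition.
    update_count: int = 0

    def size(self) -> int:
        # Assumption: size is measured as the number of stored references.
        return sum(len(refs) for refs in self.reference_lists.values())


class ReferenceDatabase:
    """Reference database that partitions itself once a size threshold is reached."""

    def __init__(self, threshold_step: int) -> None:
        self.threshold_step = threshold_step
        self.next_threshold = threshold_step
        self.sub_databases = [SubDatabase()]
        self.active = self.sub_databases[0]

    def total_size(self) -> int:
        # Size of the entire database as a whole, across all sub-databases.
        return sum(sub.size() for sub in self.sub_databases)

    def add_reference(self, segment: str, backed_up_file: str) -> None:
        # The update operation touches only the active sub-database.
        refs = self.active.reference_lists.setdefault(segment, set())
        refs.add(backed_up_file)
        self.active.update_count += 1
        self._partition_if_threshold_reached()

    def _partition_if_threshold_reached(self) -> None:
        if self.total_size() < self.next_threshold:
            return
        # Simplified partition: existing sub-databases become inactive and a
        # new, empty sub-database is designated as the active one.
        empty = SubDatabase()
        self.sub_databases.append(empty)
        self.active = empty
        # Assumption: the threshold advances so later growth triggers again.
        self.next_threshold += self.threshold_step
```

In this sketch, references added after a partition land only in the newly designated active sub-database, so earlier sub-databases are never rewritten by later update operations.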
11. A system for providing increased scalability in deduplication storage systems, the system comprising:
- an identification module, stored in memory, that identifies a database that stores a plurality of reference objects, wherein each reference object within the database comprises a reference list that identifies one or more backed-up files that currently reference a particular unique file segment stored in a deduplication storage system;
- a partitioning module, stored in memory, that:
  - determines that the size of the entire database as a whole has reached a predetermined threshold;
  - in response to determining that the size of the entire database as a whole has reached the predetermined threshold:
  - partitions the database into a plurality of sub-databases capable of being updated independent of one another, the plurality of sub-databases comprising an inactive sub-database that is empty after the time of the partition;
  - designates the inactive sub-database within the plurality of sub-databases as an active sub-database for storing reference objects created after the time of the designation;
- an update module, stored in memory, that:
  - identifies a request to perform an update operation that updates one or more reference objects stored within the active sub-database;
  - performs the update operation only on the active sub-database to avoid processing costs associated with performing the update operation on all of the sub-databases within the plurality of sub-databases, wherein the update operation comprises adding a reference that identifies a particular backed-up file to one or more reference lists stored within the active sub-database;
- at least one physical processor that executes the identification module, the partitioning module, and the update module.
20. A non-transitory computer-readable-storage medium comprising one or more computer-executable instructions that, when executed by at least one processor of a computing device, cause the computing device to:
- identify a database that stores a plurality of reference objects, wherein each reference object within the database comprises a reference list that identifies one or more backed-up files that currently reference a particular unique file segment stored in a deduplication storage system;
- determine that the size of the entire database as a whole has reached a predetermined threshold;
- in response to determining that the size of the entire database as a whole has reached the predetermined threshold:
- partition the database into a plurality of sub-databases capable of being updated independent of one another, the plurality of sub-databases comprising an inactive sub-database that is empty after the time of the partition;
- designate the inactive sub-database within the plurality of sub-databases as an active sub-database for storing reference objects created after the time of the designation;
- identify a request to perform an update operation that updates one or more reference objects stored within the active sub-database;
- perform the update operation only on the active sub-database to avoid processing costs associated with performing the update operation on all of the sub-databases within the plurality of sub-databases, wherein the update operation comprises adding a reference that identifies a particular backed-up file to one or more reference lists stored within the active sub-database.
Specification