Dedicated client-side signature generator in a networked storage system
Abstract
A storage system according to certain embodiments includes a client-side signature repository that includes information representative of a set of data blocks stored in primary storage. During storage operations of a client, the system can generate signatures corresponding to data blocks that are being stored in primary storage. The system can store the generated signatures in the client-side signature repository along with information regarding the location of the corresponding data block within primary storage. As additional instances of the data block are stored in primary storage, the system can store the location of the additional instances in the client-side signature repository.
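The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the fixed block size, the use of SHA-256 as the signature function, and the names `ClientSideSignatureRepository`, `record_store`, and `locations` are all hypothetical and do not come from the source.

```python
import hashlib
from collections import defaultdict

BLOCK_SIZE = 4  # hypothetical block size, kept tiny so the example is easy to follow


def block_signature(block: bytes) -> str:
    """Return a signature for one data block (SHA-256 is an assumption)."""
    return hashlib.sha256(block).hexdigest()


class ClientSideSignatureRepository:
    """Maps each signature to the primary-storage locations of matching blocks."""

    def __init__(self) -> None:
        # signature -> list of (path, byte offset) locations
        self._locations = defaultdict(list)

    def record_store(self, path: str, data: bytes) -> None:
        """Sign each block of data being stored and record where it lives."""
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            self._locations[block_signature(block)].append((path, offset))

    def locations(self, block: bytes) -> list:
        """Return every recorded location of a block with the same signature."""
        return list(self._locations.get(block_signature(block), []))


repo = ClientSideSignatureRepository()
repo.record_store("/data/a.bin", b"AAAABBBB")
repo.record_store("/data/b.bin", b"CCCCAAAA")
print(repo.locations(b"AAAA"))  # [('/data/a.bin', 0), ('/data/b.bin', 4)]
```

When a second instance of the same block is stored, the sketch appends its location to the existing signature's list rather than creating a new signature record, which mirrors the last sentence of the abstract.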
21 Claims
1. A method of maintaining a signature repository accessible by multiple client computing devices in a data storage system, the method comprising:
tracking storage of a plurality of data units in a primary storage subsystem, the plurality of tracked data units corresponding to primary data generated by one or more applications executing on a plurality of client computing devices that form the primary storage subsystem, each data unit of the plurality of tracked data units forming at least a portion of at least one file stored in the primary storage subsystem, the primary data for each of the client computing devices stored in a primary data store associated with a respective client computing device, the primary storage subsystem in communication with a secondary storage subsystem that is separate from the primary storage subsystem and is configured to maintain secondary copies of at least some of the primary data;
generating, by a signature agent executing on one or more processors in the primary storage subsystem, signatures corresponding to the plurality of tracked data units; and
maintaining a signature repository including a signature block for at least each unique signature of the generated signatures, where each signature block comprises:
the unique signature; and
one or more data unit entries, each entry corresponding to a distinct data unit of the plurality of tracked data units and associated with the unique signature that is stored in the primary storage subsystem and that is generated by an application of the applications executing on a distinct client computing device, wherein each entry identifies a client computing device of the plurality of client computing devices that stores the corresponding distinct data unit, wherein at least one of the signature blocks includes at least a first entry indicating that a first primary data store associated with a first client computing device of the plurality of client computing devices stores a first data unit and that is associated with the unique signature and a second entry indicating that a second primary data store associated with a second client computing device of the plurality of client computing devices stores a second data unit that is associated with the unique signature, wherein the first data unit forms at least a portion of a first file stored in the first primary data store and the second data unit forms at least a portion of a second file stored in the second primary data store.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
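As an illustration only, the signature block recited in claim 1 could be modeled along the following lines: one block per unique signature, holding entries that each identify the client computing device storing a matching data unit. The class names, field names, client identifiers, and the choice of SHA-256 are assumptions made for this sketch and are not taken from the claims.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataUnitEntry:
    """One stored data unit: which client holds it and where (hypothetical fields)."""
    client_id: str
    file_path: str
    offset: int


@dataclass
class SignatureBlock:
    """A unique signature plus the entries for every data unit that matches it."""
    signature: str
    entries: list = field(default_factory=list)

    def clients(self) -> set:
        """Identifiers of all clients that store a matching data unit."""
        return {entry.client_id for entry in self.entries}


class SignatureRepository:
    """Repository shared by multiple clients, keyed by signature."""

    def __init__(self) -> None:
        self.blocks = {}

    def track(self, client_id: str, file_path: str, offset: int, data_unit: bytes) -> SignatureBlock:
        """Generate a signature for a tracked data unit and add an entry for it."""
        sig = hashlib.sha256(data_unit).hexdigest()  # SHA-256 is an assumption
        block = self.blocks.setdefault(sig, SignatureBlock(sig))
        block.entries.append(DataUnitEntry(client_id, file_path, offset))
        return block

    def shared_blocks(self) -> list:
        """Signature blocks whose entries span more than one client."""
        return [b for b in self.blocks.values() if len(b.clients()) > 1]


shared_repo = SignatureRepository()
shared_repo.track("client-1", "report.docx", 0, b"common header")
shared_repo.track("client-2", "notes.txt", 128, b"common header")
shared_repo.track("client-2", "notes.txt", 256, b"only on client 2")
shared = shared_repo.shared_blocks()
print(len(shared), sorted(shared[0].clients()))  # 1 ['client-1', 'client-2']
```

The block returned by `shared_blocks()` corresponds to the claimed case in which a first entry and a second entry, from two different clients and two different files, are associated with the same unique signature.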
10. A storage system, comprising:
a signature repository agent executing on one or more processors in a primary storage subsystem, the primary storage subsystem comprising:
a plurality of client computing devices; and
a plurality of data agents executing on the plurality of client computing devices, the plurality of data agents configured to track storage of a plurality of data units in the primary storage subsystem, the plurality of data units corresponding to primary data generated by one or more applications executing on the plurality of client computing devices, each data unit forming at least a portion of at least one file stored in the primary storage subsystem, the primary data for each of the client computing devices stored in a primary data store associated with a respective client computing device, and the primary storage subsystem in communication with a secondary storage subsystem that is separate from the primary storage subsystem and that is configured to maintain secondary copies of at least some of the primary data, and
wherein the signature repository agent is configured to maintain a signature repository including a signature block for at least each unique signature generated by one or more signature agents, each signature block comprising:
the unique signature; and
one or more data unit entries, each entry corresponding to a distinct data unit associated with the unique signature that is stored in the primary storage subsystem and that is generated by an application of the applications executing on a distinct client computing device, wherein each entry identifies a client computing device of the plurality of client computing devices that stores the corresponding distinct data unit, wherein at least one of the signature blocks includes at least a first entry indicating that a first primary data store associated with a first client computing device of the plurality of client computing devices stores a first data unit that is associated with the unique signature and a second entry indicating that a second primary data store associated with a second client computing device of the plurality of client computing devices stores a second data unit that is associated with the unique signature, wherein the first data unit forms at least a portion of a first file stored in the first primary data store and the second data unit forms at least a portion of a second file stored in the second primary data store.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
21. A computer-readable, non-transitory storage medium having one or more computer-executable modules for maintaining a signature repository accessible by multiple client computing devices in a data storage system, the one or more computer-executable modules comprising:
a first module in communication with a plurality of client computing devices that form a primary storage subsystem, the primary storage subsystem comprising:
the plurality of client computing devices; and
a plurality of data agents executing on the plurality of client computing devices, the plurality of data agents configured to track storage of a plurality of data units in the primary storage subsystem, the plurality of data units corresponding to primary data generated by one or more applications executing on the plurality of client computing devices, each data unit forming at least a portion of at least one file stored in the primary storage subsystem, the primary data for each of the client computing devices stored in a data store associated with a respective client computing device, and the primary storage subsystem in communication with a secondary storage subsystem that is separate from the primary storage subsystem and is configured to maintain secondary copies of at least some of the primary data,
wherein the first module is configured to maintain a signature repository including a signature block for at least each unique signature associated with the plurality of data units, where each signature block comprises:
the unique signature; and
one or more data unit entries, each entry corresponding to a distinct data unit associated with the unique signature that is stored in the primary storage subsystem and that is generated by an application of the applications executing on a distinct client computing device, wherein each entry identifies a client computing device of the plurality of client computing devices that stores the corresponding distinct data unit, wherein at least one of the signature blocks includes at least a first entry indicating that a first primary data store associated with a first client computing device of the plurality of client computing devices stores a first data unit that is associated with the unique signature and a second entry indicating that a second primary data store associated with a second client computing device of the plurality of client computing devices stores a second data unit that is associated with the unique signature.
Specification