COMMONALITY FACTORING FOR REMOVABLE MEDIA
First Claim
1. An apparatus for storing data with a removable medium, the apparatus comprising:
a chunk module configured to receive an original data stream and break the original data stream into chunks;
a hash module coupled to the chunk module and configured to calculate an identifier for each chunk and to store the identifiers; and
a search module coupled to the removable medium, wherein the search module is configured to:
determine, based on the identifiers, whether each chunk is unique, and
store on the removable medium, at least two of a following:
the unique chunks,
descriptors describing each of the unique chunks, and
references indicating a sequence of the unique chunks usable to reconstruct the original data stream,
wherein the removable medium comprises a drive cartridge.
3 Assignments
0 Petitions
Abstract
Systems and methods for commonality factoring for storing data on removable storage media are described. The systems and methods allow for highly compressed data, e.g., data compressed using archiving or backup methods including de-duplication, to be stored in an efficient manner on portable memory devices such as removable storage cartridges. The methods include breaking data, e.g., data files for backup, into unique chunks and calculating identifiers, e.g., hash identifiers, based on the unique chunks. Redundant chunks can be identified by calculating identifiers and comparing identifiers of other chunks to the identifiers of unique chunks previously calculated. When a redundant chunk is identified, a reference to the existing unique chunk is generated such that the chunk can be reconstituted in relation to other chunks in order to recreate the original data. The method further includes storing one or more of the unique chunks, the identifiers and/or the references on the removable storage medium.
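The flow described in the abstract can be illustrated with a minimal sketch. It assumes fixed-size chunks and SHA-256 hash identifiers; neither choice is stated in the text above, and the names `factor`, `reconstruct`, and `CHUNK_SIZE` are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical chunk size, kept tiny for illustration


def factor(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split data into chunks and keep only one copy of each distinct chunk."""
    unique_chunks = {}  # identifier -> chunk bytes
    references = []     # identifiers in original stream order
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        # Assumption: SHA-256 stands in for the "hash identifier".
        identifier = hashlib.sha256(chunk).hexdigest()
        if identifier not in unique_chunks:
            unique_chunks[identifier] = chunk  # first sighting: store the chunk
        references.append(identifier)          # always record the reference
    return unique_chunks, references


def reconstruct(unique_chunks, references) -> bytes:
    """Recreate the original data by following the references in order."""
    return b"".join(unique_chunks[identifier] for identifier in references)
```

For the input `b"ABCDABCDEFGH"`, the second `ABCD` chunk is redundant, so only two chunks are stored while three references are recorded.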
187 Citations
19 Claims
1. An apparatus for storing data with a removable medium, the apparatus comprising:
a chunk module configured to receive an original data stream and break the original data stream into chunks;
a hash module coupled to the chunk module and configured to calculate an identifier for each chunk and to store the identifiers; and
a search module coupled to the removable medium, wherein the search module is configured to:
determine, based on the identifiers, whether each chunk is unique, and
store on the removable medium, at least two of a following:
the unique chunks,
descriptors describing each of the unique chunks, and
references indicating a sequence of the unique chunks usable to reconstruct the original data stream,
wherein the removable medium comprises a drive cartridge.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method for storing data with a plurality of removable media, comprising steps of:
breaking an original data stream into chunks;
calculating an identifier for each chunk;
storing the identifiers;
determining, based on the identifiers, whether each chunk is unique; and
storing on the plurality of removable media, at least two of a following:
a stream of the unique chunks,
a stream of descriptors describing each of the unique chunks, and
a stream of references indicating a sequence of the unique chunks usable to reconstruct the original data stream,
wherein:
the stream of references stored in each medium includes a medium identifier, and
the medium identifier allows correlation of the references to the chunks on the different removable media.
Dependent claims: 12, 19
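One way to picture the medium identifier recited in claim 11 is a reference that names both the medium and the chunk's position on it. The sketch below is an illustration only: the `Medium` class, its chunk-count `capacity`, the `(medium_id, slot)` reference layout, and the use of SHA-256 are assumptions, not details taken from the claim.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Medium:
    medium_id: str   # hypothetical label for one removable medium
    capacity: int    # hypothetical limit, counted in chunks
    chunks: list = field(default_factory=list)


def store_across_media(data: bytes, media: list, chunk_size: int = 4):
    """Spread unique chunks over several media; return (medium_id, slot) references."""
    locations = {}   # identifier -> (medium_id, slot) of the stored unique chunk
    references = []
    current = 0      # index of the medium currently being filled
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        identifier = hashlib.sha256(chunk).hexdigest()
        if identifier not in locations:
            # Advance to the next medium once the current one is full.
            while current < len(media) and len(media[current].chunks) >= media[current].capacity:
                current += 1
            if current == len(media):
                raise RuntimeError("no space left on the supplied media")
            medium = media[current]
            medium.chunks.append(chunk)
            locations[identifier] = (medium.medium_id, len(medium.chunks) - 1)
        references.append(locations[identifier])
    return references


def reconstruct(references, media) -> bytes:
    """Use the medium identifier in each reference to find the right medium."""
    by_id = {m.medium_id: m for m in media}
    return b"".join(by_id[medium_id].chunks[slot] for medium_id, slot in references)
```

Because every reference carries the medium identifier, a repeated chunk late in the stream can point back to a copy stored on an earlier medium.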
13. A method for storing data on a removable storage unit, comprising:
breaking original data into chunks;
calculating an identifier for each chunk;
storing the identifiers in a low latency memory;
determining, based on the identifiers, whether each chunk is unique; and
storing on the storage unit at least two of a following:
the unique chunks,
descriptors describing each of the unique chunks, and
references indicating a sequence of the unique chunks used to reconstruct the original data.
Dependent claims: 14, 15, 16, 17, 18
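The split in claim 13 between a low latency memory for identifiers and the storage unit for chunks might look like the following sketch. It assumes an in-RAM dictionary as the low latency memory and an `io.BytesIO` buffer standing in for the removable storage unit; the descriptor layout `(identifier, offset, length)` and the function names are hypothetical.

```python
import hashlib
import io


def store(data: bytes, unit: io.BytesIO, chunk_size: int = 4):
    """Write unique chunks to the unit; keep the identifier index in RAM."""
    index = {}        # low latency memory: identifier -> descriptor number
    descriptors = []  # (identifier, offset on unit, length) per unique chunk
    references = []   # descriptor numbers in original order
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        identifier = hashlib.sha256(chunk).digest()
        if identifier not in index:                # uniqueness check hits RAM only
            offset = unit.seek(0, io.SEEK_END)     # append at the end of the unit
            unit.write(chunk)
            index[identifier] = len(descriptors)
            descriptors.append((identifier, offset, len(chunk)))
        references.append(index[identifier])
    return descriptors, references


def restore(unit: io.BytesIO, descriptors, references) -> bytes:
    """Rebuild the original data from descriptors and references."""
    out = bytearray()
    for ref in references:
        _, offset, length = descriptors[ref]
        unit.seek(offset)
        out += unit.read(length)
    return bytes(out)
```

The uniqueness check never reads the storage unit in this sketch, which is the practical reason for keeping identifiers in faster memory.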
Specification