Storage optimizing encoder and method
Abstract
An encoder and method, such as for use in CD-ROM pre-mastering software, optimize storage on a computer readable recording medium by eliminating redundant storage of identical data streams for duplicate files. The encoder and method detect whether two files have equivalent data streams, and encode such duplicate files as a single data stream referenced by the respective directory entries of the files. In the illustrated embodiment, the encoder and method detect duplicate files based on file size and a cyclic redundancy check calculated on the file's data stream or portion thereof.
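The detection idea in the abstract can be illustrated with a minimal sketch. It assumes the files are held in memory as a mapping from name to bytes and uses CRC-32 from Python's `zlib` as the cyclic redundancy check. The function name `find_candidate_duplicates` and the in-memory representation are hypothetical choices for illustration, not taken from the patent.

```python
import zlib
from collections import defaultdict

def find_candidate_duplicates(files):
    """Group file names whose data streams share the same size and CRC-32.

    `files` maps a file name to its data stream (bytes). A matching size and
    CRC marks files as likely duplicates; it does not prove they are equal.
    """
    by_key = defaultdict(list)
    for name, data in files.items():
        key = (len(data), zlib.crc32(data))  # (file size, check value)
        by_key[key].append(name)
    # Only keys shared by two or more files indicate potential duplicates.
    return [names for names in by_key.values() if len(names) > 1]

volume = {
    "a.txt": b"hello",
    "copy/a.txt": b"hello",   # same contents under a second name
    "b.txt": b"hellp",        # same size, different contents
}
print(find_candidate_duplicates(volume))
```

Comparing size first is cheap and rules out most non-duplicates before any check value is considered.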
31 Claims
-
1. A method of encoding data of a plurality of files in a data volume to optimize storage of the data volume on a computer readable recording medium, each file containing one whole file data stream, the method comprising the steps of:
-
detecting whether any at least two separately identified files of the data volume contain whole file data streams that are identical; and
encoding the files for storing on the computer readable recording medium according to an encoding scheme in which said files containing identical whole file data streams are encoded as a single data stream.
- View Dependent Claims (2, 3, 4, 5, 6)
-
-
7. A method of encoding data of a plurality of files to optimize storage of the data in a data volume on a computer readable recording medium, each file containing one whole file data stream, the method comprising the steps of:
-
detecting whether any at least two of the files contain whole file data streams that are identical; and
encoding the files for storing on the computer readable recording medium according to an encoding scheme in which said files containing identical whole file data streams are encoded as a single data stream;
wherein the step of detecting comprises the steps of:
determining a size of the whole file data stream of each file;
calculating a first result value as a function of an initial portion of the whole file data stream of each file;
calculating a second result value as a function of the entire whole file data stream of each file; and
searching for at least two of the files that have a same size and first and second result values.
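The three-part comparison recited in this claim (size, a value over the initial portion, a value over the entire stream) might be sketched as follows. CRC-32 stands in for the unspecified "function", and `HEAD_BYTES`, `signature`, and `find_matching` are hypothetical names. The length of the initial portion is an assumption chosen only for illustration.

```python
import zlib
from collections import defaultdict

HEAD_BYTES = 2048  # assumed length of the "initial portion"; not from the claim

def signature(data: bytes):
    """Return (size, first result value, second result value) for a data stream."""
    size = len(data)
    first = zlib.crc32(data[:HEAD_BYTES])  # function of the initial portion
    second = zlib.crc32(data)              # function of the entire stream
    return (size, first, second)

def find_matching(files):
    """Search for files that share the same size and both result values."""
    table = defaultdict(list)
    for name, data in files.items():
        table[signature(data)].append(name)
    return [names for names in table.values() if len(names) > 1]
```

Two files that agree in size and initial portion but differ later in the stream share the first result value and are separated only by the second, which is why the claim recites both.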
-
-
8. A method of optimizing the storage of a data volume containing whole file data streams of a plurality of files and a directory structure with entries for the files, each file containing one whole file data stream, comprising the steps of:
-
detecting whether the whole file data streams of any two of the files are identical;
for two files detected as having identical whole file data streams, performing the steps of:
removing one of the identical whole file data streams; and
encoding the directory entries of the two files to reference the remaining one of the identical whole file data streams.
- View Dependent Claims (9, 10, 11, 12)
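The remove-and-repoint steps of this claim can be sketched with a toy volume model. The model is an assumption made for illustration: `streams` is a list of data streams and `directory` maps each file name to an index into that list. Direct comparison of stream contents stands in for the detecting step, and `deduplicate` is a hypothetical name.

```python
def deduplicate(directory, streams):
    """Return (new_directory, new_streams) with identical streams stored once.

    directory: file name -> index into `streams`
    streams:   list of data streams (bytes), possibly containing duplicates
    """
    kept = {}           # stream contents -> index of the retained copy
    new_streams = []
    new_directory = {}
    for name, index in directory.items():
        data = streams[index]
        if data not in kept:
            # First time this content is seen: retain the stream.
            kept[data] = len(new_streams)
            new_streams.append(data)
        # Every directory entry references the single retained copy.
        new_directory[name] = kept[data]
    return new_directory, new_streams

directory = {"README": 0, "docs/README": 1, "setup.exe": 2}
streams = [b"read me", b"read me", b"\x00\x01"]
new_directory, new_streams = deduplicate(directory, streams)
```

Both `README` entries remain separately named in the result, but they reference the same stored stream.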
-
-
13. A method of optimizing the storage of a data volume containing whole file data streams of a plurality of files and a directory structure with entries for the files, each file containing one whole file data stream, comprising the steps of:
-
detecting whether the whole file data streams of any two of the files are identical;
for two files detected as having identical whole file data streams, performing the steps of:
(a) removing one of the identical whole file data streams; and
(b) encoding the directory entries of the two files to reference the remaining one of the identical whole file data streams;
wherein the step of detecting comprises the steps of:
determining a size of the whole file data stream of each file;
calculating a first result value as a function of an initial portion of the whole file data stream of each file;
calculating a second result value as a function of the entire whole file data stream of each file; and
searching for at least two of the files that have a same size and first and second result values.
-
-
14. A computer readable recording medium containing a data volume encoded by a process for optimizing data storage of a plurality of files, each file containing one whole file data stream, the process comprising the steps of:
-
detecting whether any at least two separately identified files of the data volume contain whole file data streams that are identical; and
encoding the files in the data volume according to an encoding scheme in which said files containing identical whole file data streams are encoded as a single data stream.
- View Dependent Claims (15, 16, 17, 18, 19, 21)
-
-
20. A computer readable recording medium containing a data volume encoded by a process for optimizing data storage of a plurality of files, each file containing one whole file data stream, the process comprising the steps of:
-
detecting whether any at least two of the files contain whole file data streams that are identical; and
encoding the files in the data volume according to an encoding scheme in which said files containing identical whole file data streams are encoded as a single data stream;
wherein the step of detecting comprises the steps of:
determining a size of the whole file data stream of each file;
calculating a first result value as a function of an initial portion of the whole file data stream of each file;
calculating a second result value as a function of the entire whole file data stream of each file; and
searching for at least two of the files that have a same size and first and second result values.
-
-
22. An encoder for optimizing storage of a plurality of files in a data volume on a computer readable recording medium, each file containing one whole file data stream, the encoder comprising:
-
a detector for detecting whether any at least two separately identified files of the data volume contain whole file data streams that are identical;
means responsive to detection of at least two files containing identical whole file data streams for encoding said two files as a single data stream in the data volume.
- View Dependent Claims (23, 24, 25, 26, 27, 28, 29)
-
-
30. In a pre-mastering software application on a computer, a method of optimizing storage of a plurality of files in a data volume on a master recording medium, each file having data stream contents, the method comprising:
-
adding files one-by-one to the data volume;
for each file added to the data volume, calculating a check value as a function of an initial portion of the currently added file's data stream contents;
determining a size of the currently added file's data stream contents;
searching in a table of the check value and size of each of the files previously added to the data volume for a potential duplicate file having a matching check value and size;
if a potential duplicate file is found, verifying via a bit-for-bit comparison that the currently added file is identical to the potential duplicate file;
if the currently added file is verified to be identical, encoding the currently added file in the data volume as a directory entry that separately identifies the currently added file from the identical file and points to a same data stream contents as a directory entry of the identical file; and
if not verified to be identical to any file previously added to the data volume, then
(a) updating the table with the currently added file's check value and size, and
(b) encoding the data stream contents of the currently added file and a directory entry that separately identifies the currently added file from previously added files and points to the encoded data stream contents of the currently added file in the data volume.
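The one-by-one procedure of this claim might look like the sketch below. It is an in-memory illustration under stated assumptions: CRC-32 serves as the check value, `HEAD_BYTES` is an assumed length for the initial portion, and the `Volume` class with its `streams`, `directory`, and `table` attributes is hypothetical. A real pre-mastering application would write a disc image rather than keep Python lists.

```python
import zlib

HEAD_BYTES = 2048  # assumed length of the "initial portion"; not from the claim

class Volume:
    def __init__(self):
        self.streams = []    # stored data stream contents
        self.directory = {}  # file name -> index into self.streams
        self.table = {}      # (check value, size) -> indices of stored streams

    def add(self, name, data):
        """Add one file; return True if a new data stream had to be stored."""
        key = (zlib.crc32(data[:HEAD_BYTES]), len(data))
        # A matching check value and size only marks a *potential* duplicate.
        for index in self.table.get(key, []):
            if self.streams[index] == data:   # bit-for-bit verification
                # Verified duplicate: a new directory entry, same stream.
                self.directory[name] = index
                return False
        # Not identical to any earlier file: (a) update the table,
        # (b) store the stream and a directory entry that points to it.
        index = len(self.streams)
        self.table.setdefault(key, []).append(index)
        self.streams.append(data)
        self.directory[name] = index
        return True
```

Because the check value covers only the initial portion, two different files of equal size can share a table key. The verification step is what keeps them from being merged, which is why the table in this sketch holds a list of candidates per key.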
-
-
31. In a pre-mastering software application on a computer, a method of optimizing storage of a plurality of files in a data volume on a master recording medium, each file having data stream contents, the method comprising:
-
adding files one-by-one to the data volume;
for each file added to the data volume, calculating a first check value as a function of an initial portion of the currently added file's data stream contents;
calculating a second check value as a function of an entirety of the currently added file's data stream contents;
determining a size of the currently added file's data stream contents;
searching in a table of the first and second check values and size of each of the files previously added to the data volume for a duplicate file having first and second check values and size matching those of the currently added file;
if such duplicate file is found, encoding the currently added file in the data volume as a directory entry that separately identifies the currently added file from the duplicate file and points to a same data stream contents as a directory entry of the duplicate file; and
if no duplicate file is found, then
(a) updating the table with the currently added file's check value and size, and
(b) encoding the data stream contents of the currently added file and a directory entry that separately identifies the currently added file from previously added files and points to the encoded data stream contents of the currently added file in the data volume.
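In this variant a match on both check values and the size is treated as a duplicate without a byte comparison. As a hedged sketch, both check values and the size can be gathered in a single pass over a file, using the running-value argument of `zlib.crc32`. `HEAD_BYTES`, `CHUNK`, `check_values`, and `add_file` are hypothetical names, and the sketch assumes a buffered binary stream whose `read(n)` returns `n` bytes unless the end of the stream is reached.

```python
import io
import zlib

HEAD_BYTES = 2048   # assumed length of the "initial portion"; not from the claim
CHUNK = 64 * 1024   # read size for the remainder of the stream

def check_values(stream):
    """Return (first check value, second check value, size) in one pass."""
    head = stream.read(HEAD_BYTES)
    first = zlib.crc32(head)
    second = first          # the whole-stream CRC continues from the head
    size = len(head)
    while chunk := stream.read(CHUNK):
        second = zlib.crc32(chunk, second)  # running CRC over later chunks
        size += len(chunk)
    return first, second, size

def add_file(volume, name, data):
    """Add a file to a dict-based volume with 'table', 'directory', 'streams'."""
    key = check_values(io.BytesIO(data))
    if key in volume["table"]:
        # Duplicate by check values and size: point at the existing stream.
        volume["directory"][name] = volume["table"][key]
    else:
        index = len(volume["streams"])
        volume["table"][key] = index
        volume["streams"].append(data)
        volume["directory"][name] = index

volume = {"table": {}, "directory": {}, "streams": []}
add_file(volume, "disk1/setup.dat", b"payload" * 10000)
add_file(volume, "disk2/setup.dat", b"payload" * 10000)
```

Skipping the comparison trades a small risk of a false match for fewer reads of the source files. The sketch makes no claim about how large that risk is for any particular check function.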
-
Specification