File system level compression using holes
First Claim
1. A computer-implemented method of storing data in a file system having a mapping table arranged to map logical memory blocks to physical memory blocks, the method comprising the steps of:
- requesting that a segment of data be written to physical memory, the segment of data being associated with selected logical memory blocks;
compressing the segment of data into compressed data such that the compressed data occupies fewer blocks of memory than the segment of data;
writing the compressed data to physical memory, the compressed data being written to at least one physical memory block; and
updating the mapping table such that each physical memory block associated with the compressed data is mapped to by a mapping table entry corresponding to one of the selected logical memory blocks and each of the selected logical memory blocks that is not associated with any of the physical memory blocks is mapped to a hole identifier that does not correspond to any physical memory block.
2 Assignments
0 Petitions
Accused Products
Abstract
A method, apparatus, and computer-usable medium for compressing data in a file system utilizing the concept of "holes". A mapping table in a file system maps the logical blocks of a file to actual physical blocks on disk where the data is stored. Blocks may be arranged in units of a cluster, and the file may be compressed cluster-by-cluster. Holes are used within a cluster to indicate not only that a cluster has been compressed, but also the compression algorithm used. Different clusters within a file may be compressed with different compression algorithms. A unit of data is compressed, with the result that the file occupies fewer physical blocks than it has logical blocks. The mapping table is updated to indicate that for a given unit of data compressed, fewer physical blocks are needed. Certain logical blocks belonging to this unit of data are not mapped to physical blocks but are mapped to a hole. A hole indicates that the unit of data was compressed, and may also indicate the particular compression algorithm used to compress the unit of data. If a unit of data begins or ends within the middle of a cluster, to avoid overwriting the data not to be changed the whole cluster must first be read from disk. If a hole indicates the cluster had been compressed, the data must be expanded first. The cluster is read into a buffer and the portion to be changed is overwritten. The cluster is compressed and written back to disk. Those clusters within which the unit of data neither begins nor ends may be written to directly.
271 Citations
28 Claims
-
1. A computer-implemented method of storing data in a file system having a mapping table arranged to map logical memory blocks to physical memory blocks, the method comprising the steps of:
-
requesting that a segment of data be written to physical memory, the segment of data being associated with selected logical memory blocks; compressing the segment of data into compressed data such that the compressed data occupies fewer blocks of memory than the segment of data; writing the compressed data to physical memory, the compressed data being written to at least one physical memory block; and updating the mapping table such that each physical memory block associated with the compressed data is mapped to by a mapping table entry corresponding to one of the selected logical memory blocks and each of the selected logical memory blocks that is not associated with any of the physical memory blocks is mapped to a hole identifier that does not correspond to any physical memory block. - View Dependent Claims (2, 3, 4, 5)
-
-
6. A computer-implemented method of storing data in a file system having a mapping table arranged to map logical memory blocks to physical memory blocks, the file system being arranged to process clusters, each cluster corresponding to a plurality of logical memory blocks, the method comprising the steps of:
-
requesting that a segment of data be written to physical memory, the segment of data being associated with selected logical memory blocks, wherein the selected logical memory blocks may span a plurality of clusters; and determining whether the segment of data begins at an intermediate location within a first one of the clusters, wherein when it is determined that the segment of data begins at an intermediate location, the method further comprises the step of performing a partial cluster write operation utilizing the first cluster as a current cluster, the partial cluster write operation including the sub-steps of, a) reading data associated with the current cluster from physical memory into a read buffer, such current cluster data being stored in the read buffer in an expanded format, b) copying a portion of the segment of data associated with the current cluster into the read buffer after the reading step has been completed, c) compressing the data stored in the read buffer after the copying step, d) writing the compressed data to physical memory, and e) updating the mapping table such that each physical memory block associated with the compressed data is mapped to by a mapping table entry corresponding to one of the selected logical memory blocks that is associated with the current cluster and each of the selected logical memory blocks associated with the current cluster that is not associated with any of the physical memory blocks is mapped to a hole identifier that does not correspond to any physical memory block. - View Dependent Claims (7, 8, 9, 10, 11, 12)
-
-
13. A computer-implemented method of retrieving data in a file system having a mapping table arranged to map logical memory blocks to physical memory blocks, the method comprising the steps of:
-
requesting that a segment of data be read from physical memory, the segment of data being associated with selected logical memory blocks; and determining by reference to the mapping table whether any of the selected logical memory blocks are mapped to a hole identifier that indicates that the segment of data is stored in compressed form in the physical memory, wherein when it is determined that the segment of data is stored in compressed form, performing the sub-steps of, a) reading from physical memory into a compression buffer all physical memory blocks that are associated with the selected logical memory blocks of the segment of data, and b) expanding the physical memory blocks stored in the compression buffer so that the segment of data is then stored in expanded form in the compression buffer. - View Dependent Claims (14, 15, 16, 17, 18)
-
-
19. A computer apparatus for use in compressing and expanding a data segment in a file system, the data segment being associated with selected logical memory blocks, the computer apparatus comprising:
-
a central processing unit; random access memory in communication with the central processing unit; a mass storage device in communication with the central processing unit; a mapping table arranged to map the selected logical memory blocks of the data segment to associated physical memory blocks of the mass storage device, such that when the data segment is stored in a compressed form on the mass storage device the compressed data segment occupies fewer physical memory blocks than the expanded data segment, and such that each physical memory block associated with the compressed data is mapped to by a mapping table entry corresponding to one of the selected logical memory blocks and each of the selected logical memory blocks that is not associated with any of the physical memory blocks is mapped to a hole identifier that does not correspond to any physical memory block. - View Dependent Claims (20, 21, 22)
-
-
23. A computer program product comprising a computer-usable medium having computer-readable code embodied thereon for compressing and expanding data in a file system of a computer, the file system having a mapping table arranged to map logical memory blocks to physical memory blocks, the computer program product comprising the following computer-readable program code for effecting actions in the computer:
-
program code for requesting that a segment of data be written to physical memory, the segment of data being associated with selected logical memory blocks; program code for compressing the segment of data into compressed data such that the compressed data occupies fewer blocks of memory than the segment of data; program code for writing the compressed data to physical memory, the compressed data being written to at least one physical memory block; and program code for updating the mapping table such that each physical memory block associated with the compressed data is mapped to by a mapping table entry corresponding to one of the selected logical memory blocks and each of the selected logical memory blocks that is not associated with any of the physical memory blocks is mapped to a hole identifier that does not correspond to any physical memory block. - View Dependent Claims (24, 25, 26, 27, 28)
-
Specification