METHOD AND SYSTEM FOR PROCESSING DATA
First Claim
1. A method for processing data by a processor device, comprising:
receiving a plurality of non-sequential write operations for deduplication storage of the data;
storing the data in a plurality of user file locations;
accumulating the data in a plurality of buffers;
restructuring the data in the plurality of buffers to form sequential data;
providing the sequential data as a plurality of streams to a stream-based deduplication algorithm for processing and storage; and
mapping, via a disk map, the data stored in the plurality of user file locations to the plurality of streams, wherein at least a portion of the data stored in the plurality of user file locations is mapped to at least two streams in the plurality of streams.
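The steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. The `Extent` record, the `restructure` function, the fixed `stream_size`, and the assumption that writes do not overlap are all hypothetical choices made for illustration. The sketch sorts out-of-order writes by file offset, packs them into fixed-size streams, and records a disk map entry for every piece, so that a user file location spanning a stream boundary ends up mapped to two streams.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    """Hypothetical disk-map entry: a byte range of the user file held in one stream."""
    file_offset: int    # where the bytes live in the user file
    length: int         # number of bytes in this piece
    stream_id: int      # index of the stream holding the piece
    stream_offset: int  # position of the piece inside that stream


def restructure(writes, stream_size):
    """Order non-sequential (offset, bytes) writes and pack them into streams.

    Returns (streams, disk_map). Assumes the writes do not overlap.
    """
    streams = [bytearray()]
    disk_map = []
    # Restructure: sort the buffered writes into sequential order by file offset.
    for offset, data in sorted(writes, key=lambda w: w[0]):
        pos = 0
        while pos < len(data):
            if len(streams[-1]) == stream_size:
                streams.append(bytearray())  # current stream is full, open a new one
            room = stream_size - len(streams[-1])
            piece = data[pos:pos + room]
            # Record where this piece of the user file landed.
            disk_map.append(
                Extent(offset + pos, len(piece), len(streams) - 1, len(streams[-1]))
            )
            streams[-1].extend(piece)
            pos += len(piece)
    return [bytes(s) for s in streams], disk_map
```

For example, calling `restructure([(6, b"cccc"), (0, b"aaa"), (3, b"bbb")], 4)` splits the write at file offset 3 across the first and second streams, and the disk map records one extent for each half. Each resulting stream would then be handed to whatever stream-based deduplication algorithm is in use, which this sketch does not model.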
Abstract
Methods, computer systems, and computer program products for processing data in a computing environment are provided. The computing environment for data deduplication storage receives a plurality of write operations for deduplication storage of the data. The data is buffered in a plurality of buffers, with overflow temporarily stored to a memory hierarchy, whether the data received for deduplication storage is sequential or non-sequential. The data is accumulated and updated in the plurality of buffers per a data structure, the data structure serving as a fragment map between the plurality of buffers and a plurality of user file locations. The data is restructured in the plurality of buffers to form a complete sequence of a required sequence size. The data is provided as at least one stream to a stream-based deduplication algorithm for processing and storage.
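The buffering behavior described in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the disclosed design. The `SequenceBuffer` class, its `fragments` dictionary standing in for the fragment map, and the rule that one buffer covers one fixed window of the user file are hypothetical. Overflow to a memory hierarchy is not modeled.

```python
class SequenceBuffer:
    """Hypothetical buffer that accumulates fragments for one window of a user file."""

    def __init__(self, base_offset, sequence_size):
        self.base = base_offset              # user file offset where the window starts
        self.size = sequence_size            # required sequence size in bytes
        self.data = bytearray(sequence_size)
        self.fragments = {}                  # fragment map: window offset -> fragment length

    def write(self, file_offset, payload):
        """Accumulate a fragment, or update bytes that were already buffered."""
        start = file_offset - self.base
        if start < 0 or start + len(payload) > self.size:
            raise ValueError("write falls outside this buffer's window")
        self.data[start:start + len(payload)] = payload
        self.fragments[start] = max(self.fragments.get(start, 0), len(payload))

    def is_complete(self):
        """Return True once the fragments cover the whole window without gaps."""
        covered = 0
        for start in sorted(self.fragments):
            if start > covered:
                return False                 # gap before this fragment
            covered = max(covered, start + self.fragments[start])
        return covered >= self.size
```

In this sketch a caller would keep writing fragments in any order and, once `is_complete()` returns true, pass `bytes(buffer.data)` on as a stream to the deduplication stage.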
41 Citations
20 Claims
1. A method for processing data by a processor device, comprising:
receiving a plurality of non-sequential write operations for deduplication storage of the data;
storing the data in a plurality of user file locations;
accumulating the data in a plurality of buffers;
restructuring the data in the plurality of buffers to form sequential data;
providing the sequential data as a plurality of streams to a stream-based deduplication algorithm for processing and storage; and
mapping, via a disk map, the data stored in the plurality of user file locations to the plurality of streams, wherein at least a portion of the data stored in the plurality of user file locations is mapped to at least two streams in the plurality of streams.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A system for processing data in a computing environment, comprising:
a processor for:
receiving a plurality of non-sequential write operations for deduplication storage of the data;
storing the data in a plurality of user file locations;
accumulating the data in a plurality of buffers;
restructuring the data in the plurality of buffers to form sequential data;
providing the sequential data as a plurality of streams to a stream-based deduplication algorithm for processing and storage; and
mapping, via a disk map, the data stored in the plurality of user file locations to the plurality of streams, wherein at least a portion of the data stored in the plurality of user file locations is mapped to at least two streams in the plurality of streams.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A computer program product for processing data in a computing environment by a processor device, the computer program product comprising a non-transitory computer-readable storage medium comprising:
computer code for receiving a plurality of non-sequential write operations for deduplication storage of the data;
computer code for storing the data in a plurality of user file locations;
computer code for accumulating the data in a plurality of buffers;
computer code for restructuring the data in the plurality of buffers to form sequential data;
computer code for providing the sequential data as a plurality of streams to a stream-based deduplication algorithm for processing and storage; and
computer code for mapping, via a disk map, the data stored in the plurality of user file locations to the plurality of streams, wherein at least a portion of the data stored in the plurality of user file locations is mapped to at least two streams in the plurality of streams.
Dependent claims: 16, 17, 18, 19, 20
Specification