Data deduplication by separating data from meta data
Abstract
Provided are techniques for data deduplication. A chunk of data and a mapping of boundaries between file data and meta data in the chunk of data are received. The mapping is used to split the chunk of data into a file data stream and a meta data stream and to store file data from the file data stream in a first file and to store meta data from the meta data stream in a second file, wherein the first file and the second file are separate files. The file data in the first file is deduplicated.
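The write path described in the abstract can be illustrated with a minimal sketch. The source does not specify the mapping format or the deduplication scheme, so everything below is an assumption: the `Segment` record, the function names, the fixed 4-byte block size, and SHA-256 block hashing are hypothetical, and the two streams are kept in memory rather than written to separate files.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical mapping entry: one contiguous region of the chunk."""
    kind: str    # "meta" or "file"
    offset: int  # start of the region within the chunk
    length: int  # size of the region in bytes


def split_chunk(chunk: bytes, mapping: list[Segment]) -> tuple[bytes, bytes]:
    """Use the boundary mapping to split a chunk into (file stream, meta stream)."""
    file_stream = bytearray()
    meta_stream = bytearray()
    for seg in mapping:
        piece = chunk[seg.offset:seg.offset + seg.length]
        if seg.kind == "meta":
            meta_stream.extend(piece)
        else:
            file_stream.extend(piece)
    return bytes(file_stream), bytes(meta_stream)


def deduplicate(file_data: bytes, store: dict[str, bytes], block_size: int = 4) -> list[str]:
    """Store each distinct block once; return the ordered list of block digests."""
    recipe = []
    for i in range(0, len(file_data), block_size):
        block = file_data[i:i + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # keep only the first copy of a block
        recipe.append(digest)
    return recipe


# Co-mingled chunk: two meta data headers interleaved with file data.
chunk = b"HDR1AAAABBBBHDR2AAAA"
mapping = [
    Segment("meta", 0, 4),
    Segment("file", 4, 8),
    Segment("meta", 12, 4),
    Segment("file", 16, 4),
]

file_stream, meta_stream = split_chunk(chunk, mapping)
store: dict[str, bytes] = {}
recipe = deduplicate(file_stream, store)  # only the file data is deduplicated
```

In this sketch the meta data stream is left untouched, while the repeated `AAAA` block in the file data stream is stored only once and referenced twice in `recipe`.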
73 Citations
18 Claims
1. A method, comprising:
under control of a data deduplication system, receiving chunks of co-mingled data having file data and meta data;
splitting the chunks of data into a file data stream stored in a first file and a meta data stream stored in a second file; and
deduplicating the file data in the first file without deduplicating the meta data in the second file; and
in response to receiving a request for the chunks of data, alternating between reading the meta data stream and the file data stream to fill a data buffer by:
determining whether to read the meta data stream or the file data stream next based on header information in the meta data;
in response to determining that the meta data stream is to be read, reading the meta data into the data buffer; and
in response to determining that the file data stream is to be read, reading the file data into the data buffer while reassembling the file data that was previously deduplicated using a reconstruction structure to put together the chunks of the file data stream; and
returning the chunks of co-mingled data having the file data and the meta data in the data buffer.
Dependent claims: 2, 3, 4, 5, 6.
7. A computer program product comprising a computer-readable medium including computer readable instructions, wherein the computer readable instructions, when executed by a processor on a computer, causes the computer to:
under control of a data deduplication system, receive chunks of co-mingled data having file data and meta data;
split the chunks of data into a file data stream stored in a first file and a meta data stream stored in a second file; and
deduplicate the file data in the first file without deduplicating the meta data in the second file; and
in response to receiving a request for the chunks of data, alternate between reading the meta data stream and the file data stream to fill a data buffer by:
determining whether to read the meta data stream or the file data stream next based on header information in the meta data;
in response to determining that the meta data stream is to be read, reading the meta data into the data buffer; and
in response to determining that the file data stream is to be read, reading the file data into the data buffer while reassembling the file data that was previously deduplicated using a reconstruction structure to put together the chunks of the file data stream; and
return the chunks of co-mingled data having the file data and the meta data in the data buffer.
Dependent claims: 8, 9, 10, 11, 12.
13. A system, comprising:
hardware logic of a data deduplication system performing operations, the operations comprising:
receiving chunks of co-mingled data having file data and meta data;
splitting the chunks of data into a file data stream stored in a first file and a meta data stream stored in a second file; and
deduplicating the file data in the first file without deduplicating the meta data in the second file; and
in response to receiving a request for the chunks of data, alternating between reading the meta data stream and the file data stream to fill a data buffer by:
determining whether to read the meta data stream or the file data stream next based on header information in the meta data;
in response to determining that the meta data stream is to be read, reading the meta data into the data buffer; and
in response to determining that the file data stream is to be read, reading the file data into the data buffer while reassembling the file data that was previously deduplicated using a reconstruction structure to put together the chunks of the file data stream; and
returning the chunks of co-mingled data having the file data and the meta data in the data buffer.
Dependent claims: 14, 15, 16, 17, 18.
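The read path recited in the independent claims can be sketched in the same spirit. The claims do not define the header layout or the reconstruction structure, so the following is only an illustration under stated assumptions: each meta data record is assumed to start with a hypothetical 8-byte header giving the meta payload length and the length of the file data that follows, and the reconstruction structure is assumed to be an ordered list of block digests (`recipe`) resolved against a block store. For brevity the file data is reassembled up front rather than incrementally during the read.

```python
import hashlib
import io
import struct

# Hypothetical header layout: (meta payload length, length of file data that follows).
HEADER = struct.Struct(">II")


def reassemble(recipe: list[str], store: dict[str, bytes]) -> bytes:
    """Rebuild the file data stream from the assumed reconstruction structure."""
    return b"".join(store[digest] for digest in recipe)


def read_comingled(meta_stream: bytes, recipe: list[str], store: dict[str, bytes]) -> bytes:
    """Alternate between the meta and file streams to refill a data buffer."""
    meta = io.BytesIO(meta_stream)
    file_data = io.BytesIO(reassemble(recipe, store))
    buffer = bytearray()
    while True:
        header = meta.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # no further meta data records
        meta_len, file_len = HEADER.unpack(header)
        # Meta data turn: copy the header and its payload into the buffer.
        buffer += header + meta.read(meta_len)
        # File data turn: the header says how much file data comes next.
        if file_len:
            buffer += file_data.read(file_len)
    return bytes(buffer)


# Build a tiny deduplicated store: three blocks, two of them identical.
blocks = [b"AAAA", b"BBBB", b"AAAA"]
store = {hashlib.sha256(b).hexdigest(): b for b in blocks}
recipe = [hashlib.sha256(b).hexdigest() for b in blocks]

# Two meta data records: the first is followed by 8 bytes of file data, the second by 4.
meta_stream = HEADER.pack(3, 8) + b"m-1" + HEADER.pack(3, 4) + b"m-2"

restored = read_comingled(meta_stream, recipe, store)
```

Here `restored` interleaves each meta data record with the file data it describes, which is the co-mingled layout the request is expected to return.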
Specification