Content addressable data storage and compression for semi-persistent computer memory for a database management system
First Claim
1. A method of content addressable data storage and compression for semi-persistent computer memory for a database management system comprising:
- providing in the database management system a data structure that associates data identifiers and retrieval keys for memory blocks for storing in semi-persistent memory data from the database management system;
storing in the data structure a data identifier;
providing a chunk of data comprising a quantity of input data from the database management system;
retrieving a memory block from semi-persistent computer memory;
searching at a repeating memory interval through a search section of the chunk for a segment of the chunk that matches a memory block from computer memory, including;
calculating a weak checksum for the memory block;
calculating rolling weak checksums for segments of the search section of the chunk;
comparing the rolling weak checksums for the segments with the checksum for the memory block; and
if a segment is found with a rolling weak checksum equal to the weak checksum of the memory block;
calculating a strong checksum for the memory block;
calculating a strong checksum for the segment with the matching rolling weak checksum;
comparing the strong checksum of the memory block and the strong checksum for the segment with the equal rolling weak checksum;
determining that the search has found a segment having contents that match the contents of the memory block if the strong checksum of the memory block and the strong checksum for the segment with the matching rolling weak checksum are equal;
if a matching segment is found;
discarding the matching segment;
providing to the database management system a retrieval key for the memory block as a retrieval key for the matching segment;
storing in the data structure in the database management system the retrieval key for the matching segment in association with the data identifier;
identifying an unmatched portion of the chunk that does not match the memory block;
identifying a free memory block of a file system;
storing the unmatched portion semi-persistently in the free memory block;
providing to the database management system a retrieval key for the unmatched portion; and
storing in the data structure in the database management system the retrieval key for the unmatched portion in association with the data identifier.
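The search steps of the claim above can be sketched in code. The following is a minimal illustration, not the patented implementation: the claim does not name particular checksums, so the rsync-style additive rolling checksum and SHA-256 used here are assumptions, as are all function names, and the "repeating memory interval" is assumed to be one byte.

```python
import hashlib
from typing import Optional, Tuple

MOD = 1 << 16  # assumed modulus for each half of the weak checksum


def weak_checksum(data: bytes) -> Tuple[int, int]:
    """Compute the (a, b) halves of an rsync-style weak checksum."""
    size = len(data)
    a = sum(data) % MOD
    b = sum((size - i) * byte for i, byte in enumerate(data)) % MOD
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, size: int) -> Tuple[int, int]:
    """Slide a window of `size` bytes forward by one byte in O(1)."""
    a = (a - out_byte + in_byte) % MOD
    b = (b - size * out_byte + a) % MOD
    return a, b


def strong_checksum(data: bytes) -> bytes:
    """Strong checksum; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(data).digest()


def find_matching_segment(chunk: bytes, block: bytes) -> Optional[int]:
    """Return the offset of the first segment of `chunk` that matches `block`."""
    size = len(block)
    if size == 0 or len(chunk) < size:
        return None
    target = weak_checksum(block)
    block_strong = None  # computed only after the first weak match
    a, b = weak_checksum(chunk[:size])
    for offset in range(len(chunk) - size + 1):
        if (a, b) == target:
            # Weak checksums agree, so confirm with the strong checksums.
            if block_strong is None:
                block_strong = strong_checksum(block)
            if strong_checksum(chunk[offset:offset + size]) == block_strong:
                return offset
        if offset + size < len(chunk):
            a, b = roll(a, b, chunk[offset], chunk[offset + size], size)
    return None
```

The weak checksum is cheap to update as the window slides, so the costlier strong checksum is computed only for candidate segments whose weak checksum already equals that of the memory block.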
Abstract
Content addressable data storage and compression for semi-persistent computer memory for a database management system including providing a data structure that associates data identifiers and retrieval keys for memory blocks for storing in semi-persistent memory data from the database management system; searching for a segment of a chunk of data from the database management system that matches a memory block from semi-persistent memory; and if a matching segment is found: discarding the matching segment; storing in the data structure in the database management system a retrieval key for the matching segment in association with a data identifier; identifying an unmatched portion of the chunk that does not match the memory block; storing the unmatched portion semi-persistently in a free memory block from a file system; and storing in the data structure in the database management system a retrieval key for the unmatched portion in association with the data identifier.
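The bookkeeping described in the abstract can be illustrated with a toy in-memory store. This is a sketch under stated assumptions: the class `ContentStore`, its methods, and the integer retrieval keys are hypothetical, a dictionary stands in for semi-persistent memory and the file system's free blocks, and `bytes.find` stands in for the checksum-based search.

```python
from typing import Dict, List, Optional, Tuple


class ContentStore:
    """Toy in-memory stand-in for semi-persistent block storage."""

    def __init__(self) -> None:
        self.blocks: Dict[int, bytes] = {}     # retrieval key -> block contents
        self.index: Dict[str, List[int]] = {}  # data identifier -> ordered retrieval keys
        self._next_key = 0

    def _store_block(self, data: bytes) -> int:
        """Store data in a 'free block' and return its retrieval key."""
        key = self._next_key
        self._next_key += 1
        self.blocks[key] = data
        return key

    def put(self, data_id: str, chunk: bytes) -> None:
        """Store a chunk, reusing any stored block whose contents it contains."""
        keys: List[int] = []
        pos = 0
        while pos < len(chunk):
            # Find the earliest segment at or after pos that matches a stored block.
            best: Optional[Tuple[int, int, int]] = None
            for key, block in self.blocks.items():
                at = chunk.find(block, pos)  # stand-in for the checksum search
                if at != -1 and (best is None or at < best[0]):
                    best = (at, key, len(block))
            if best is None:
                # No match: the rest of the chunk is an unmatched portion.
                keys.append(self._store_block(chunk[pos:]))
                break
            at, key, size = best
            if at > pos:
                # Store the unmatched portion that precedes the match.
                keys.append(self._store_block(chunk[pos:at]))
            # Discard the matching segment and record the existing block's key.
            keys.append(key)
            pos = at + size
        self.index[data_id] = keys

    def get(self, data_id: str) -> bytes:
        """Reassemble the chunk from the retrieval keys stored for data_id."""
        return b"".join(self.blocks[key] for key in self.index[data_id])
```

For example, after `put("a", b"HEADER")` and `put("b", b"xxHEADERyy")`, the second chunk is recorded as three retrieval keys, the middle one being the key of the block already stored for "a", so the shared bytes are stored only once.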
18 Claims
1. A method of content addressable data storage and compression for semi-persistent computer memory for a database management system comprising:
providing in the database management system a data structure that associates data identifiers and retrieval keys for memory blocks for storing in semi-persistent memory data from the database management system;
storing in the data structure a data identifier;
providing a chunk of data comprising a quantity of input data from the database management system;
retrieving a memory block from semi-persistent computer memory;
searching at a repeating memory interval through a search section of the chunk for a segment of the chunk that matches a memory block from computer memory, including;
calculating a weak checksum for the memory block;
calculating rolling weak checksums for segments of the search section of the chunk;
comparing the rolling weak checksums for the segments with the checksum for the memory block; and
if a segment is found with a rolling weak checksum equal to the weak checksum of the memory block;
calculating a strong checksum for the memory block;
calculating a strong checksum for the segment with the matching rolling weak checksum;
comparing the strong checksum of the memory block and the strong checksum for the segment with the equal rolling weak checksum;
determining that the search has found a segment having contents that match the contents of the memory block if the strong checksum of the memory block and the strong checksum for the segment with the matching rolling weak checksum are equal;
if a matching segment is found;
discarding the matching segment;
providing to the database management system a retrieval key for the memory block as a retrieval key for the matching segment;
storing in the data structure in the database management system the retrieval key for the matching segment in association with the data identifier;
identifying an unmatched portion of the chunk that does not match the memory block;
identifying a free memory block of a file system;
storing the unmatched portion semi-persistently in the free memory block;
providing to the database management system a retrieval key for the unmatched portion; and
storing in the data structure in the database management system the retrieval key for the unmatched portion in association with the data identifier.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A system of content addressable data storage and compression for semi-persistent computer memory for a database management system comprising:
means for providing in the database management system a data structure that associates data identifiers and retrieval keys for memory blocks for storing in semi-persistent memory data from the database management system;
means for storing in the data structure a data identifier;
means for providing a chunk of data comprising a quantity of input from the database management system;
means for retrieving a memory block from semi-persistent computer memory;
means for searching at a repeating memory interval through a search section of a chunk for a segment of the chunk that matches a memory block from computer memory, including means for;
calculating a weak checksum for the memory block;
calculating rolling weak checksums for segments of the search section of the chunk;
comparing the rolling weak checksums for the segments with the checksum for the memory block; and
if a segment is found with a rolling weak checksum equal to the weak checksum of the memory block;
calculating a strong checksum for the memory block;
calculating a strong checksum for the segment with the matching rolling weak checksum;
comparing the strong checksum of the memory block and the strong checksum for the segment with the equal rolling weak checksum;
means for determining that the search has found a segment having contents that match the contents of the memory block if the strong checksum of the memory block and the strong checksum for the segment with the matching rolling weak checksum are equal;
means for discarding a matching segment;
means for providing to the database management system a retrieval key for the memory block as a retrieval key for the matching segment;
means for storing in the data structure in the database management system the retrieval key for the matching segment in association with the data identifier;
means for identifying an unmatched portion of the chunk that does not match the memory block;
means for identifying a free memory block of a file system;
means for storing the unmatched portion semi-persistently in the free memory block;
means for providing to the database management system a retrieval key for the unmatched portion; and
means for storing in the data structure in the database management system the retrieval key for the unmatched portion in association with the data identifier.
Dependent claims: 13, 14, 15
16. A computer program product of content addressable data storage and compression for semi-persistent computer memory for a database management computer program product comprising:
a recording medium;
means, recorded on the recording medium, for providing in the database management computer program product a data structure that associates data identifiers and retrieval keys for memory blocks for storing in semi-persistent memory data from the database management computer program product;
means, recorded on the recording medium, for storing in the data structure a data identifier;
means, recorded on the recording medium, for providing a chunk of data comprising a quantity of input data from the database management computer program product;
means, recorded on the recording medium, for retrieving a memory block from semi-persistent computer memory;
means, recorded on the recording medium, for searching at a repeating memory interval through a search section of a chunk for a segment of the chunk that matches a memory block from computer memory, including means, recorded on the recording medium, for;
calculating a weak checksum for the memory block;
calculating rolling weak checksums for segments of the search section of the chunk;
comparing the rolling weak checksums for the segments with the checksum for the memory block; and
if a segment is found with a rolling weak checksum equal to the weak checksum of the memory block;
calculating a strong checksum for the memory block;
calculating a strong checksum for the segment with the matching rolling weak checksum;
comparing the strong checksum of the memory block and the strong checksum for the segment with the equal rolling weak checksum;
means, recorded on the recording medium, for determining that the search has found a segment having contents that match the contents of the memory block if the strong checksum of the memory block and the strong checksum for the segment with the matching rolling weak checksum are equal;
means, recorded on the recording medium, for discarding a matching segment;
means, recorded on the recording medium, for providing to the database management computer program product a retrieval key for the memory block as a retrieval key for the matching segment;
means, recorded on the recording medium, for storing in the data structure in the database management computer program product the retrieval key for the matching segment in association with the data identifier;
means, recorded on the recording medium, for identifying an unmatched portion of the chunk that does not match the memory block;
means, recorded on the recording medium, for identifying a free memory block of a file computer program product;
means, recorded on the recording medium, for storing the unmatched portion semi-persistently in the free memory block;
means, recorded on the recording medium, for providing to the database management computer program product a retrieval key for the unmatched portion; and
means, recorded on the recording medium, for storing in the data structure in the database management computer program product the retrieval key for the unmatched portion in association with the data identifier.
Dependent claims: 17, 18
Specification