METHOD AND APPARATUS FOR COMPRESSION AND NETWORK TRANSPORT OF DATA IN SUPPORT OF CONTINUOUS AVAILABILITY OF APPLICATIONS
Abstract
Methods and apparatus for compressing data for network transport in support of continuous availability of applications are described. One computer-implemented method of compressing data includes receiving a current instance of data in an input buffer. A candidate chunk of data is selected from the input buffer. A signature hash is computed from a signature length range of data within the candidate chunk. A matching dictionary entry having a matching signature hash from a multi-tiered dictionary is identified. The matching dictionary entry prospectively identifies a location of a prior occurrence of a selected range of consecutive symbols including the signature length range of data within at least one of the current instance of data and a prior instance of data in the input buffer. A dedupe processed representation of the instance of data is formed wherein a dedupe item is substituted for the selected range of consecutive symbols if the selected range is verified as recurring. The dedupe item identifies the location of the prior occurrence of the selected range in accordance with the matching dictionary entry.
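The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it uses a single flat dictionary rather than the multi-tiered dictionary the abstract recites, and the names `SIG_LEN`, `dedupe`, and `restore`, the choice of CRC-32 as the signature hash, and the item layout `("dedupe", offset, length)` are all assumptions made for illustration.

```python
import zlib

SIG_LEN = 4  # hypothetical signature length, in bytes


def dedupe(data: bytes):
    """Return a list of literal byte strings and ("dedupe", offset, length) items."""
    dictionary = {}  # signature hash -> position of a prior occurrence
    out = []
    literal = bytearray()
    pos = 0
    while pos < len(data):
        match_len = 0
        prior = -1
        if pos + SIG_LEN <= len(data):
            sig = zlib.crc32(data[pos:pos + SIG_LEN])
            prior = dictionary.get(sig, -1)
            if prior >= 0:
                # A hash hit is only prospective: verify byte by byte
                # and extend the match as far as the data agrees.
                while (pos + match_len < len(data)
                       and data[prior + match_len] == data[pos + match_len]):
                    match_len += 1
            dictionary[sig] = pos  # remember the most recent occurrence
        if match_len >= SIG_LEN:
            if literal:
                out.append(bytes(literal))
                literal = bytearray()
            out.append(("dedupe", prior, match_len))
            pos += match_len
        else:
            literal.append(data[pos])
            pos += 1
    if literal:
        out.append(bytes(literal))
    return out


def restore(items):
    """Rebuild the original bytes from the output of dedupe()."""
    buf = bytearray()
    for item in items:
        if isinstance(item, tuple):
            _, offset, length = item
            for i in range(length):  # byte-wise copy tolerates overlapping ranges
                buf.append(buf[offset + i])
        else:
            buf.extend(item)
    return bytes(buf)
```

For example, `dedupe(b"abcdefgh-abcdefgh-xyz")` replaces the second `abcdefgh-` with one item pointing back to offset 0, and `restore` reverses the substitution.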
34 Claims
1. A computer-implemented method of compressing data, comprising:
a) receiving a current instance of data in an input buffer;
b) selecting a candidate chunk of a representation of the current instance;
c) computing a signature hash from a signature length range of data starting in the candidate chunk;
d) identifying a matching dictionary entry having a matching signature hash from a multi-tiered dictionary, wherein the matching dictionary entry prospectively identifies a location of a prior occurrence of a selected range of consecutive symbols including the signature length range of data within at least one of the representation of the current instance of data and a representation of a prior instance of data in the input buffer; and
e) forming a dedupe processed representation of the instance of data, wherein a dedupe item is substituted for the selected range of consecutive symbols if the selected range is verified as recurring, wherein the dedupe item identifies the location of the prior occurrence of the selected range.
Dependent claims: 2-20.
21. A computer-implemented method of compressing data, comprising:
a) receiving a current instance of data in an input buffer; and
b) receiving at least one dedupe exclude range defining a portion of the input buffer to be excluded from dedupe compression, wherein the dedupe exclude range logically partitions the input buffer into at least one dedupe view range, wherein dedupe compression is a substitution of a dedupe item for occurrences of a recurring range of data, wherein the dedupe item identifies the location of a prior occurrence of the recurring range within at least one of a representation of the current instance of data and a representation of a prior instance of data in the input buffer.
Dependent claims: 22-26.
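The partitioning described in claim 21 can be pictured with a short sketch. It assumes exclude ranges arrive as half-open `(start, end)` byte offsets and returns the remaining dedupe view ranges; the function name `dedupe_view_ranges` and the range representation are hypothetical and do not come from the claims.

```python
def dedupe_view_ranges(buffer_len, exclude_ranges):
    """Return the half-open (start, end) ranges left after removing exclude ranges."""
    views = []
    cursor = 0  # first offset not yet assigned to a view or an exclude range
    for start, end in sorted(exclude_ranges):
        # Clamp each exclude range to the buffer bounds.
        start = max(0, min(start, buffer_len))
        end = max(start, min(end, buffer_len))
        if start > cursor:
            views.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < buffer_len:
        views.append((cursor, buffer_len))
    return views
```

With a 100-byte buffer and exclude ranges `(10, 20)` and `(50, 60)`, the buffer is partitioned into the view ranges `(0, 10)`, `(20, 50)`, and `(60, 100)`; only those ranges would be offered to dedupe compression.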
27. A computer-implemented method of compressing data, comprising:
a) receiving a current instance of data in an input buffer;
b) performing a repeat pattern replacement (RPR) compression on at least a portion of the current instance of data to store a corresponding RPR processed range at a next available position in a reference log, wherein the RPR compression substitutes an RPR item for a consecutive range of symbols, wherein the value of the RPR item is independent of the location of the consecutive range of symbols; and
c) performing a dedupe compression on each RPR processed portion to store a corresponding dedupe processed range at a next available position in a temporary buffer, wherein a dedupe item is substituted for a selected range of consecutive symbols of the RPR processed range if the selected range is verified as recurring, wherein the dedupe item identifies an offset to the location of the prior occurrence of the selected range in a same RPR processed range or a prior RPR processed range from one of the current instance and a prior instance of data in the input buffer.
Dependent claims: 28-34.
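The first stage of claim 27 can be illustrated with a run-length style sketch. Treating a "repeat pattern" as a run of one repeated byte is an assumption made here for brevity; the claim does not define the pattern that way. The names `MIN_RUN`, `rpr_compress`, and `rpr_expand` and the item layout `("rpr", symbol, count)` are likewise hypothetical. The reference log and the subsequent dedupe stage are not modeled.

```python
MIN_RUN = 4  # hypothetical shortest run worth replacing


def rpr_compress(data: bytes):
    """Replace runs of one repeated byte with ("rpr", symbol, count) items."""
    out = []
    literal = bytearray()
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1  # extend the run of identical bytes
        run = j - i
        if run >= MIN_RUN:
            if literal:
                out.append(bytes(literal))
                literal = bytearray()
            # The item records only the symbol and the count,
            # never the position of the run in the buffer.
            out.append(("rpr", data[i], run))
        else:
            literal.extend(data[i:j])
        i = j
    if literal:
        out.append(bytes(literal))
    return out


def rpr_expand(items):
    """Rebuild the original bytes from the output of rpr_compress()."""
    buf = bytearray()
    for item in items:
        if isinstance(item, tuple):
            _, symbol, count = item
            buf.extend(bytes([symbol]) * count)
        else:
            buf.extend(item)
    return bytes(buf)
```

Two equal runs at different positions produce identical items, which is one way to read the requirement that the value of the RPR item be independent of location. A dedupe item, by contrast, carries an offset to a prior occurrence.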
Specification