Hardware efficient rabin fingerprints
First Claim
1. A method comprising:
- selecting an irreducible polynomial;
generating a Fresh function based on the irreducible polynomial and an input polynomial determined by a size of an input message;
computing a first fingerprint for a first shingle of data by;
splitting the Fresh function into a first Fresh portion and a second Fresh portion;
splitting the first shingle of data into a first shingle portion and a second shingle portion;
computing a fingerprint for each of the first shingle portion and the second shingle portion using the first Fresh portion and the second Fresh portion, wherein the second Fresh portion uses the fingerprint for the first shingle portion as an input;
generating a first Shift function, wherein the first Shift function uses the first fingerprint for the first shingle of data as an input; and
computing a second fingerprint for a second shingle of data using the first Shift function.
7 Assignments
0 Petitions
Accused Products
Abstract
An approach for fingerprinting large data objects at the wire speed has been disclosed. The techniques include Fresh/Shift pipelining, split Fresh, optimization, online channel sampling, and pipelined selection. The architecture can also be replicated to work in parallel for higher system throughput. Fingerprinting may provide an efficient mechanism for identifying duplication in a data stream, and deduplication based on the identified fingerprints may provide reduced storage costs, reduced network bandwidth consumption, reduced processing time and other benefits. In some embodiments, fingerprinting may be used to ensure or verify data integrity and may facilitate detection of corruption or tampering. An efficient manner of generating fingerprints (either via hardware, software, or a combination) may reduce a computation load and/or time required to generate fingerprints.
6 Citations
17 Claims
-
1. A method comprising:
-
selecting an irreducible polynomial; generating a Fresh function based on the irreducible polynomial and an input polynomial determined by a size of an input message; computing a first fingerprint for a first shingle of data by; splitting the Fresh function into a first Fresh portion and a second Fresh portion; splitting the first shingle of data into a first shingle portion and a second shingle portion; computing a fingerprint for each of the first shingle portion and the second shingle portion using the first Fresh portion and the second Fresh portion, wherein the second Fresh portion uses the fingerprint for the first shingle portion as an input; generating a first Shift function, wherein the first Shift function uses the first fingerprint for the first shingle of data as an input; and computing a second fingerprint for a second shingle of data using the first Shift function. - View Dependent Claims (2, 3, 4, 5, 6)
-
-
7. A system comprising:
-
one or more processors; and a memory storing instructions, which when executed by the one or more processors, cause the one or more processors to; select an irreducible polynomial; generate a Fresh function based on the irreducible polynomial and an input polynomial determined by a size of an input message; split the Fresh function into a first Fresh portion and a second Fresh portion; split a first shingle of data into a first shingle portion and a second shingle portion; compute a fingerprint portion for the first shingle portion using the first Fresh portion; compute a first fingerprint for the first shingle of data using the second Fresh portion, wherein the second Fresh portion uses the fingerprint portion and the second shingle portion as inputs; generate a first Shift function, wherein the first Shift function uses the first fingerprint for the first shingle of data as an input; and compute a second fingerprint for a second shingle of data using the first Shift function. - View Dependent Claims (8, 9, 10, 11, 12)
-
-
13. A computer program product comprising a non-transitory computer useable medium storing a computer readable program, wherein the computer readable program, when executed on a computer, causes the computer to:
-
select an irreducible polynomial; generate a Fresh function as a first Fresh portion and a second Fresh portion based on the irreducible polynomial and an input polynomial determined by a size of an input message, wherein the Fresh function is configured to; split a first shingle of data into a first shingle portion and a second shingle portion; compute a fingerprint portion for the first shingle portion using the first Fresh portion; compute a first fingerprint for the first shingle of data using the second Fresh portion, wherein the second Fresh portion uses the fingerprint portion and the second shingle portion as inputs; and generate a first Shift function, wherein the first Shift function uses the first fingerprint for the first shingle of data as an input and is configured to compute a second fingerprint for a second shingle of data. - View Dependent Claims (14, 15, 16, 17)
-
Specification