Method for partitioning a block of data into subblocks and for storing and communcating such subblocks
DCFirst Claim
1. A method for organizing a block b of digital data for storage, communication, or comparison, comprising the step of:
- partitioning said block b into a plurality of subblocks at at least one position k|k+1 within said block,for which b[k-A+1 . . . k+B] satisfies a predetermined constraint, andwherein A and B are natural numbers.
4 Assignments
Litigations
0 Petitions
Reexamination
Accused Products
Abstract
This invention provides a method and apparatus for detecting common spans within one or more data blocks by partitioning the blocks (FIG. 4) into subblocks and searching the group of subblocks (FIG. 12) (or their corresponding hashes (FIG. 13)) for duplicates. Blocks can be partitioned into subblocks using a variety of methods, including methods that place subblock boundaries at fixed positions (FIG. 3), methods that place subblock boundaries at data-dependent positions (FIG. 3), and methods that yield multiple overlapping subblocks (FIG. 6). By comparing the hashes of subblocks, common spans of one or more blocks can be identified without ever having to compare the blocks or subblocks themselves (FIG. 13). This leads to several applications including an incremental backup system that backs up changes rather than changed files (FIG. 25), a utility that determines the similarities and differences between two files (FIG. 13), a file system that stores each unique subblock at most once (FIG. 26), and a communications system that eliminates the need to transmit subblocks already possessed by the receiver (FIG. 19).
795 Citations
30 Claims
-
1. A method for organizing a block b of digital data for storage, communication, or comparison, comprising the step of:
-
partitioning said block b into a plurality of subblocks at at least one position k|k+1 within said block, for which b[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
-
-
11. A method for comparing one or more blocks, comprising the steps of:
-
organizing a block b of digital data for the purpose of comparison, comprising the step of; partitioning said block b into a plurality of subblocks at at least one position k|k+1 within said block; for which b[k-A+1 . . . k+B] satisfies a predetermined constraint; and wherein A and B are natural numbers, forming a projection of each said block, being a collection of elements, wherein each element comprises a selected one of a subblock, an identity of a subblock, and a reference of a subblock, and comparing the elements of said projections of said blocks.
-
-
12. A method for representing one or more blocks comprising a collection of subblocks and block representatives which are mapped to lists of entries which identify subblocks;
- said method comprising the step of modifying one of said blocks including the steps of;
partitioning said block into a plurality of subblocks at at least one position k|k+1 within said block, for which b[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers, adding to said collection of subblocks zero or more subblocks which are not already in said collection, and updating said subblock list associated with said modified block.
- said method comprising the step of modifying one of said blocks including the steps of;
-
13. A method for representing one or more blocks comprising a collection of subblocks and block representatives which are mapped to lists of entries which identify subblocks;
- said method comprising the step of modifying one of said blocks including the steps of;
partitioning said block into a plurality of subblocks at at least one position k|k+1 within said block, for which b[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers, removing from said collection of subblocks zero or more subblocks, and updating said subblock list associated with said modified block.
- said method comprising the step of modifying one of said blocks including the steps of;
-
14. A method for representing one or more blocks comprising a collection of subblocks and block representatives which are mapped to lists of entries which identify subblocks;
- said method comprising the step of modifying one of said blocks including the steps of;
partitioning said block into a plurality of subblocks at at least one position k|k+1 within said block, for which b[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers, adding to said collection of subblocks zero or more subblocks that are not already in the collection, removing from said collection of subblocks zero or more subblocks, and updating said subblock list associated with said modified block.
- said method comprising the step of modifying one of said blocks including the steps of;
-
15. A method for an entity E1 to communicate a block X to E2 where E1 possesses the knowledge that E2 possesses a group of Y subblocks Y1 . . . Ym, comprising the steps of:
-
partitioning said block X into a plurality of subblocks X1 . . . Xn at at least one position k|k+1 within said block, for which X[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers, and transmitting from E1 to E2 the contents of zero or more subblocks in X1 and the remaining subblocks as references to subblocks in Y1 . . . Ym, and to subblocks transmitted.
-
-
16. A method for an entity E1 to communicate one or more subblocks of a group X of subblocks X1 . . . Xn to E2 where E1 possesses the knowledge that E2 possesses a block Y, comprising the steps of:
-
partitioning said block Y into a plurality of subblocks Y1 . . . Ym at at least one position k|k+1 within said block, for which Y[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers, and transmitting from E1 to E2 the contents of zero or more subblocks in X, and the remaining subblocks as references to subblocks in Y, and to subblocks already transmitted.
-
-
17. A method for an entity E1 to communicate a block X to E2 where E1 possesses the knowledge that E2 possesses a block Y, comprising the steps of:
-
partitioning said block X into a plurality of subblocks X1 . . . Xn at at least one position k|k+1 within said block, for which X[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers, partitioning said block Y into a plurality of subblocks Y1 . . . Ym at at least one position k|k+1 within said block, for which Y[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers, and transmitting from E1 to E2 the contents of zero or more subblocks in X, and the remaining subblocks as references to subblocks in Y, and to subblocks already transmitted.
-
-
18. A method for constructing a block D from a block X and a group Y of subblocks Y1 . . . Ym such that X can be constructed from Y and D, comprising the steps of:
-
partitioning said block X into a plurality of subblocks X1 . . . Xn at at least one position k|k+1 within said block, for which X[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers, and constructing D from a selected at least one of; the contents of zero or more subblocks in X, references to zero or more subblocks in Y, and references to zero or more subblocks in D.
-
-
19. A method for constructing a block D from a group X of subblocks X1 . . . Xn and a block Y such that X can be constructed from Y and D, comprising the steps of:
-
partitioning said block Y into a plurality of subblocks Y1 . . . Ym at at least one position k|k+1 within said block, for which Y[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers, and constructing D from a selected at least one of; the contents of zero or more subblocks in X, references to zero or more subblocks in Y, and references to zero of more subblocks in D.
-
-
20. A method for constructing a block D from a block X and a block Y such that X can be constructed from Y and D, comprising the steps of:
-
partitioning said block X into a plurality of subblocks X1 . . . Xn at at least one position k|k+1 within said block, for which X[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers, partitioning said block Y into a plurality of subblocks Y1 . . . Ym at at least one position k|k+1 within said block, for which Y[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers, and constructing D from a selected at least one of; the contents of zero or more in X, references to zero or more subblocks in Y, and references to zero or more subblocks in D.
-
-
21. A method for constructing a block D from a block X and a projection Y said projection comprising a collection of elements wherein said elements comprises a subblock in Y, an identity of a subblock in Y, or a reference of a subblock in Y, such that X can be constructed from Y and D, comprising the steps of:
-
partitioning said block X into a plurality of subblocks X1 . . . Xn at at least one position k|k+1 within said block, for which X[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers, and constructing D from a selected at least one of; the contents of zero or more in X, references to zero or more subblocks in Y, and references to zero or more subblocks in D.
-
-
22. A method for constructing a block X from a block Y and a block D, comprising the steps of:
-
partitioning said block Y into a plurality of subblocks Y1 . . . Ym at at least one position k|k+1 within said block, for which Y[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers, and constructing X from D and Y by constructing the subblocks of X based on a selected at least one of; subblocks contained within D, references in D to subblocks in Y, and references to D to subblocks in D.
-
-
23. A method for constructing a group X of subblocks X1 . . . Xn from a block Y and a block D, comprising the steps of:
-
partitioning said block Y into a plurality of subblocks Y1 . . . Ym at at least one position k|k+1 within said block, for which Y[k-A+1 . . . k+b] satisfies a predetermined constraint, and wherein A and B are natural numbers, and constructing X1 . . . Xn from D and Y based on a selected at least one of; subblocks contained within D, references in D to subblocks in Y, and references to D to subblocks in D.
-
-
24. A method for communicating a data block X from one entity E1 to another entity E2, comprising the steps of:
-
partitioning said block X into a plurality of subblocks X1 . . . Xn at at least one position k|k+1 within said block, for which X[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers, transmitting from E1 to E2 an identity of at least one subblock, transmitting from E2 to E1 information communicating the presence or absence of subblocks at E2, and transmitting from E1 to E2 at least the subblocks identified as not being present at E2.
-
-
25. A method for communicating a block X from one entity E1 to another entity E2, comprising the steps of:
-
partitioning said block X into a plurality of subblocks X1 . . . Xn at at least one position k|k+1 within said block, for which X[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers, transmitting from E2 to E1 information communicating the presence or absence at E2 of members of a group Y of subblocks Y1 . . . Ym, and transmitting from E1 to E2 the contents of zero or more subblocks in X, and the remaining subblocks as references to subblocks in Y1 . . . Ym and to subblocks already transmitted.
-
-
26. A method for an entity E2 to communicate to an entity E1 the fact that E2 possesses a block Y, comprising the steps of:
-
partitioning said block Y into a plurality of subblocks Y1 . . . Ym at at least one position k|k+1 within said block, for which Y[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers, and transmitting from E2 to E1 references of the subblocks Y1 . . . Ym.
-
-
27. A method for an entity E1 to communicate a subblock X1 to an entity E2, comprising the steps of:
-
partitioning said block X into a plurality of subblocks X1 . . . Xn at at least one position k|k+1 within said block, for which X[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers, transmitting from E2 to E1 an identity of Xi, transmitting Xi from E1 to E2.
-
-
28. An apparatus for organizing a block b of digital data for storage, communication, or comparison, comprising
means for partitioning said block b into a plurality of subblocks at at least one position k|k+1 within said block, for which b[k-A+1 . . . k+B] satisfies a predetermined constraint, and wherein A and B are natural numbers.
Specification