Hash file system and method for use in a commonality factoring system
Abstract
A system and method for a computer file system that is based and organized upon hashes and/or strings of digits of certain, different, or changing lengths and which is capable of eliminating or screening redundant copies of aggregate blocks of data (or parts of data blocks) from the system. The hash file system of the present invention utilizes hash values for computer files or file pieces which may be produced by a checksum generating program, engine or algorithm such as industry standard MD4, MD5, SHA or SHA-1 algorithms. Alternatively, the hash values may be generated by a checksum program, engine, algorithm or other means that produces an effectively unique hash value for a block of data of indeterminate size based upon a non-linear probabilistic mathematical algorithm.
554 Citations
45 Claims
-
1. A computing environment comprising:
-
at least one list for maintaining portions of digital sequences and corresponding probabilistically unique identifiers for each of said portions of said digital sequences;
at least one new digital sequence;
at least one partitioning mechanism for dividing said new digital sequence into a plurality of shorter digital sequences and producing a probabilistically unique identifier for each of said shorter digital sequences; and
a comparison mechanism for determining if any one of said probabilistically unique identifiers for each of said plurality of shorter digital sequences is currently maintained in said list.
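The elements of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the fixed piece size, the use of SHA-1 as the identifier function, the dictionary standing in for the "list", and the names `partition`, `identifier` and `add_sequence` are all assumptions made for illustration.

```python
import hashlib


def partition(sequence: bytes, size: int = 4) -> list[bytes]:
    """Divide a sequence into shorter sequences.

    Fixed-size pieces are an assumption; the claim does not say how
    the partitioning mechanism chooses its boundaries.
    """
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


def identifier(piece: bytes) -> str:
    """Produce a probabilistically unique identifier (SHA-1 assumed)."""
    return hashlib.sha1(piece).hexdigest()


def add_sequence(store: dict[str, bytes], sequence: bytes) -> tuple[list[str], int]:
    """Partition a new sequence and keep only pieces not already in the list.

    Returns the ordered identifiers of all pieces and the number of
    pieces that were actually new.
    """
    ids = []
    new_pieces = 0
    for piece in partition(sequence):
        pid = identifier(piece)
        if pid not in store:        # the comparison mechanism
            store[pid] = piece      # maintain the piece with its identifier
            new_pieces += 1
        ids.append(pid)
    return ids, new_pieces


store: dict[str, bytes] = {}
first_ids, first_new = add_sequence(store, b"AAAABBBBCCCC")
second_ids, second_new = add_sequence(store, b"AAAAXXXXCCCC")
```

In this sketch the second sequence shares two of its three pieces with the first, so only one new piece is stored, and either sequence can be rebuilt by joining `store[pid]` over its identifier list.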
-
-
28. A method for establishing an identifier for at least a portion of a digital sequence comprising:
-
performing a function on said at least a portion of said digital sequence to produce a probabilistically unique symbol therefor;
establishing a correspondence between said at least a portion of said digital sequence and said probabilistically unique symbol; and
utilizing said probabilistically unique symbol as said identifier.
hashing said at least a portion of said digital sequence to produce said probabilistically unique symbol.
-
-
35. The method of claim 34 wherein said step of hashing is carried out by means of an industry standard digest algorithm.
-
36. The method of claim 35 wherein said step of hashing is carried out by means of one of an MD4, MD5, SHA or SHA-1 algorithm.
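As an illustration of hashing with an industry standard digest algorithm, the following sketch uses Python's standard `hashlib` module. The helper name `digest` is hypothetical, and only MD5 and SHA-1 are shown, because MD4 availability in `hashlib` depends on the underlying OpenSSL build.

```python
import hashlib


def digest(data: bytes, algorithm: str = "sha1") -> str:
    """Return the hex digest of data under the named algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


md5_abc = digest(b"abc", "md5")
sha1_abc = digest(b"abc", "sha1")

# The digest length is fixed by the algorithm, not by the input size.
sha1_large = digest(b"x" * 1_000_000, "sha1")
```

Both digests have a fixed length (32 hex characters for MD5, 40 for SHA-1) regardless of how large the hashed portion is, which is what makes them usable as compact identifiers.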
-
37. A computer program product comprising:
-
a computer usable medium having computer readable code embodied therein for establishing an identifier for at least a portion of a digital sequence comprising:
computer readable program code devices configured to cause a computer to effect performing a function on said at least a portion of said digital sequence to produce a probabilistically unique symbol therefor;
computer readable program code devices configured to cause a computer to effect establishing a correspondence between said at least a portion of said digital sequence and said probabilistically unique symbol; and
computer readable program code devices configured to cause a computer to effect utilizing said probabilistically unique symbol as said identifier.
computer readable program code devices configured to cause a computer to effect hashing said at least a portion of said digital sequence to produce said probabilistically unique symbol.
-
-
44. The computer program product of claim 43 wherein said computer readable program code devices configured to cause a computer to effect hashing is carried out by means of an industry standard digest algorithm.
-
45. The computer program product of claim 44 wherein said computer readable program code devices configured to cause a computer to effect hashing is carried out by means of one of an MD4, MD5, SHA or SHA-1 algorithm.
Specification