Data sharing and recovery within a network of untrusted storage devices using data object fingerprinting
Abstract
A data sharing method uses fingerprinted data objects to share data among untrusted network devices. Each peer device is adapted to store a plurality of data objects, and a fingerprint generator generates a fingerprint for each stored data object that is available for sharing or for recovery. The fingerprints are stored in a local data store, and a data manager running on one of the computer devices uses the associated fingerprints to retrieve a copy of one of its data objects from another of the computer devices. Each fingerprint includes a hash value output from a strong hashing algorithm. Retrieval includes transmitting query messages containing the fingerprints of the needed data objects to the networked peer devices, then verifying the integrity of each received data object by generating its fingerprint and comparing it with the fingerprint provided in the query.
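The query-and-verify retrieval described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: SHA-256 stands in for the unspecified "strong hashing algorithm", and the names `fingerprint`, `retrieve_verified`, `honest_peer`, and `tampering_peer` are hypothetical. Peers are modeled as in-process callables rather than networked devices.

```python
import hashlib
from typing import Callable, Iterable, Optional


def fingerprint(data: bytes) -> str:
    """Return a hex digest for a data object (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


# Assumption: a peer is any callable that answers a fingerprint query
# with the object's bytes, or None if it does not hold the object.
Peer = Callable[[str], Optional[bytes]]


def retrieve_verified(wanted: str, peers: Iterable[Peer]) -> Optional[bytes]:
    """Query untrusted peers and accept only data whose fingerprint matches."""
    for ask in peers:
        candidate = ask(wanted)
        # Re-fingerprint whatever came back and compare with the query.
        if candidate is not None and fingerprint(candidate) == wanted:
            return candidate
    return None


def honest_peer(objects: Iterable[bytes]) -> Peer:
    """Build a peer that indexes its objects by fingerprint."""
    index = {fingerprint(obj): obj for obj in objects}
    return index.get


def tampering_peer(wanted: str) -> Optional[bytes]:
    """A peer that always returns the wrong bytes."""
    return b"not the object you asked for"
```

Because every response is re-fingerprinted before it is accepted, a peer that returns altered or unrelated bytes is simply skipped, which is what allows the peers to be untrusted.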
19 Claims
1. A data sharing method, comprising:
communicatively linking a plurality of computer devices via a communications network, wherein each of the computer devices includes a data storage storing a plurality of data objects;
providing a fingerprint generation module on each of the computer devices;
with each of the computer devices, processing each data object of at least a portion of the data objects in the data storage with the fingerprint generation module to generate a fingerprint for the data object, wherein the fingerprints are stored in a searchable manner in a data store, wherein each fingerprint comprises a hash value that is output from a hashing algorithm run on an associated one of the data objects, wherein the fingerprint generation module parses the hash value into a plurality of sub-strings, wherein each of the plurality of sub-strings defines a corresponding sub-directory in a directory structure of a data storage, and wherein the sub-directories define a location for an instance of the associated one of the data objects; and
with a data manager on one of the computer devices, retrieving from another one of the computer devices a copy of one of the data objects in the data storage associated with the one of the computer devices using the fingerprint generated for the one of the data objects.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
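As an illustration of the sub-string parsing recited in claim 1, the following Python sketch splits a hash value into fixed-width sub-strings and treats each one as a nested sub-directory. The hash algorithm (SHA-256), the sub-string width of 8 hex characters, and the names `split_fingerprint` and `object_directory` are assumptions made for the example; the claim does not fix any of them.

```python
import hashlib
from pathlib import PurePosixPath
from typing import List


def fingerprint(data: bytes) -> str:
    """Hash value for a data object (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def split_fingerprint(fp: str, width: int = 8) -> List[str]:
    """Parse the hash value into consecutive sub-strings of `width` characters."""
    return [fp[i:i + width] for i in range(0, len(fp), width)]


def object_directory(root: str, fp: str, width: int = 8) -> PurePosixPath:
    """Map each sub-string to one sub-directory level beneath `root`."""
    return PurePosixPath(root).joinpath(*split_fingerprint(fp, width))
```

Identical content always maps to the same directory, so a device can check whether it already holds an instance of an object by computing the path from the fingerprint alone.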
9. A data recovery method, comprising:
with at least one microprocessor, executing at least one fingerprint generation module to process a plurality of data objects to define a fingerprint for each of the data objects, wherein each of the fingerprints comprises a hash value for an associated one of the data objects, wherein the fingerprint generation module is executed to parse the hash value into a plurality of sub-strings, wherein each of the plurality of sub-strings defines a corresponding sub-directory in a directory structure of a data storage device, and wherein the sub-directories define a location for an instance of the associated one of the data objects;
storing the data objects in a distributed manner in data storage devices of a plurality of peer devices linked to a network, wherein each of the data storage devices stores a subset of the data objects with at least some of the subsets storing differing ones of the data objects;
storing the fingerprints associated with each of the subsets of the data objects in memory accessible by one of the peer devices associated with one of the data storage devices storing one of the subsets of the data objects;
responsive to a loss of at least a portion of one of the data object subsets from one of the data storage devices, operating a data manager at an associated peer device to recover the portion of the one of the data object subsets, wherein the operating comprises:
retrieving, from the memory, the fingerprints associated with the portion of the one of the data object subsets to be recovered; and
transmitting query messages to the peer devices to locate the portion of the one of the data object subsets to be recovered, wherein each query message includes the retrieved fingerprints.
Dependent claims: 10, 11, 12, 13
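The recovery flow of claim 9 can be sketched in Python as follows. This is a simplified in-memory model under stated assumptions: the `PeerDevice` class and its `store`, `answer_query`, and `recover` methods are hypothetical, SHA-256 is an assumed hash, the fingerprint list is assumed to survive the loss of the stored objects, and the integrity check on each response follows the abstract rather than the claim itself.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List


def fingerprint(data: bytes) -> str:
    """Hash value for a data object (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class PeerDevice:
    name: str
    # Data storage device: fingerprint -> data object.
    objects: Dict[str, bytes] = field(default_factory=dict)
    # Separate memory holding fingerprints; assumed to survive a data loss.
    fingerprints: List[str] = field(default_factory=list)

    def store(self, data: bytes) -> str:
        fp = fingerprint(data)
        self.objects[fp] = data
        if fp not in self.fingerprints:
            self.fingerprints.append(fp)
        return fp

    def answer_query(self, wanted: List[str]) -> Dict[str, bytes]:
        """Respond to a query message with any requested objects held here."""
        return {fp: self.objects[fp] for fp in wanted if fp in self.objects}

    def recover(self, peers: List["PeerDevice"]) -> List[str]:
        """Query peers for lost objects; return fingerprints still missing."""
        missing = [fp for fp in self.fingerprints if fp not in self.objects]
        for peer in peers:
            if not missing:
                break
            for fp, data in peer.answer_query(missing).items():
                # Accept a response only if it re-hashes to the queried value.
                if fingerprint(data) == fp:
                    self.objects[fp] = data
            missing = [fp for fp in missing if fp not in self.objects]
        return missing
```

In this model each query message carries the full list of outstanding fingerprints, so a peer holding only some of the lost objects can still contribute its share.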
14. A data sharing and recovery system comprising:
a communications network; and
a plurality of peer systems each including a processor providing a data manager with a fingerprint generation module, wherein each of the peer systems is linked to the communications network for communicating with other ones of the peer systems;
wherein each of the peer systems includes a repository storing a number of data objects; and
wherein each of the peer systems includes a data store storing a fingerprint generated by the fingerprint generation module for each of the data objects in the associated peer system repository, wherein each fingerprint comprises a location component generated based on an output of a hash function applied to an associated one of the data objects, wherein the hash function output is parsed to divide it into a number of segments, and wherein the number of segments are used to define a respective number of nodes in the repository which collectively represent a file path for the associated one of the data objects.
Dependent claims: 15, 16, 17, 18, 19
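One possible reading of the repository and data store in claim 14 is sketched below in Python. The `PeerRepository` class, its methods, the choice of SHA-256, and the default of four segments are all assumptions for illustration. Here the segments of the hash output become path nodes, with the final node serving as the file name, so the nodes together form the object's file path.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


class PeerRepository:
    """Hypothetical on-disk repository addressed by hash segments."""

    def __init__(self, root: Path, segments: int = 4) -> None:
        self.root = Path(root)
        self.segments = segments
        # Data store: fingerprint -> file path inside the repository.
        self.data_store: Dict[str, Path] = {}

    def _nodes(self, digest: str) -> List[str]:
        """Divide the hash output into `segments` pieces (last takes any remainder)."""
        size = len(digest) // self.segments
        head = [digest[i * size:(i + 1) * size] for i in range(self.segments - 1)]
        return head + [digest[(self.segments - 1) * size:]]

    def add(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        path = self.root.joinpath(*self._nodes(digest))
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        self.data_store[digest] = path
        return digest

    def get(self, digest: str) -> bytes:
        return self.data_store[digest].read_bytes()


# Usage: store one object in a temporary repository and read it back.
with tempfile.TemporaryDirectory() as tmp:
    repo = PeerRepository(Path(tmp))
    fp = repo.add(b"example object")
    relative = repo.data_store[fp].relative_to(tmp)
    restored = repo.get(fp)
```

Since the path is derived entirely from the hash output, the data store entry could be rebuilt from the fingerprint alone; it is kept here only to make lookups explicit.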
Specification