Locating potentially identical objects across multiple computers based on stochastic partitioning of workload
1 Assignment
0 Petitions
Abstract
Potentially identical objects (e.g., files) are located across multiple computers based on stochastic partitioning of workload. For each of a plurality of objects stored on a plurality of computers in a network, a portion of object information corresponding to the object is selected. The object information can be generated in a variety of manners (e.g., based on hashing the object, based on characteristics of the object, and so forth). Any of a variety of portions of the object information can be used (e.g., the least significant bits of the object information). A stochastic partitioning process is then used to identify which of the plurality of computers to communicate the object information to for identification of potentially identical objects on the plurality of computers.
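The flow summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: using a SHA-256 digest as the object information, taking a 16-bit portion, and mapping that portion onto a list of machine names by modulo are all choices made for the example, and the names `object_info`, `select_portion`, and `target_computer` are hypothetical.

```python
import hashlib
from typing import Sequence


def object_info(data: bytes) -> bytes:
    """Object information; here a SHA-256 digest of the contents (assumption)."""
    return hashlib.sha256(data).digest()


def select_portion(info: bytes, bits: int = 16) -> int:
    """Select the least significant `bits` bits of the object information."""
    return int.from_bytes(info, "big") & ((1 << bits) - 1)


def target_computer(portion: int, computers: Sequence[str]) -> str:
    """Map a portion to one computer; modulo is an illustrative partitioning."""
    return computers[portion % len(computers)]


# Hypothetical machine names, for illustration only.
computers = ["machine-a", "machine-b", "machine-c"]

# Two objects with identical contents yield identical object information,
# so both records are sent to the same computer, where they can be compared.
first = target_computer(select_portion(object_info(b"same contents")), computers)
second = target_computer(select_portion(object_info(b"same contents")), computers)
```

Because the target depends only on the selected portion, no single computer needs to see the object information for every object; each one receives only the share of records whose portion maps to it.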
94 Citations
17 Claims
1. A method comprising:
- selecting, for each of a plurality of objects stored on a plurality of computers in a network, a portion of object information corresponding to the object;
- using a stochastic partitioning process to identify which of the plurality of computers to communicate the object information to for identification of potentially identical objects on the plurality of computers.
Dependent claims: 2, 3, 4, 5
6. One or more computer-readable media having stored thereon a plurality of instructions that, when executed by one or more processors of one of a plurality of computers in a network, causes the one or more processors to perform the following acts:
- selecting a portion of file information corresponding to a file;
- identifying a mapping of the portion to one or more computers; and
- communicating the file information to each of the identified one or more computers for identification of potentially identical files on the one or more computers.
Dependent claims: 7, 8, 9, 10, 11, 12
13. A system comprising:
- an interface configured to allow the system to communicate with a plurality of other computers; and
- a forwarding location determination module, coupled to the interface, configured to identify one or more of the plurality of other computers to communicate file information corresponding to a file to for identification of potentially identical files stored on the plurality of other computers by accessing a mapping of a portion of the file information to one or more computers.
Dependent claims: 14, 15, 16, 17
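As a rough illustration of the mapping-based lookup recited in the claims, the sketch below models a forwarding location module as a table from a portion of the file information to one or more computers, together with a receiving-side check that flags file information reported by more than one computer. Everything here is an assumption made for the example: the class `ForwardingLocator`, the function `potentially_identical`, the table layout, and the choice of the trailing byte as the portion are hypothetical and are not taken from the specification.

```python
from collections import defaultdict


class ForwardingLocator:
    """Hypothetical stand-in for a forwarding location determination module."""

    def __init__(self, mapping):
        # mapping: portion (bytes) -> list of computer names (assumed table form)
        self.mapping = mapping

    def locate(self, file_info: bytes, width: int = 1):
        """Return the computers mapped to the trailing `width` bytes of file_info."""
        portion = file_info[-width:]
        return self.mapping.get(portion, [])


def potentially_identical(records):
    """Group (computer, file_info) records received by one computer.

    File information reported by two or more computers marks files that are
    potentially identical and may warrant a closer comparison.
    """
    holders = defaultdict(set)
    for computer, info in records:
        holders[info].add(computer)
    return {info: sorted(names) for info, names in holders.items() if len(names) > 1}


# One portion may map to several computers, for example for redundancy.
locator = ForwardingLocator({b"\x01": ["m1", "m2"], b"\x02": ["m3"]})
targets = locator.locate(b"\xaa\x01")
```

Matching file information only suggests that two files may be identical; a sketch like this would still need a follow-up comparison before treating them as duplicates.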
Specification