Analyzing found data in a distributed storage and task network
Abstract
A method begins by a dispersed storage (DS) processing module establishing data identifying criteria for searching data on a network, establishing data analyzing criteria for analyzing found data of the data on the network, and establishing distributed computing criteria. The method continues with the DS processing module distributing the data identifying criteria and the data analyzing criteria to a set of distributed storage and task (DST) units. The method continues with the DS processing module receiving a set of network data partial resultants from the set of DST units, wherein the set of DST units generates the set of network data partial resultants by searching the data on the network in accordance with the data identifying criteria to produce found data and analyzing the found data in accordance with the data analyzing criteria. The method continues with the DS processing module processing the set of network data partial resultants to produce a network data resultant.
18 Claims
1. A method for execution by a computing device in a distributed storage and computing network that stores a large volume of data objects as pluralities of sets of encoded data slices in a plurality of distributed storage and task execution (DST) units, the method comprises:
establishing data identifying criteria for searching for like data objects of the large volume of data objects;
establishing data analyzing criteria for analyzing found data objects;
establishing distributed computing criteria based on the data identifying criteria, the data analyzing criteria, and a slice grouping storage type indication, wherein, for at least one of the found data objects, a DST unit of a set of DST units of the plurality of DST units stores a contiguous data chunk or error code data partition, and wherein the slice grouping storage type indication indicates processing of a contiguous data chunk or processing of the error coded data partition;
distributing the data identifying criteria and the data analyzing criteria to a decode threshold number of DST units in accordance with the distributed computing criteria, wherein the decode threshold number corresponds to a minimum number of encoded data slices of a set of encoded data slices of the pluralities of sets of encoded data slices that is needed to recover a data segment of a corresponding one of the data objects, and wherein the decode threshold number of DST units is less than a number of the DST units storing the pluralities of sets of encoded data slices;
receiving a set of network data partial resultants from the decode threshold number of DST units, wherein the decode threshold number of DST units generates the set of network data partial resultants based on searching at least some of the large volume of data objects in accordance with the data identifying criteria, the data analyzing criteria, and the distributed computing criteria; and
processing the set of network data partial resultants to produce a network data resultant regarding the data on the network.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
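As an illustration only, the distribute, receive, and process steps of claim 1 can be sketched as a small scatter-gather routine. Everything named here (`DSTUnit`, `run_distributed_analysis`, the sample records) is hypothetical and not taken from the patent. The sketch also assumes the "contiguous data chunk" storage type, in which the first decode threshold number of units hold the plain data and the remaining units hold only error coded data.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DSTUnit:
    """Hypothetical stand-in for a DST unit and its locally stored records."""
    name: str
    records: List[str]

    def execute(self, identify: Callable[[str], bool],
                analyze: Callable[[List[str]], int]) -> int:
        # Search local data with the identifying criteria, then apply the
        # analyzing criteria to whatever was found (a partial resultant).
        found = [r for r in self.records if identify(r)]
        return analyze(found)


def run_distributed_analysis(units, decode_threshold, identify, analyze, combine):
    """Send both criteria to a decode threshold number of units and merge results."""
    # The claim requires fewer targeted units than units storing the slices.
    if not 0 < decode_threshold < len(units):
        raise ValueError("decode threshold must be between 1 and len(units) - 1")
    # Assumption: the first `decode_threshold` units hold contiguous data chunks.
    selected = units[:decode_threshold]
    partial_resultants = [u.execute(identify, analyze) for u in selected]
    return combine(partial_resultants)


units = [
    DSTUnit("dst1", ["error: disk", "ok"]),
    DSTUnit("dst2", ["ok", "error: net", "error: cpu"]),
    DSTUnit("dst3", ["ok"]),
    DSTUnit("dst4", []),  # assumed to hold error coded data only
    DSTUnit("dst5", []),  # assumed to hold error coded data only
]

result = run_distributed_analysis(
    units,
    decode_threshold=3,
    identify=lambda record: record.startswith("error"),  # data identifying criteria
    analyze=len,                                          # data analyzing criteria
    combine=sum,                                          # processing of partials
)
print(result)
```

Here the identifying criteria select records that start with "error", the analyzing criteria count them per unit, and the final processing step sums the per-unit counts into one resultant.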
10. A dispersed storage (DS) module operable in a computing device of a distributed storage and computing network that stores a large volume of data objects as pluralities of sets of encoded data slices in a plurality of distributed storage and task execution (DST) units, the DS module comprises:
a first module, when operable within a computing device, causes the computing device to:
establish data identifying criteria for searching for like data objects of the large volume of data objects;
a second module, when operable within the computing device, causes the computing device to:
establish data analyzing criteria for analyzing found data objects;
a third module, when operable within the computing device, causes the computing device to:
establish distributed computing criteria based on the data identifying criteria, the data analyzing criteria, and a slice grouping storage type indication, wherein, for at least one of the found data objects, a DST unit of a set of DST units of the plurality of DST units stores a contiguous data chunk or error code data partition, and wherein the slice grouping storage type indication indicates processing of a contiguous data chunk or processing of the error coded data partition; and
a fourth module, when operable within the computing device, causes the computing device to:
distribute the data identifying criteria and the data analyzing criteria to a decode threshold number of DST units in accordance with the distributed computing criteria, wherein the decode threshold number corresponds to a minimum number of encoded data slices of a set of encoded data slices of the pluralities of sets of encoded data slices that is needed to recover a data segment of a corresponding one of the data objects, and wherein the decode threshold number of DST units is less than a number of the DST units storing the pluralities of sets of encoded data slices;
receive a set of network data partial resultants from the decode threshold number of DST units, wherein the decode threshold number of DST units generates the set of network data partial resultants based on searching at least some of the large volume of data objects in accordance with the data identifying criteria, the data analyzing criteria, and the distributed computing criteria; and
process the set of network data partial resultants to produce a network data resultant regarding the data on the network.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
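The claims define the decode threshold number as the minimum number of encoded data slices needed to recover a data segment. A toy example may help make that concrete. The sketch below is not the patent's encoding scheme: it swaps in a simple XOR parity code with three slices and a decode threshold of two, and the function names are hypothetical. A real dispersed storage system would typically use a more general erasure code.

```python
DECODE_THRESHOLD = 2  # any 2 of the 3 slices below are enough to recover


def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(x, y))


def encode_segment(segment: bytes) -> dict:
    """Split a segment into two data slices plus one XOR parity slice."""
    half = (len(segment) + 1) // 2
    a = segment[:half]
    b = segment[half:].ljust(half, b"\0")  # pad the second half if needed
    return {0: a, 1: b, 2: xor_bytes(a, b)}


def decode_segment(slices: dict, length: int) -> bytes:
    """Recover the segment from any DECODE_THRESHOLD slices, keyed by index."""
    if len(slices) < DECODE_THRESHOLD:
        raise ValueError("fewer slices than the decode threshold")
    if 0 in slices and 1 in slices:
        a, b = slices[0], slices[1]
    elif 0 in slices:
        a = slices[0]
        b = xor_bytes(a, slices[2])  # rebuild b from a and parity
    else:
        b = slices[1]
        a = xor_bytes(b, slices[2])  # rebuild a from b and parity
    return (a + b)[:length]  # drop any padding


segment = b"found data"
slices = encode_segment(segment)
# Lose slice 1 and recover the segment from the remaining two slices.
survivors = {i: s for i, s in slices.items() if i != 1}
recovered = decode_segment(survivors, len(segment))
print(recovered == segment)
```

In this toy code, slices 0 and 1 play the role of contiguous data chunks and slice 2 plays the role of error coded data, which is why a search task sent to only a decode threshold number of units can still see all of the plain data.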
Specification