PROCESSING DATA ACROSS A DISTRIBUTED NETWORK
Abstract
A system, method, and related techniques are disclosed for processing data across a distributed network to a plurality of machines. The method may include receiving a first user-supplied transform and generating a first package based on the first user-supplied transform. The method may further include receiving a designated key and generating a second package based on the key. Furthermore, the method may include receiving a second user-supplied transform and generating a third package based on the second user-supplied transform. Moreover, the method may include distributing the first, second, and third packages to a plurality of machines using a cluster API.
20 Claims
1. A method for processing data across a distributed network of a plurality of machines, comprising:
receiving one or more data items;
filtering the one or more data items at a first package;
sorting the filtered one or more data items at a second package; and
summing frequencies related to the sorted one or more data items at a third package;
wherein the first, second, and third packages are distributed across the plurality of machines.
Dependent claims: 2, 3, 4, 5, 6, 7.
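The filter, sort, and sum steps recited in claim 1 can be illustrated with a minimal single-process sketch. It is not the patented implementation: the function names, the word-length predicate, and the case-insensitive key are assumptions chosen for illustration, and the three stages run in one process here rather than on separate machines.

```python
from itertools import groupby


def filter_stage(items, predicate):
    """Stands in for the first package: keep only items that satisfy the predicate."""
    return [item for item in items if predicate(item)]


def sort_stage(items, key):
    """Stands in for the second package: order the filtered items by a key."""
    return sorted(items, key=key)


def sum_stage(sorted_items, key):
    """Stands in for the third package: sum the frequency of each key.

    groupby only merges adjacent equal keys, so the input must already be
    sorted by the same key.
    """
    return {k: sum(1 for _ in group) for k, group in groupby(sorted_items, key=key)}


def run_pipeline(items):
    # Assumed predicate: drop words of two characters or fewer.
    filtered = filter_stage(items, lambda word: len(word) > 2)
    # Assumed key: compare words case-insensitively.
    ordered = sort_stage(filtered, key=str.lower)
    return sum_stage(ordered, key=str.lower)


result = run_pipeline(["the", "cat", "a", "The", "dog", "cat", "is"])
print(result)  # {'cat': 2, 'dog': 1, 'the': 2}
```

Sorting before summing is what lets the final stage count each key in a single pass over adjacent items.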
8. A method for processing data across a distributed network to a plurality of machines, comprising:
receiving a first user-supplied transform;
generating a first package based on the first user-supplied transform;
receiving a designated key;
generating a second package based on the designated key;
receiving a second user-supplied transform;
generating a third package based on the second user-supplied transform; and
distributing the first, second, and third packages to the plurality of machines using a cluster API.
Dependent claims: 9, 10, 11, 12, 13.
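The package-generation and distribution steps of claim 8 can be sketched as follows, under stated assumptions. `Package`, `package_from_transform`, `package_from_key`, and `LocalCluster` are hypothetical names, not the cluster API the claim refers to. The round-robin placement and the list-in, list-out transform signature are also assumptions; the claim does not specify either.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Any, Callable, Dict, List, Sequence

Transform = Callable[[List[Any]], List[Any]]


@dataclass(frozen=True)
class Package:
    """Hypothetical unit of work that could be shipped to a machine."""
    name: str
    run: Transform


def package_from_transform(name: str, transform: Transform) -> Package:
    """Generate a package from a user-supplied transform."""
    return Package(name, transform)


def package_from_key(name: str, key: Callable[[Any], Any]) -> Package:
    """Generate a package that sorts its input by the designated key."""
    return Package(name, lambda items: sorted(items, key=key))


class LocalCluster:
    """Hypothetical in-process stand-in for a cluster API."""

    def __init__(self, machines: Sequence[str]) -> None:
        if not machines:
            raise ValueError("at least one machine is required")
        self.machines = list(machines)
        self.placement: Dict[str, str] = {}
        self._packages: List[Package] = []

    def distribute(self, packages: Sequence[Package]) -> Dict[str, str]:
        """Assign each package to a machine round-robin (assumed policy)."""
        self._packages = list(packages)
        self.placement = {
            package.name: self.machines[index % len(self.machines)]
            for index, package in enumerate(self._packages)
        }
        return dict(self.placement)

    def run(self, items: Sequence[Any]) -> List[Any]:
        """Run the packages in order locally, simulating the remote stages."""
        data = list(items)
        for package in self._packages:
            data = package.run(data)
        return data


def drop_empty(items: List[str]) -> List[str]:
    """Example first transform: discard empty strings."""
    return [s for s in items if s]


def count_runs(items: List[str]) -> List[Any]:
    """Example second transform: count adjacent equal items."""
    return [(k, len(list(group))) for k, group in groupby(items)]


cluster = LocalCluster(["machine-0", "machine-1"])
placement = cluster.distribute([
    package_from_transform("first", drop_empty),
    package_from_key("second", key=lambda s: s),
    package_from_transform("third", count_runs),
])
output = cluster.run(["b", "", "a", "b"])
print(placement)  # {'first': 'machine-0', 'second': 'machine-1', 'third': 'machine-0'}
print(output)     # [('a', 1), ('b', 2)]
```

Keeping package generation separate from distribution mirrors the order of the claim: the transforms and the key are each turned into a package first, and only then handed to the cluster.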
14. One or more computer-readable media having computer-usable instructions stored thereon for performing a method for processing data across a distributed network to a plurality of machines, comprising:
identifying one or more data items;
transmitting a first user-supplied transform to generate a first package;
transmitting a designated key to generate a second package;
transmitting a second user-supplied transform to generate a third package;
receiving output data that has been processed within the first, second, and third packages,
wherein the first, second, and third packages are distributed across the plurality of machines.
Dependent claims: 15, 16, 17, 18, 19, 20.
Specification