Asymmetric streaming record data processor method and apparatus
Abstract
An asymmetric data record processor and method include host computers and job processing units (JPUs) coupled together on a network. Each host computer and JPU forms a node on the network. A plurality of software operators allow each node to process streams of records. For each operator in a given sequence within nodes and across nodes, output of the operator is input to a respective succeeding operator. Data processing follows a logical data flow based on readiness of a record. As soon as a record is ready, it is passed for processing from one part to a next part in the logical data flow. The flow of records during data processing is substantially continuous and of a streaming fashion.
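The operator chaining described in the abstract can be illustrated with a minimal sketch. It assumes Python generators as a stand-in for the software operators; the operator names (`scan`, `restrict`, `project`), the record layout, and the `log` list are hypothetical and do not come from the patent. Each operator hands a record to its successor as soon as that record is ready, and no intermediate result set is built.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Record = Dict[str, int]
Log = List[Tuple[str, int]]


def scan(rows: Iterable[Record], log: Log) -> Iterator[Record]:
    """Hypothetical source operator: emits one record at a time."""
    for row in rows:
        log.append(("scan", row["id"]))
        yield row


def restrict(records: Iterable[Record],
             predicate: Callable[[Record], bool],
             log: Log) -> Iterator[Record]:
    """Hypothetical filter operator: forwards a record as soon as it qualifies."""
    for rec in records:
        if predicate(rec):
            log.append(("restrict", rec["id"]))
            yield rec


def project(records: Iterable[Record], fields: List[str], log: Log) -> Iterator[Record]:
    """Hypothetical projection operator: keeps only the named fields."""
    for rec in records:
        log.append(("project", rec["id"]))
        yield {f: rec[f] for f in fields}


def run_pipeline(rows: Iterable[Record]) -> Tuple[List[Record], Log]:
    """Chain the operators; the output of each feeds the next directly."""
    log: Log = []
    stream = project(restrict(scan(rows, log), lambda r: r["qty"] > 1, log), ["id"], log)
    return list(stream), log


if __name__ == "__main__":
    rows = [{"id": 1, "qty": 5}, {"id": 2, "qty": 1}, {"id": 3, "qty": 7}]
    result, log = run_pipeline(rows)
    print(result)
    print(log)
```

In this sketch the log shows record 1 passing through all three operators before record 2 is even scanned, which is the record-by-record behavior the abstract describes, as opposed to each operator finishing its whole input before the next one starts.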
47 Claims
1. An asymmetric data processor comprising:
one or more host computers, each including a memory, a network interface and at least one CPU, each host computer being responsive to requests from end users and applications to process data;
one or more Job Processing Units (JPUs), each having a memory, a network interface, one or more storage devices, and at least one CPU, each JPU being responsive to requests from host computers and from other JPUs to process data;
a network enabling the host computers and the JPUs to communicate between and amongst each other, each of the host computers and JPUs forming a respective node on the network; and
a plurality of software operators that allow each node to process data in a record-by-record, streaming fashion in which (i) for each operator in a given sequence of operators, output of the operator is input to a respective succeeding operator in a manner free of necessarily materializing data, and (ii) data processing follows a logical data flow and is based on readiness of a record, such that as soon as a subject record is ready record data is passed for processing from one part to a next part in the logical data flow, the flow of record data during data processing being substantially continuous so as to form a stream of record processing from operator to operator within nodes and across nodes of the network.
Dependent claims: 2 to 29.
30. A method of data processing comprising the steps of:
providing one or more host computers, each including a memory, a network interface and at least one CPU, each host computer being responsive to requests from end users and applications to process data;
providing one or more Job Processing Units (JPUs), each having a memory, a network interface, one or more storage devices, and at least one CPU, each JPU being responsive to requests from host computers and from other JPUs to process data;
networking the host computers and the JPUs to communicate between and amongst each other, each of the host computers and JPUs forming a respective node on the network; and
using a plurality of software operators, enabling each node to process data in a record-by-record, streaming fashion in which (i) for each operator in a given sequence of said operators, output of the operator is input to a respective succeeding operator in a manner free of necessarily materializing data, and (ii) data processing follows a logical data path formed of node locations and operators and is based on readiness of a record, such that as soon as a subject record is ready, record data is passed from one node location or operator to a next node location or operator for processing along the logical data path, the flow of record data on the logical data path during data processing being substantially continuous so as to form a stream of record processing from operator to operator across nodes and within nodes of the network.
Dependent claims: 31 to 47.
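The flow of records across nodes recited in the method claim can be sketched in the same spirit. This is an illustration under stated assumptions, not the claimed implementation: two threads stand in for a JPU node and a host node, a bounded `queue.Queue` stands in for the network link, and the function names, the `qty` field, and the `_END` sentinel are all hypothetical.

```python
import queue
import threading
from typing import Dict, Iterable

Record = Dict[str, int]

# Hypothetical end-of-stream marker sent over the simulated link.
_END = object()


def jpu_node(rows: Iterable[Record], link: "queue.Queue") -> None:
    """Simulated JPU: filters records and forwards each one as soon as it is ready."""
    for row in rows:
        if row["qty"] > 1:
            link.put(row)  # blocks while the link is full, so nothing accumulates
    link.put(_END)


def host_node(link: "queue.Queue") -> int:
    """Simulated host: aggregates records as they arrive from the JPU."""
    total = 0
    while True:
        rec = link.get()
        if rec is _END:
            break
        total += rec["qty"]
    return total


def run(rows: Iterable[Record]) -> int:
    """Wire the two simulated nodes together and return the host's aggregate."""
    link: "queue.Queue" = queue.Queue(maxsize=1)  # at most one record in flight
    worker = threading.Thread(target=jpu_node, args=(rows, link))
    worker.start()
    total = host_node(link)
    worker.join()
    return total


if __name__ == "__main__":
    print(run([{"id": 1, "qty": 5}, {"id": 2, "qty": 1}, {"id": 3, "qty": 7}]))
```

The link capacity of one record is a deliberate choice for the sketch: the sending side cannot buffer a full intermediate result, so the receiving side necessarily consumes records while the sending side is still producing them.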
Specification