Formal Language and Translator for Parallel Processing of Data
Abstract
The present invention, in an example embodiment, provides a special-purpose formal language and translator for the parallel processing of large databases in a distributed system. The special-purpose language has features of both a declarative programming language and a procedural programming language and supports the co-grouping of tables, each with an arbitrary alignment function, and the specification of procedural operations to be performed on the resulting co-groups. The language's translator translates a program in the language into optimized structured calls to an application programming interface for implementations of functionality related to the parallel processing of tasks over a distributed system. In an example embodiment, the application programming interface includes interfaces for MapReduce functionality, whose implementations are supplemented by the embodiment.
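The co-grouping idea in the abstract can be illustrated with a minimal Python sketch. It is not the patent's formal language or its implementation: the `cogroup` helper, the `queries` and `clicks` tables, and the `summarize` operation are all hypothetical names chosen for illustration. Each table is paired with its own alignment function, rows that align to the same key land in the same co-group, and a procedural operation then runs once per co-group.

```python
from collections import defaultdict


def cogroup(tables):
    """Co-group several tables (hypothetical helper, not the patent's API).

    `tables` maps a table name to a (rows, alignment_fn) pair. The result maps
    each alignment key to a dict holding, per table, the rows aligned to it.
    """
    groups = defaultdict(lambda: {name: [] for name in tables})
    for name, (rows, align) in tables.items():
        for row in rows:
            groups[align(row)][name].append(row)
    return dict(groups)


# Two illustrative tables whose key columns differ in name and in casing.
queries = [{"user": "Ann", "query": "maps"}, {"user": "bob", "query": "news"}]
clicks = [{"uid": "ann", "url": "a.example"}, {"uid": "ann", "url": "b.example"}]

# Declarative part: each table is listed together with its alignment function.
cogroups = cogroup({
    "queries": (queries, lambda r: r["user"].lower()),
    "clicks": (clicks, lambda r: r["uid"]),
})


# Procedural part: an arbitrary operation applied to every co-group.
def summarize(key, group):
    return (key, len(group["queries"]), len(group["clicks"]))


results = sorted(summarize(key, group) for key, group in cogroups.items())
print(results)  # [('ann', 1, 2), ('bob', 1, 0)]
```

Note that a key present in only one table still yields a co-group; the other table's entry is simply an empty list, which the per-group operation can handle as it sees fit.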
20 Claims
1. A method, comprising:
- accepting as input a program written in a formal language, wherein the formal language allows a declarative co-grouping of one or more tables, each with an alignment function, and a specification of zero or more procedural operations to be performed on each resulting co-group;
- translating the program into one or more jobs, wherein each job comprises one or more structured calls to an application programming interface for encoded logic that is operable to generate a plurality of tasks for the parallel processing of the job on one or more data processing devices in a distributed system.
Dependent claims: 2, 3, 4, 5, 6, 7
8. Logic encoded in one or more tangible media for execution and when executed is operable to:
- accept as input a program written in a formal language, wherein the formal language allows a declarative co-grouping of one or more tables, each with an alignment function, and a specification of zero or more procedural operations to be performed on each resulting co-group;
- translate the program into one or more jobs, wherein each job comprises one or more structured calls to an application programming interface for encoded logic that is operable to generate a plurality of tasks for the parallel processing of the job on one or more data processing devices in a distributed system.
Dependent claims: 9, 10, 11, 12, 13, 14
15. An apparatus comprising:
- means to accept as input a program written in a formal language, wherein the formal language allows a declarative co-grouping of one or more tables, each with an alignment function, and a specification of zero or more procedural operations to be performed on each resulting co-group; and
- means to translate the program into one or more jobs, wherein each job comprises one or more structured calls to an application programming interface for encoded logic that is operable to generate a plurality of tasks for the parallel processing of the job on one or more data processing devices in a distributed system;
- means to assign the tasks to data-processing devices in a distributed system; and
- means to process the tasks in parallel.
Dependent claims: 16, 17, 18, 19, 20
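The translate, assign, and process steps recited in the claims can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the claimed translator: the `Job` class, the dictionary-based `program` format, and the `translate` and `run_job` functions are hypothetical, and a local thread pool stands in for the data processing devices of a distributed system. The program is turned into one job made of a map call and a reduce call, and the runner splits that job into tasks that execute in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass
class Job:
    """One job expressed as a map call and a reduce call (hypothetical)."""
    map_fn: Callable
    reduce_fn: Callable


def translate(program):
    """Translate a toy co-group program into a list of jobs (here, one)."""
    aligners = program["cogroup"]

    def map_fn(record):
        table, row = record
        # Emit the alignment key, tagging the row with its source table.
        return aligners[table](row), (table, row)

    def reduce_fn(key, tagged_rows):
        # Rebuild the co-group, then apply the procedural operation to it.
        group = {table: [] for table in aligners}
        for table, row in tagged_rows:
            group[table].append(row)
        return program["foreach"](key, group)

    return [Job(map_fn, reduce_fn)]


def run_job(job, records, workers=2):
    """Split a job into map and reduce tasks and run them on a thread pool."""
    chunks = [records[i::workers] for i in range(workers)]  # one map task each
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = pool.map(lambda chunk: [job.map_fn(r) for r in chunk], chunks)
        shuffled = {}
        for pairs in mapped:
            for key, value in pairs:
                shuffled.setdefault(key, []).append(value)
        # One reduce task per key, also executed in parallel.
        reduced = pool.map(lambda kv: job.reduce_fn(*kv), sorted(shuffled.items()))
        return list(reduced)


program = {
    "cogroup": {"queries": lambda r: r["user"].lower(), "clicks": lambda r: r["uid"]},
    "foreach": lambda key, group: (key, len(group["queries"]), len(group["clicks"])),
}
records = [
    ("queries", {"user": "Ann", "query": "maps"}),
    ("queries", {"user": "bob", "query": "news"}),
    ("clicks", {"uid": "ann", "url": "a.example"}),
    ("clicks", {"uid": "ann", "url": "b.example"}),
]

jobs = translate(program)
output = run_job(jobs[0], records)
print(output)  # [('ann', 1, 2), ('bob', 1, 0)]
```

Tagging each row with its table name in the map step is one simple way to let a single reduce call reassemble rows from several tables into one co-group; other encodings would work equally well.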
Specification