Discoscript: a simplified distributed computing scripting language
First Claim
1. One or more computer storage media comprising computer-executable instructions for providing parallel-processing-capable scripting commands, the computer-executable instructions directed to steps comprising:
interpreting a process scripting command, specifying a process input data and either a process executable file comprising one or more process functions or a process code block comprising the one or more process functions, to generate computer-executable instructions for applying, in parallel across one or more processes, the one or more process functions to the process input data to generate a process output data;
interpreting a distribute scripting command, specifying at least a distribute input data, to generate computer-executable instructions for dividing, in parallel across one or more processes, the distribute input data into two or more subdivisions representing a distribute output data;
interpreting an aggregate scripting command, specifying multiple aggregate input data, to generate computer-executable instructions for combining, in parallel across one or more processes, the multiple aggregate input data into an aggregate output data;
interpreting a join scripting command, specifying a first and second join input data, both having an equivalent number of segments, to generate computer-executable instructions for combining, in parallel across one or more processes, each segment of the first join input data with a corresponding segment of the second join input data to form a join output data; and
interpreting a cross-product scripting command, specifying a first and second cross-product input data, to generate computer-executable instructions for combining, in parallel across one or more processes, each segment of the first cross-product input data with each segment of the second cross-product input data to form a cross-product output data.
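The five commands recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a hypothetical data model in which a dataset is a list of segments and each segment is a list of records, it uses a thread pool as a stand-in for "one or more processes", and it treats "combining" segments as list concatenation. All function names and the round-robin split are illustrative choices, not taken from the claim.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Assumption: a dataset is a list of segments; a segment is a list of records.


def _parallel_map(fn, items):
    """Run fn over items on a thread pool, preserving input order."""
    items = list(items)
    if not items:
        return []
    with ThreadPoolExecutor(max_workers=min(8, len(items))) as pool:
        return list(pool.map(fn, items))


def process(data, fn):
    """Apply fn to every record, one task per segment."""
    return _parallel_map(lambda seg: [fn(r) for r in seg], data)


def distribute(data, n):
    """Divide the records into n >= 2 subdivisions (round-robin, by assumption)."""
    if n < 2:
        raise ValueError("distribute needs two or more subdivisions")
    records = [r for seg in data for r in seg]
    return _parallel_map(lambda i: records[i::n], range(n))


def aggregate(*datasets):
    """Combine multiple inputs into one single-segment output."""
    flat = _parallel_map(lambda d: [r for seg in d for r in seg], datasets)
    return [[r for part in flat for r in part]]


def join(first, second):
    """Combine segment i of first with segment i of second."""
    if len(first) != len(second):
        raise ValueError("join inputs need an equivalent number of segments")
    return _parallel_map(lambda pair: pair[0] + pair[1], zip(first, second))


def cross_product(first, second):
    """Combine each segment of first with each segment of second."""
    return _parallel_map(lambda pair: pair[0] + pair[1], product(first, second))


parts = distribute([[1, 2, 3, 4]], 2)
squares = process(parts, lambda x: x * x)
joined = join(parts, squares)
crossed = cross_product([[1], [2]], [[3], [4]])
merged = aggregate(squares, [[5]])
```

The script author only calls `process`, `distribute`, and so on; the pool lives inside `_parallel_map`, which mirrors the claim's idea of commands that generate the parallel instructions on the author's behalf.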
Abstract
Scripting core commands and aggregations of such commands are provided to script authors to enable them to generate scripts that can be parallel-processed without requiring the author to be aware of parallel-processing techniques. The scripting core commands and aggregations abstract mechanisms that can be executed in parallel, enabling the script author to focus on higher-level concepts. The scripting core commands provided include commands for applying a function in parallel and distributing and joining data in parallel. For added flexibility, one or more scripting core commands can utilize functions written in a different programming language and referenced appropriately in code blocks.
20 Claims
1. One or more computer storage media comprising computer-executable instructions for providing parallel-processing-capable scripting commands, the computer-executable instructions directed to steps comprising:
interpreting a process scripting command, specifying a process input data and either a process executable file comprising one or more process functions or a process code block comprising the one or more process functions, to generate computer-executable instructions for applying, in parallel across one or more processes, the one or more process functions to the process input data to generate a process output data;
interpreting a distribute scripting command, specifying at least a distribute input data, to generate computer-executable instructions for dividing, in parallel across one or more processes, the distribute input data into two or more subdivisions representing a distribute output data;
interpreting an aggregate scripting command, specifying multiple aggregate input data, to generate computer-executable instructions for combining, in parallel across one or more processes, the multiple aggregate input data into an aggregate output data;
interpreting a join scripting command, specifying a first and second join input data, both having an equivalent number of segments, to generate computer-executable instructions for combining, in parallel across one or more processes, each segment of the first join input data with a corresponding segment of the second join input data to form a join output data; and
interpreting a cross-product scripting command, specifying a first and second cross-product input data, to generate computer-executable instructions for combining, in parallel across one or more processes, each segment of the first cross-product input data with each segment of the second cross-product input data to form a cross-product output data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. One or more computer storage media comprising computer-executable instructions for interpreting a script, the computer-executable instructions directed to steps comprising:
interpreting the script with reference to a library of computer-executable commands for generating a program that can be executed across multiple processes, the library comprising:
a process command for applying a process function to a process input data in parallel across one or more processes,
a distribute command for dividing a distribute input data into two or more subdivisions in parallel across one or more processes,
an aggregate command for combining multiple aggregate input data in parallel across one or more processes,
a join command for combining a first join input data and a second join input data, both having an equivalent number of segments, such that each segment of the first join input data is combined with a corresponding segment of the second join input data in parallel across one or more processes, and
a cross-product command for combining each segment of a first cross-product input data, that was specified as part of the cross-product command, with each segment of a second cross-product input data, that was also specified as part of the cross-product command, in parallel across one or more processes; and
invoking a compiler to compile a process code block comprising the process function, if the script comprises a process scripting command identifying the process code block.
Dependent claims: 10, 11, 12
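The two steps of claim 9 (interpreting a script against a command library, and compiling a code block only when a process command names one) can be sketched as follows. This is an illustration under stated assumptions, not the claimed system. The one-command-per-line syntax `output = command args...`, the `LIBRARY` table, the `code_blocks` mapping, and every function name are hypothetical. The library bodies are sequential stand-ins for the parallel commands, and Python's built-in `compile` plays the role of the invoked compiler.

```python
# Hypothetical command library: sequential stand-ins for the parallel commands.
def _process(data, fn):
    return [[fn(r) for r in seg] for seg in data]


def _distribute(data, n):
    records = [r for seg in data for r in seg]
    return [records[i::n] for i in range(n)]


def _aggregate(*datasets):
    return [[r for d in datasets for seg in d for r in seg]]


def _join(a, b):
    if len(a) != len(b):
        raise ValueError("join inputs need an equivalent number of segments")
    return [x + y for x, y in zip(a, b)]


def _cross_product(a, b):
    return [x + y for x in a for y in b]


LIBRARY = {
    "process": _process,
    "distribute": _distribute,
    "aggregate": _aggregate,
    "join": _join,
    "cross_product": _cross_product,
}


def compile_code_block(source, name):
    """Compile a code block and return the function it defines under `name`."""
    namespace = {}
    exec(compile(source, f"<code block {name}>", "exec"), namespace)
    return namespace[name]


def interpret(script, code_blocks):
    """Turn script text into a program: a list of (output, command, args, fn)."""
    program = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue
        out, _, call = line.partition("=")
        if not call.strip():
            raise ValueError(f"expected 'output = command ...': {line}")
        command, *args = call.split()
        if command not in LIBRARY:
            raise ValueError(f"unknown command: {command}")
        fn = None
        if command == "process":
            # Assumption: the last argument names the code block to compile.
            block = args.pop()
            fn = compile_code_block(code_blocks[block], block)
        program.append((out.strip(), command, args, fn))
    return program


def run(program, env):
    """Execute the generated program against named input datasets."""
    env = dict(env)
    for out, command, args, fn in program:
        values = [int(a) if a.isdigit() else env[a] for a in args]
        if fn is not None:
            values.append(fn)
        env[out] = LIBRARY[command](*values)
    return env


script = """
parts = distribute nums 2
doubled = process parts double
pairs = join parts doubled
merged = aggregate parts doubled
"""
code_blocks = {"double": "def double(x):\n    return x * 2\n"}
result = run(interpret(script, code_blocks), {"nums": [[1, 2, 3, 4]]})
```

Separating `interpret` from `run` keeps the "generate a program" step distinct from execution, so a real backend could schedule the generated steps across processes without changing the script.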
13. A method for providing parallel-processing-capable commands, the method comprising the steps of:
interpreting a process scripting command, specifying a process input data and either a process executable file comprising one or more process functions or a process code block comprising the one or more process functions, to generate computer-executable instructions for applying, in parallel across one or more processes, the one or more process functions to the process input data to generate a process output data;
interpreting a distribute scripting command, specifying at least a distribute input data, to generate computer-executable instructions for dividing, in parallel across one or more processes, the distribute input data into two or more subdivisions representing a distribute output data;
interpreting an aggregate scripting command, specifying multiple aggregate input data, to generate computer-executable instructions for combining, in parallel across one or more processes, the multiple aggregate input data into an aggregate output data;
interpreting a join scripting command, specifying a first and second join input data, both having an equivalent number of segments, to generate computer-executable instructions for combining, in parallel across one or more processes, each segment of the first join input data with a corresponding segment of the second join input data to form a join output data; and
interpreting a cross-product scripting command, specifying a first and second cross-product input data, to generate computer-executable instructions for combining, in parallel across one or more processes, each segment of the first cross-product input data with each segment of the second cross-product input data to form a cross-product output data.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
Specification