Parallel processing method and system using a lazy parallel data type to reduce inter-processor communication
First Claim
Patent Images
1. A parallel processing method for executing a parallel program comprising:
- subdividing a group of processors into subgroups, with one sub-group per parallel function call;
distributing elements of a collection oriented data type across the processors in a group, such that each of the processors has local elements of the collection oriented data type, and wherein the processors in the group access different local elements of the same collection oriented data type, the local elements being subportions that together form the collection oriented data type;
in each processor, performing data parallel operations locally on the local elements of the processor;
deferring distribution of the local elements on each processor to another processor until an operation is performed on the collection oriented data type requiring that the elements be re-distributed.
2 Assignments
0 Petitions
Accused Products
Abstract
A parallel programming system provides a lazy collection oriented data type that reduces inter-processor communication in programs executed on parallel computers. The lazy collection oriented data type is provided as a data type in a parallel programming language. The parallel language supports both data-parallel and control-parallel operations. These operations take advantage of the lazy collection oriented data type to defer or reduce inter-processor communication until an operation on the data type requires that it be balanced across a set of processors.
206 Citations
19 Claims
-
1. A parallel processing method for executing a parallel program comprising:
-
subdividing a group of processors into subgroups, with one sub-group per parallel function call;
distributing elements of a collection oriented data type across the processors in a group, such that each of the processors has local elements of the collection oriented data type, and wherein the processors in the group access different local elements of the same collection oriented data type, the local elements being subportions that together form the collection oriented data type;
in each processor, performing data parallel operations locally on the local elements of the processor;
deferring distribution of the local elements on each processor to another processor until an operation is performed on the collection oriented data type requiring that the elements be re-distributed. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
when making a function call on the collection oriented data type, checking the identifier to determine whether the collection oriented data type is not balanced, and if not, redistributing the elements of the collection oriented data type across the processors so that the elements of the collection oriented data type are balanced across processors before executing the function call.
-
-
3. The method of claim 2 wherein the collection oriented data type is represented as a data structure associated with each of the processors, and the structure associated with each processor includes a field indicating length, a field indicating number of elements locally managed by the processor, a field indicating a pointer to the local elements stored in memory associated with the processor, and a flag indicating whether the collection oriented data type is in a lazy state.
-
4. The method of claim 1 further including:
while executing data parallel operations on the collection oriented data type, deferring re-distributing the elements of the collection oriented data type among the processors until the program recurses and invokes a split function to split the program into two or more recursive function calls, each recursive function call processed by a sub-group of processors.
-
5. The method of claim 1 further including:
-
while executing data parallel operations on the collection oriented data type, marking the collection oriented data type as unbalanced when the processors perform operations on respective elements of the data type that cause an unequal number of elements for each processor to be created as the result of the operation;
deferring re-distributing the elements of the collection oriented data type among the processors until the program invokes a function on the collection oriented data type that requires the collection oriented data type to be in a balanced state.
-
-
6. The method of claim 1 wherein the program comprises recursive function calls forming a recursion tree, the tree having a top node representing a first recursive call and subsequent child nodes representing recursive calls on the collection oriented data type, and further comprising:
-
on each recursive function call down the recursion tree, re-distributing the collection oriented data type to two or more processors;
upon returning up the recursion tree, deferring re-distributing the elements of the collection oriented data type among the processors until the top of the tree is reached.
-
-
7. The method of claim 6 wherein the program includes an append operation to append results of recursive function calls, and further comprising:
upon returning up the recursion tree, deferring appending elements of the collection oriented data type managed locally by each of the processors until the top of the tree is reached.
-
8. A computer readable medium having instructions for performing the steps of claim 1.
-
9. A parallel processing method for executing a parallel program comprising:
-
distributing elements of a collection oriented data type across processors in a parallel computer, such that each of the processors has local elements of the collection oriented data type;
in each processor, performing data parallel operations locally on the local elements of the processor;
marking the collection oriented data type as unbalanced when a data parallel operation causes the elements to be unbalanced across the processors;
when making a function call on the collection oriented data type, checking to determine whether the collection oriented data type is not balanced, and if not, redistributing the elements of the collection oriented data type across the processors so that the elements of the collection oriented data type are balanced across processors before executing the function call; and
when redistribution is required, inserting programming code into the parallel program to redistribute the elements before the function call occurs. - View Dependent Claims (10)
-
-
11. A computer readable medium comprising:
-
a preprocessor for converting a parallel program written in a programming language with control parallel and data parallel operations to an imperative, sequential programming code and calls to a message passing interface for inter-processor communication on a parallel computer;
wherein the preprocessor is operable to insert a check function into the translated program to test whether a collection oriented data type is in a lazy state and to insert code for redistributing the collection oriented data type when it is in a lazy state;
a compiler for compiling the imperative programming code and the calls to the message passing interface generated by the preprocessor;
a message passing interface run-time library for performing the calls to the message passing interface;
a parallel language run-time library including code for redistributing the collection oriented data type when it is in a lazy state; and
a linker for linking the compiled imperative programming code and calls to the message passing interface with the message passing interface run-time library and the parallel language run-time library.
-
-
12. A parallel processing method for executing a parallel program comprising:
-
subdividing a group of processors into subgroups, with one sub-group per parallel function call;
distributing elements of a collection oriented data type across the processors in a group, such that each of the processors has local elements of the collection oriented data type;
in each processor, performing data parallel operations locally on the local elements of the processor;
deferring distribution of the local elements on each processor to another processor until an operation is performed on the collection oriented data type requiring that the elements be re-distributed;
translating the parallel program into imperative programming code and calls to a message passing interface;
for an operation on the collection oriented data type that require a re-distribution of the elements across processors, inserting programming code into the translated program for redistributing the elements before the operation is performed. - View Dependent Claims (13, 14, 15, 16, 17, 18, 19)
when making a function call on the collection oriented data type, checking the identifier to determine whether the collection oriented data type is not balanced, and if not, redistributing the elements of the collection oriented data type across the processors so that the elements of the collection oriented data type are balanced across processors before executing the function call.
-
-
14. The method of claim 13 wherein the collection oriented data type is represented as a data structure associated with each of the processors, and the structure associated with each processor includes a field indicating length, a field indicating number of elements locally managed by the processor, a field indicating a pointer to the local elements stored in memory associated with the processor, and a flag indicating whether the collection oriented data type is in a lazy state.
-
15. The method of claim 12 further including:
while executing data parallel operations on the collection oriented data type, deferring re-distributing the elements of the collection oriented data type among the processors until the program recurses and invokes a split function to split the program into two or more recursive function calls, each recursive function call processed by a sub-group of processors.
-
16. The method of claim 12 further including:
-
while executing data parallel operations on the collection oriented data type, marking the collection oriented data type as unbalanced when the processors perform operations on respective elements of the data type that cause an unequal number of elements for each processor to be created as the result of the operation;
deferring re-distributing the elements of the collection oriented data type among the processors until the program invokes a function on the collection oriented data type that requires the collection oriented data type to be in a balanced state.
-
-
17. The method of claim 12 wherein the program comprises recursive function calls forming a recursion tree, the tree having a top node representing a first recursive call and subsequent child nodes representing recursive calls on the collection oriented data type, and further comprising:
-
on each recursive function call down the recursion tree, re-distributing the collection oriented data type to two or more processors;
upon returning up the recursion tree, deferring re-distributing the elements of the collection oriented data type among the processors until the top of the tree is reached.
-
-
18. The method of claim 17 wherein the program includes an append operation to append results of recursive function calls, and further comprising:
upon returning up the recursion tree, deferring appending elements of the collection oriented data type managed locally by each of the processors until the top of the tree is reached.
-
19. A computer readable medium having instructions for performing the steps of claim 13.
Specification