Opportunistic task threading in a shared-memory, multi-processor computer system
First Claim
1. A method for parallel processing implemented by a computer having a plurality of processors including a main processor for executing a main process and at least one parallel needle processor for executing threads initiated by the main process, the computer also having a memory shared by the plurality of processors, wherein the execution time of the main process is reduced by decreasing the overhead associated with separation from the main process of a plurality of separable threads which are executed in parallel by the plurality of processors when available, the method comprising:
(a) determining, by the main processor, if said at least one parallel needle processor is available to execute a first thread;
(b) reserving, by the main processor, the exclusive right to use the parallel needle processor responsive to the parallel needle processor being determined to be available, and executing the first thread on the main processor responsive to the parallel needle processor not being available;
(c) constructing a packaging data structure including the first thread, by the main processor, and transferring the packaging data structure for execution on the reserved needle processor responsive to the reservation being successful;
(d) creating, by the main processor, a future object in the main process while the first thread is being executed on the reserved needle processor so as to allow the main process to continue execution prior to obtaining the result;
(e) utilizing, by the main processor, the future object in the main process as if the future object were the result of the execution of the first thread;
(f) returning a result of the execution of the first thread to the memory so as to resolve the future object.
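Steps (a) through (f) describe a try-reserve-or-run-inline dispatch: the main process atomically tests and claims a free needle processor, packages the thread for it, and continues against a future object that the worker later resolves. A minimal sketch in Python, modeling each needle processor as a reservable worker thread (names such as `NeedleWorker` and `spawn` are illustrative, not from the patent):

```python
import threading
from concurrent.futures import Future

class NeedleWorker:
    """One parallel 'needle' processor, modeled as a reservable worker."""
    def __init__(self):
        self._reserved = threading.Lock()    # held while the worker is in use

    def try_reserve(self):
        # Steps (a)+(b): atomically test availability and claim exclusive use.
        return self._reserved.acquire(blocking=False)

    def run(self, thunk, future):
        # Steps (c)+(f): execute the packaged thread, resolve the future,
        # and release the reservation so the worker becomes available again.
        def body():
            try:
                future.set_result(thunk())
            finally:
                self._reserved.release()
        threading.Thread(target=body).start()

def spawn(workers, thunk):
    """Dispatch `thunk` to a free worker, or run it inline on the main processor."""
    for w in workers:
        if w.try_reserve():
            fut = Future()                   # step (d): future object for the result
            w.run(thunk, fut)
            return fut                       # step (e): caller uses fut as the result
    # Second branch of step (b): no worker free, execute in the main process.
    fut = Future()
    fut.set_result(thunk())
    return fut

workers = [NeedleWorker() for _ in range(2)]
f = spawn(workers, lambda: 6 * 7)
print(f.result())  # blocks until the future is resolved; prints 42
```

The point of the future object in step (d) is that `spawn` returns immediately after a successful reservation, so the main process only blocks if and when it actually touches the result.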
Abstract
Method and apparatus are provided in a shared memory, multi-processor computer system for reducing the time required to run an application program on the processors of the system by reducing the overhead associated with the separation of the program and the initiation of the parallel execution of the tasks. The system has a plurality of processors wherein the application program is separated into different tasks and the tasks are executed in parallel on the processors of the system. The system further includes a process enabling the execution of either opportunistic or queued threads. In the preferred embodiment, the method includes the steps of (a) determining if one of the processors is free to execute a first task, and (b) performing the first task if step (a) determines that none of the processors are free. The method also includes the steps of (c) reserving the one processor for the first task if step (a) determines the one processor is free, and (d) constructing and transferring a task data structure for the first task to the reserved processor. Finally, the method includes the steps of (e) creating a future object for the first task, (f) performing the first task on the one processor, and (g) placing the results of step (f) in the future object. An alternative embodiment includes the ability to stack or queue threads onto a Global Queue to await execution by a free processor.
18 Claims
1. A method for parallel processing implemented by a computer having a plurality of processors including a main processor for executing a main process and at least one parallel needle processor for executing threads initiated by the main process, the computer also having a memory shared by the plurality of processors, wherein the execution time of the main process is reduced by decreasing the overhead associated with separation from the main process of a plurality of separable threads which are executed in parallel by the plurality of processors when available, the method comprising:
(a) determining, by the main processor, if said at least one parallel needle processor is available to execute a first thread;
(b) reserving, by the main processor, the exclusive right to use the parallel needle processor responsive to the parallel needle processor being determined to be available, and executing the first thread on the main processor responsive to the parallel needle processor not being available;
(c) constructing a packaging data structure including the first thread, by the main processor, and transferring the packaging data structure for execution on the reserved needle processor responsive to the reservation being successful;
(d) creating, by the main processor, a future object in the main process while the first thread is being executed on the reserved needle processor so as to allow the main process to continue execution prior to obtaining the result;
(e) utilizing, by the main processor, the future object in the main process as if the future object were the result of the execution of the first thread;
(f) returning a result of the execution of the first thread to the memory so as to resolve the future object.

Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A method for parallel processing implemented by a computer having a plurality of processors including a main processor for executing a main process and at least one parallel needle processor for executing threads initiated by the main process, the computer also having a memory shared by the plurality of processors including a portion designated as a thread queue, wherein the execution time of the main process is reduced by decreasing the overhead associated with separation from the main process of a plurality of separable threads which are executed in parallel by the plurality of processors, the method comprising:
(a) constructing a packaging data structure including a first thread, by the main processor, for execution on a parallel needle processor;
(b) creating, by the main processor, a future object in the main process for the result while the first thread is being executed so as to allow the main process to continue execution prior to obtaining the result;
(c) obtaining, by the main processor, a lock for the thread queue so as to prevent modification of the thread queue by a competing process;
(d) placing, by the main processor, the first thread in the thread queue;
(e) releasing, by the main processor, the lock for the thread queue so as to allow modification of the thread queue;
(f) utilizing, by the main processor, the future object in the main process as if the future object were the result of the thread execution;
(g) initializing, by the main processor, a needle process on the parallel needle processor;
(h) obtaining, by the main processor, the lock for the thread queue so as to prevent modification of the thread queue by a competing process;
(i) removing, by the main processor, the first thread from the thread queue;
(j) releasing, by the main processor, the lock for the thread queue so as to allow modification of the thread queue; and
(k) executing, by the needle process, the removed first thread so as to determine and return the result for resolving the future object.

Dependent claims: 11, 12, 13
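The queued variant above replaces the per-processor reservation with a shared thread queue: the producer and the needle process each take the queue lock around their insert and remove, as in steps (c) through (e) and (h) through (j). A minimal Python sketch, assuming a lock-guarded deque as the Global Queue (the names `enqueue`, `needle_process`, and `thread_queue` are illustrative, not from the patent):

```python
import threading
from collections import deque
from concurrent.futures import Future

queue_lock = threading.Lock()   # guards the Global Queue against competing processes
thread_queue = deque()          # holds (packaged thread, future) pairs

def enqueue(thunk):
    """Main-process side: package the thread, create its future, enqueue it."""
    fut = Future()              # step (b): future created before execution
    with queue_lock:            # steps (c)-(e): lock, insert, unlock
        thread_queue.append((thunk, fut))
    return fut                  # step (f): caller treats fut as the result

def needle_process():
    """Needle-process side: steps (h)-(k), dequeue under the lock and execute."""
    while True:
        with queue_lock:        # steps (h)-(j): lock, remove, unlock
            if not thread_queue:
                return          # queue drained; in practice a worker would block instead
            thunk, fut = thread_queue.popleft()
        fut.set_result(thunk()) # step (k): resolve the future with the result

futures = [enqueue(lambda i=i: i * i) for i in range(4)]
worker = threading.Thread(target=needle_process)   # step (g): initialize the needle process
worker.start()
worker.join()
print([f.result() for f in futures])  # prints [0, 1, 4, 9]
```

Note that only the queue operations are inside the critical section; the thread itself executes outside the lock, so several needle processes can drain the queue concurrently.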
14. Apparatus for parallel processing implemented by a computer having a plurality of processors including a main processor for executing a main process and at least one parallel needle processor for executing threads initiated by the main process, the computer also having a memory shared by the plurality of processors, wherein the execution time of the main process is reduced by decreasing the overhead associated with separation from the main process of a plurality of separable threads which are executed in parallel by the plurality of processors when available, the apparatus comprising:
means for determining if a parallel needle processor is available to execute a first thread;
means for reserving an exclusive right to use the at least one parallel needle processor if the at least one parallel needle processor is determined to be available, and means for executing the first thread on the main processor if the at least one parallel needle processor is determined to be unavailable;
means for creating a packaging data structure including the first thread for execution on the reserved parallel needle processor responsive to the reservation being successful;
means for transferring the packaging data structure to the reserved parallel needle processor;
means for creating a future object in the main process while the first thread is being executed on the reserved parallel needle processor so as to allow the main process to continue execution prior to obtaining a result for the future object;
means for utilizing the future object in the main process as if the future object were the result of the execution of the first thread; and
means for returning a result of the execution of the first thread to the memory so as to resolve the future object.

Dependent claims: 15, 16, 17, 18
Specification