Parallel priority queue utilizing parallel heap on many-core processors for accelerating priority-queue-based applications
First Claim
Patent Images
1. A system, comprising:
- a host processor comprising a central processing unit (CPU);
a graphics processing unit (GPU) comprising a many-core architecture;
kernel code executable by the GPU that, when executed by the GPU, causes the GPU to;
execute a parallel heap manager and a priority queue application concurrently in the GPU by assigning at least one kernel function of the parallel heap manager to a first stream and at least one kernel function of the priority queue application to a second stream;
implement, by the priority queue application in the first stream, a priority queue as a parallel heap where a plurality of operations performed on the priority queue are performed in parallel on the GPU; and
maintain, by the parallel heap manager in the second stream, an order of priority as a plurality of queue entries are inserted and deleted from the priority queue; and
host code executable by the host processor that, when executed, causes the host processor to;
synchronize, by a controller implemented by the host processor, operations of the priority queue application and the parallel heap manager using a global barrier.
1 Assignment
0 Petitions
Accused Products
Abstract
Disclosed are various embodiments for a parallel priority queue implemented on one or more many-core processors and/or multi-core processors such as those in general-purpose graphics processing units (GPGPUs). According to various embodiments, a priority may be determined according to a timestamp of an item, such as an event or an entry, in a priority queue. A priority queue interface may comprise functions to insert and remove entries from the priority queue. Priority order of the entries may be maintained as the entries are inserted and removed from the queue.
8 Citations
20 Claims
-
1. A system, comprising:
-
a host processor comprising a central processing unit (CPU); a graphics processing unit (GPU) comprising a many-core architecture; kernel code executable by the GPU that, when executed by the GPU, causes the GPU to; execute a parallel heap manager and a priority queue application concurrently in the GPU by assigning at least one kernel function of the parallel heap manager to a first stream and at least one kernel function of the priority queue application to a second stream; implement, by the priority queue application in the first stream, a priority queue as a parallel heap where a plurality of operations performed on the priority queue are performed in parallel on the GPU; and maintain, by the parallel heap manager in the second stream, an order of priority as a plurality of queue entries are inserted and deleted from the priority queue; and host code executable by the host processor that, when executed, causes the host processor to; synchronize, by a controller implemented by the host processor, operations of the priority queue application and the parallel heap manager using a global barrier. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
-
-
12. A method, comprising:
-
implementing, by a graphics processing unit (GPU) comprising a plurality of streaming multi-core processors, a parallel heap manager and a priority queue application concurrently in the GPU, wherein at least one kernel function of the parallel heap manager is assigned to a first stream and at least one kernel function of the priority queue application is assigned to a second stream; implementing, by the priority queue application in the first stream, a priority queue as a parallel heap where a plurality of operations performed on the priority queue are performed in parallel; providing, by a host processor comprising a central processing unit (CPU) in communication with the GPU, a programmatic interface for retrieving, inserting, and deleting a plurality of queue entries in the parallel heap; maintaining, by a controller executed in the host processor, a priority order as the plurality of queue entries are inserted and deleted from the priority queue; and synchronizing, by the controlled executed in the host processor, a plurality of operations performed by the parallel heap manager and the priority queue application on the parallel heap of the GPU using a global barrier. - View Dependent Claims (13, 14, 15, 16, 17, 18, 19, 20)
-
Specification