System and Method for NUMA-Aware Locking Using Lock Cohorts

US 20130290583A1
Filed: 04/27/2012
Published: 10/31/2013
Est. Priority Date: 04/27/2012
Status: Active Grant

Abstract

The system and methods described herein may be used to implement NUMA-aware locks that employ lock cohorting. These lock cohorting techniques may reduce the rate of lock migration by relaxing the order in which the lock schedules the execution of critical code sections by various threads, allowing lock ownership to remain resident on a single NUMA node longer than under strict FIFO ordering, thus reducing coherence traffic and improving aggregate performance. A NUMA-aware cohort lock may include a global shared lock that is thread-oblivious, and multiple node-level locks that provide cohort detection. The lock may be constructed from non-NUMA-aware components (e.g., spin-locks or queue locks) that are modified to provide thread-obliviousness and/or cohort detection. Lock ownership may be passed from one thread that holds the lock to another thread executing on the same NUMA node without releasing the global shared lock.
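As a rough illustration of the structure the abstract describes, the following Python sketch combines one global lock with one local lock per node. It is a sketch under stated assumptions, not the patented implementation: the class names `CohortLock` and `_NodeLock`, the explicit `node` argument, and the `max_handoffs` bound (added here so one node cannot keep the lock indefinitely) are hypothetical, and Python threads are not actually pinned to NUMA nodes. `threading.Lock` plays the thread-oblivious global lock because it may be released by a thread other than the one that acquired it.

```python
import threading


class _NodeLock:
    """Hypothetical per-node lock with cohort detection (it counts waiting peers)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._held = False
        self._waiters = 0
        self._global_inherited = False  # True when the global lock is passed along
        self._handoffs = 0

    def acquire(self):
        """Block until the local lock is ours; return True if the global lock came with it."""
        with self._cond:
            self._waiters += 1
            while self._held:
                self._cond.wait()
            self._waiters -= 1
            self._held = True
            inherited = self._global_inherited
            self._global_inherited = False
            return inherited

    def release(self, global_lock, max_handoffs):
        """Hand off within the node if a peer is waiting; otherwise free the global lock too."""
        with self._cond:
            if self._waiters > 0 and self._handoffs < max_handoffs:
                self._handoffs += 1
                self._global_inherited = True  # global lock stays held for this node
            else:
                self._handoffs = 0
                global_lock.release()  # let threads on other nodes compete
            self._held = False
            self._cond.notify()


class CohortLock:
    """Hypothetical composite lock: one global lock plus one local lock per node."""

    def __init__(self, num_nodes, max_handoffs=8):
        self._global = threading.Lock()  # releasable by any thread
        self._nodes = [_NodeLock() for _ in range(num_nodes)]
        self._max_handoffs = max_handoffs

    def acquire(self, node):
        # Take the local lock first; take the global lock only if it was not inherited.
        if not self._nodes[node].acquire():
            self._global.acquire()

    def release(self, node):
        self._nodes[node].release(self._global, self._max_handoffs)
```

A caller would wrap its critical section in `lock.acquire(node)` and `lock.release(node)`, where `node` identifies the cluster the calling thread runs on. How a thread learns its node is outside the scope of this sketch.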

30 Citations

20 Claims

1. A method, comprising:
- performing by a computer:
  
  beginning execution of a multithreaded application that comprises one or more requests to acquire a shared lock, wherein the shared lock controls access to a critical section of code or a shared resource by concurrently executing threads of the application, and wherein only one thread can hold the shared lock at a time;
  
  a thread of the application acquiring the shared lock, wherein the thread is executing on one of a plurality of processor cores in a cluster of processor cores that share a memory, and wherein the cluster of processor cores is one of a plurality of clusters of processor cores on which threads of the multithreaded application are executing;
  
  in response to acquiring the shared lock, the thread:
  
  accessing the critical section of code or shared resource; and
  
  subsequent to said accessing:
  
  determining whether any other threads of the application that are executing on a processor core in the cluster of processor cores are waiting to access the critical section of code or shared resource; and
  
  in response to determining that at least one other thread of the application that is executing on a processor core in the cluster of processor cores is waiting to acquire the shared lock, passing ownership of a cluster-specific lock that is associated with the critical section of code or shared resource to another thread of the application that is executing on a processor core in the cluster of processor cores and that is waiting to access the critical section of code or shared resource without releasing the shared lock, wherein said passing allows the other thread to gain access to the critical section of code or shared resource.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
  - 2. The method of claim 1, wherein the method further comprises, prior to acquiring the shared lock, the thread acquiring ownership of the cluster-specific lock; and wherein said acquiring the shared lock is performed in response to the thread acquiring ownership of the cluster-specific lock.
  - 3. The method of claim 1, further comprising, subsequent to said passing, the other thread accessing the critical section of code or shared resource.
  - 4. The method of claim 1, further comprising, subsequent to said passing, the other thread releasing the cluster-specific lock.
  - 5. The method of claim 1, further comprising, subsequent to said passing, the other thread releasing the shared lock.
  - 6. The method of claim 1, further comprising a thread executing on a processor core in another cluster of processor cores acquiring the shared lock and accessing the critical section of code or shared resource.
  - 7. The method of claim 1, wherein said passing comprises updating an indicator to indicate that the other thread is the owner of the cluster-specific lock.
  - 8. The method of claim 1, further comprising, subsequent to said passing, the other thread passing ownership of the cluster-specific lock to yet another thread of the application that is executing on a processor core in the cluster of processor cores and that is waiting to access the critical section of code or shared resource without releasing the shared lock.
  - 9. The method of claim 1, wherein said acquiring the shared lock comprises:
    - attempting to acquire the shared lock; and
      
      in response to failing to acquire the shared lock:
      
      acquiring ownership of the cluster-specific lock; and
      
      in response to acquiring ownership of the cluster-specific lock, repeating said attempting to acquire the shared lock one or more times until an attempt to acquire the shared lock is successful.
  - 10. The method of claim 1, wherein at least one of the shared lock and the cluster-specific lock comprises a spin-type lock, a ticket-based lock, a queue-based lock, a test-and-test-and-set lock, or a back-off lock.
  - 11. The method of claim 1, wherein the shared lock and one or more cluster-specific locks comprise non-NUMA-aware locks that collectively implement a NUMA-aware composite lock usable to manage access to the critical section of code or shared resource.
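Claims 7, 10 and 11 refer to passing ownership by updating an indicator and to building the composite lock from ordinary components such as ticket-based locks. One possible reading, sketched below in Python, is a ticket lock whose `now_serving` counter acts as the ownership indicator and whose `has_waiters()` query provides cohort detection. The class name `TicketLock`, its method names, and the internal `threading.Lock` that stands in for an atomic fetch-and-add instruction are all assumptions for illustration, not details taken from the patent.

```python
import threading
import time


class TicketLock:
    """Hypothetical FIFO ticket lock with a cohort-detection query."""

    def __init__(self):
        self._guard = threading.Lock()  # stands in for atomic fetch-and-add
        self._next_ticket = 0           # next ticket to hand out
        self._now_serving = 0           # indicator: ticket of the current owner

    def acquire(self):
        """Take a ticket and wait until the indicator reaches it."""
        with self._guard:
            my_ticket = self._next_ticket
            self._next_ticket += 1
        while True:
            with self._guard:
                if self._now_serving == my_ticket:
                    return my_ticket
            time.sleep(0)  # yield to other threads while waiting

    def has_waiters(self):
        """Cohort detection, meant to be called by the owner: is a later ticket outstanding?"""
        with self._guard:
            return self._next_ticket - self._now_serving > 1

    def release(self):
        """Advance the indicator so the holder of the next ticket becomes the owner."""
        with self._guard:
            self._now_serving += 1
```

In a cohort arrangement, an owner could call `has_waiters()` on its cluster-specific lock before deciding whether to keep the shared lock held while calling `release()` locally.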

12. A system, comprising:
- a plurality of processor core clusters, each of which comprises two or more processor cores that support multithreading and that share a local memory;
  
  a system memory coupled to the plurality of processor core clusters;
  
  wherein the system memory stores program instructions that when executed on one or more processor cores in the plurality of processor core clusters cause the one or more processor cores to perform:
  
  beginning execution of a multithreaded application that comprises one or more requests to acquire a shared lock, wherein the shared lock controls access to a critical section of code or a shared resource by concurrently executing threads of the application, and wherein only one thread can hold the shared lock at a time;
  
  a thread of the application acquiring the shared lock, wherein the thread is executing on one of a plurality of processor cores in a cluster of processor cores that share a memory, and wherein the cluster of processor cores is one of two or more clusters of processor cores on which threads of the multithreaded application are executing;
  
  in response to acquiring the shared lock, the thread:
  
  accessing the critical section of code or shared resource; and
  
  subsequent to said accessing:
  
  determining whether any other threads of the application that are executing on a processor core in the cluster of processor cores are waiting to access the critical section of code or shared resource; and
  
  in response to determining that at least one other thread of the application that is executing on a processor core in the cluster of processor cores is waiting to acquire the shared lock, passing ownership of a cluster-specific lock that is associated with the critical section of code or shared resource to another thread of the application that is executing on a processor core in the cluster of processor cores and that is waiting to access the critical section of code or shared resource without releasing the shared lock, wherein said passing allows the other thread to gain access to the critical section of code or shared resource.
- Dependent Claims (13, 14, 15, 16)
  - 13. The system of claim 12, wherein when executed on the one or more processor cores in the plurality of processor core clusters, the program instructions further cause the one or more processor cores to perform, prior to acquiring the shared lock, the thread acquiring ownership of the cluster-specific lock; and wherein said acquiring the shared lock is performed in response to the thread acquiring ownership of the cluster-specific lock.
  - 14. The system of claim 12, wherein when executed on the one or more processor cores in the plurality of processor core clusters, the program instructions further cause the one or more processor cores to perform, subsequent to said passing:
    - the other thread performing one or more of:
      
      accessing the critical section of code or shared resource;
      
      releasing the shared lock;
      
      or releasing the cluster-specific lock.
  - 15. The system of claim 12, wherein when executed on the one or more processor cores in the plurality of processor core clusters, the program instructions further cause the one or more processor cores to perform, subsequent to said passing, the other thread passing ownership of the cluster-specific lock to yet another thread of the application that is executing on a processor core in the cluster of processor cores and that is waiting to access the critical section of code or shared resource without releasing the shared lock.
  - 16. The system of claim 12, wherein said acquiring the shared lock comprises:
    - attempting to acquire the shared lock; and
      
      in response to failing to acquire the shared lock:
      
      acquiring ownership of the cluster-specific lock; and
      
      in response to acquiring ownership of the cluster-specific lock, repeating said attempting to acquire the shared lock one or more times until an attempt to acquire the shared lock is successful.
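Claims 9 and 16 describe an acquisition path that first tries the shared lock directly and falls back to the cluster-specific lock only on failure. A minimal Python sketch of that sequence follows. The function name `acquire_shared`, its string return values, and the use of `threading.Lock` for both locks are assumptions made for illustration, and the sketch covers acquisition only, not the matching release or hand-off logic.

```python
import threading
import time


def acquire_shared(shared_lock, cluster_lock):
    """Return "fast" if the first attempt succeeded, otherwise "slow"."""
    # Step 1: attempt to acquire the shared lock without blocking.
    if shared_lock.acquire(blocking=False):
        return "fast"
    # Step 2: on failure, acquire ownership of the cluster-specific lock.
    cluster_lock.acquire()
    # Step 3: repeat the attempt on the shared lock until one succeeds.
    while not shared_lock.acquire(blocking=False):
        time.sleep(0)  # yield to other threads between attempts
    return "slow"
```

On the fast path the caller holds only the shared lock; on the slow path it holds both, so a real implementation would need to record which path was taken in order to release correctly.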

17. A non-transitory, computer-readable storage medium storing program instructions that when executed on one or more computers cause the one or more computers to perform:
- beginning execution of a multithreaded application that comprises one or more requests to acquire a shared lock, wherein the shared lock controls access to a critical section of code or a shared resource by concurrently executing threads of the application, and wherein only one thread can hold the shared lock at a time;
  
  a thread of the application acquiring the shared lock, wherein the thread is executing on one of a plurality of processor cores in a cluster of processor cores that share a memory, and wherein the cluster of processor cores is one of a plurality of clusters of processor cores on which threads of the multithreaded application are executing;
  
  in response to acquiring the shared lock, the thread:
  
  accessing the critical section of code or shared resource; and
  
  subsequent to said accessing:
  
  determining whether any other threads of the application that are executing on a processor core in the cluster of processor cores are waiting to access the critical section of code or shared resource; and
  
  in response to determining that at least one other thread of the application that is executing on a processor core in the cluster of processor cores is waiting to acquire the shared lock, passing ownership of a cluster-specific lock that is associated with the critical section of code or shared resource to another thread of the application that is executing on a processor core in the cluster of processor cores and that is waiting to access the critical section of code or shared resource without releasing the shared lock, wherein said passing allows the other thread to gain access to the critical section of code or shared resource.
- Dependent Claims (18, 19, 20)
  - 18. The non-transitory, computer-readable storage medium of claim 17, wherein when executed on the one or more computers, the program instructions further cause the one or more computers to perform, prior to acquiring the shared lock, the thread acquiring ownership of the cluster-specific lock; and wherein said acquiring the shared lock is performed in response to the thread acquiring ownership of the cluster-specific lock.
  - 19. The non-transitory, computer-readable storage medium of claim 17, wherein when executed on the one or more computers, the program instructions further cause the one or more computers to perform, subsequent to said passing:
    - the other thread performing one or more of:
      
      accessing the critical section of code or shared resource;
      
      releasing the shared lock;
      
      or releasing the cluster-specific lock.
  - 20. The non-transitory, computer-readable storage medium of claim 17, wherein when executed on the one or more computers, the program instructions further cause the one or more computers to perform, subsequent to said passing, the other thread passing ownership of the cluster-specific lock to yet another thread of the application that is executing on a processor core in the cluster of processor cores and that is waiting to access the critical section of code or shared resource without releasing the shared lock.

Current Assignee
Oracle International Corporation (Oracle Corporation)
Original Assignee
Oracle International Corporation (Oracle Corporation)
Inventors
Dice, David; Marathe, Virendra J.; Shavit, Nir N.

Granted Patent

US 8,694,706 B2
Field of Search
US Class Current

710/200
CPC Class Codes

G06F 9/526 Mutual exclusion algorithms
