Resource management in a clustered computer system
Abstract
Methods, systems, and devices are provided for managing resources in a computing cluster. The managed resources include cluster nodes themselves, as well as sharable resources such as memory buffers and bandwidth credits that may be used by one or more nodes. Resource management includes detecting failures and possible failures by node software, node hardware, interconnects, and system area network switches and taking steps to compensate for failures and prevent problems such as uncoordinated access to a shared disk. Resource management also includes reallocating sharable resources in response to node failure, demands by application programs, or other events. Specific examples provided include failure detection by remote memory probes, emergency communication through a shared disk, and sharable resource allocation with minimal locking.
20 Claims
1. A method for managing resources in a cluster, the method comprising the computer-implemented steps of:
unlocking a global queue of resources which is guarded by a lock;
updating the unlocked global queue;
locking the updated global queue; and
updating a local queue of resources while executing an interrupt handler.
(Dependent claims: 2, 3, 4)
5. A computer system comprising:
at least two interconnected nodes capable of presenting a uniform system image such that an application program views the interconnected nodes as a single computing platform; and
a management means for managing computational resources for use by the nodes, wherein the management means comprises a queue and lock management means for managing access to a global and local groups of sharable resources using a single lock and at least one interrupt handler.
(Dependent claims: 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20)
18. A computer storage medium having a configuration that represents data and instructions which will cause at least a portion of a computer system to perform method steps for managing resources in a cluster computing system, the method steps comprising the steps of unlocking a global queue of resources which is guarded by a lock, updating the unlocked global queue, locking the updated global queue, and updating a local queue of resources while executing an interrupt handler.
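The two-queue arrangement recited in claims 1 and 18 can be illustrated with a minimal sketch. It is an illustration only and is not taken from the patent: the class ResourcePools, its methods take_from_global and on_interrupt, and the buffer names are all hypothetical. The sketch assumes a conventional acquire, update, release critical section around the global queue and does not attempt to reproduce the exact lock semantics or step ordering the claims recite. The interrupt handler is simulated by an ordinary function call.

```python
import threading
from collections import deque


class ResourcePools:
    """Hypothetical two-level pool: one lock-guarded global queue of
    sharable resources and one local queue owned by a single node."""

    def __init__(self, resources):
        self.global_lock = threading.Lock()   # the single lock guarding the global queue
        self.global_queue = deque(resources)  # shared by all nodes
        self.local_queue = deque()            # updated only inside on_interrupt

    def take_from_global(self, count):
        """Update the global queue inside a short critical section and
        return the resources that were removed from it."""
        with self.global_lock:
            n = min(count, len(self.global_queue))
            return [self.global_queue.popleft() for _ in range(n)]

    def on_interrupt(self, batch):
        """Simulated interrupt handler (assumption: handlers on a node do
        not run concurrently with each other, so the local queue needs no
        lock of its own)."""
        self.local_queue.extend(batch)
        return len(batch)


if __name__ == "__main__":
    pools = ResourcePools(["buf0", "buf1", "buf2", "buf3"])
    batch = pools.take_from_global(3)
    pools.on_interrupt(batch)
    print(list(pools.local_queue), list(pools.global_queue))
```

In this sketch the lock is held only while the global queue is being changed, and the local queue is never touched outside the handler, which is one way to keep locking to a minimum.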
Specification