Data management application programming interface failure recovery in a parallel file system
Abstract
In a cluster of computing nodes having shared access to one or more volumes of data storage using a parallel file system, a method for managing the data storage includes initiating a session of a data management application on a session node selected from among the nodes in the cluster. The session node receives an event message in a session queue for processing by the data management application, responsive to a request submitted to the parallel file system by a source node among the nodes in the cluster to perform a file operation on a file in the data storage. Following a failure at the session node, the session queue is reconstructed so that processing of the event message by the data management application can continue after recovery from the failure, and the request can be fulfilled at the source node.
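The recovery step described above can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes, purely for illustration, that each source node keeps a record of the events it has sent and not yet had answered, so that a recovered session node can rebuild its queue by collecting those records. The names `Event`, `SourceNode`, `Session`, `reconstruct_session`, and `process_all` are hypothetical and do not come from the patent text.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """An event message generated by a file operation request (hypothetical shape)."""
    event_id: int
    source_node: str
    operation: str
    path: str


@dataclass
class SourceNode:
    """A node whose user application submits file operation requests.

    Assumption: the node remembers every event that has not yet been answered.
    """
    name: str
    pending: dict = field(default_factory=dict)  # event_id -> Event

    def submit(self, session, event_id, operation, path):
        event = Event(event_id, self.name, operation, path)
        self.pending[event_id] = event   # remember until the session responds
        session.queue.append(event)      # deliver to the session queue
        return event

    def complete(self, event_id):
        self.pending.pop(event_id, None)


@dataclass
class Session:
    """A data management session and its event queue on the session node."""
    session_node: str
    queue: deque = field(default_factory=deque)


def reconstruct_session(session_node, source_nodes):
    """Rebuild the session queue from the events still pending on source nodes."""
    session = Session(session_node)
    outstanding = [e for node in source_nodes for e in node.pending.values()]
    # Ordering by event_id is an illustrative choice, not taken from the source.
    for event in sorted(outstanding, key=lambda e: e.event_id):
        session.queue.append(event)
    return session


def process_all(session, nodes_by_name):
    """Drain the queue, answering each event so its source node can proceed."""
    handled = []
    while session.queue:
        event = session.queue.popleft()
        nodes_by_name[event.source_node].complete(event.event_id)
        handled.append(event.event_id)
    return handled
```

In this sketch, losing the `Session` object stands in for the failure at the session node. Calling `reconstruct_session` with the surviving source nodes yields a queue holding the unanswered events, and `process_all` then lets the data management application continue where it left off.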
113 Citations
63 Claims
1. In a cluster of computing nodes having shared access to one or more volumes of data storage using a parallel file system, a method for managing the data storage, comprising:
initiating a session of a data management application on a session node selected from among the nodes in the cluster;
receiving an event message in a session queue for processing by the data management application at the session node, responsive to a request submitted to the parallel file system by a user application on a source node among the nodes in the cluster to perform a file operation on a file in the data storage; and
following a failure at the session node, reconstructing the session queue so that processing of the event message by the data management application can continue after recovery from the failure.
(Dependent claims: 2-21)
22. Computing apparatus, comprising:
one or more volumes of data storage, arranged to store data; and
a plurality of computing nodes, linked to access the volumes of data storage using a parallel file system, and arranged so as to enable a data management application to initiate a data management session on a session node selected among the nodes in the cluster, so that when a request is submitted to the parallel file system by a user application on a source node among the nodes in the cluster to perform a file operation on a file in the data storage, an event message is received at the session node responsive to the request, for processing by the data management application, and so that following a failure at the session node, the session queue is reconstructed so that processing of the event message by the data management application can continue after recovery from the failure.
(Dependent claims: 23-42)
43. A computer software product for use in a cluster of computing nodes having shared access to one or more volumes of data storage using a parallel file system, the product comprising a computer-readable medium in which program instructions are stored, which instructions, when read by the computing nodes, cause a session of a data management application to be initiated on a session node selected among the nodes in the cluster, such that when a user application on a source node among the nodes in the cluster submits a request to the parallel file system to perform a file operation on a file in the data storage, an event message is received at the session node, for processing by the data management application, and such that following a failure at the session node, the session queue is reconstructed so that processing of the event message by the data management application can continue after recovery from the failure.
Specification