Parallel I/O read processing for use in clustered file systems having cache storage
Abstract
In one embodiment, a method includes determining a home node that corresponds to gateway (GW) nodes in a clustered file system, each GW node being eligible to process one or more read tasks, determining a peer GW eligibility value for more than one of the GW nodes in the clustered file system eligible to process one or more read tasks, and determining a single GW node from amongst the GW nodes having a highest peer GW eligibility value for each home node. Additionally, the method includes assigning and defining a size for one or more read task items for the GW nodes having the highest peer GW eligibility value for multiple home nodes based on a current dynamic profile of the GW nodes, and distributing workload to the GW nodes according to the size for each of the one or more read task items assigned to the GW nodes.
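The selection step in the abstract (one GW node with the highest peer GW eligibility value per home node) can be illustrated with a minimal sketch. The function name, the dictionary layout, and the numeric scores are assumptions for illustration only; the text above does not say how an eligibility value is computed.

```python
from typing import Dict


def pick_gateway_per_home(eligibility: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """Return, for each home node, the GW node with the highest eligibility value.

    `eligibility` maps home node -> {GW node -> eligibility value}.
    The layout and the values are hypothetical placeholders.
    """
    return {
        home: max(scores, key=scores.get)  # single GW node with the top score
        for home, scores in eligibility.items()
        if scores  # skip home nodes with no eligible GW node
    }


# Hypothetical eligibility values for two home nodes and two GW nodes.
scores = {
    "home-A": {"gw1": 0.7, "gw2": 0.9},
    "home-B": {"gw1": 0.8, "gw2": 0.4},
}
selected = pick_gateway_per_home(scores)
```

In this example `gw2` is selected for `home-A` and `gw1` for `home-B`. A GW node that wins for several home nodes is the case the abstract addresses next, by sizing its read task items from its current dynamic profile.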
15 Claims
1. A computer program product, comprising a computer readable storage medium having program code embodied therewith, the embodied program code executable by a processor to cause the processor to:
determine, by the processor, a home node that corresponds to gateway (GW) nodes in a clustered file system, each GW node being eligible to process one or more read tasks;
determine, by the processor, a peer GW eligibility value for more than one of the GW nodes in the clustered file system eligible to process the one or more read tasks;
determine, by the processor, a single GW node from amongst the GW nodes having a highest peer GW eligibility value for each home node;
assign and define, by the processor, a size for one or more read task items for the GW nodes having the highest peer GW eligibility value for multiple home nodes based on a current dynamic profile of the GW nodes;
distribute, by the processor, workload to the GW nodes according to the size for each of the one or more read task items assigned to the GW nodes;
receive, by the processor, a remote block read request for an uncached file from an application node;
determine, by the processor, whether the remote block read request is sent in response to a foreground read request or initiation of a manual file prefetch;
determine, by the processor, whether a size of the requested file exceeds a parallel read threshold in response to a determination that the remote block read request is sent in response to the manual file prefetch and invoke a parallel input/output (I/O) code path with handling of sparse blocks and small read scenarios in response to a determination that the size of the requested file exceeds the parallel read threshold; and
determine, by the processor, whether at least two blocks of the requested file have been cached in response to a determination that the remote block read request is sent in response to the foreground read request and invoke the parallel I/O code path with handling of sparse blocks and small read scenarios in response to a determination that at least two blocks of the requested file have been cached and a percentage of cached blocks of the requested file versus all blocks of the requested file exceeds a predetermined prefetch threshold.
Dependent claims: 2, 3, 4, 5
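The last two determinations of claim 1 amount to a two-branch decision on whether to invoke the parallel I/O code path. The following is a minimal sketch of that decision only. The type names, field names, and threshold values are hypothetical, and the percentage is assumed to be on a 0 to 100 scale.

```python
from dataclasses import dataclass
from enum import Enum


class Trigger(Enum):
    FOREGROUND_READ = "foreground"
    MANUAL_PREFETCH = "prefetch"


@dataclass
class ReadRequest:
    trigger: Trigger      # what caused the remote block read request
    file_size: int        # size of the requested file, in bytes
    total_blocks: int     # number of blocks in the requested file
    cached_blocks: int    # number of those blocks already cached


def use_parallel_io(req: ReadRequest,
                    parallel_read_threshold: int,
                    prefetch_threshold_pct: float) -> bool:
    """Decide whether to take the parallel I/O code path (illustrative only)."""
    if req.trigger is Trigger.MANUAL_PREFETCH:
        # Manual prefetch: parallel path only when the file is large enough.
        return req.file_size > parallel_read_threshold
    # Foreground read: need at least two cached blocks and a cached
    # percentage above the prefetch threshold.
    if req.cached_blocks < 2 or req.total_blocks == 0:
        return False
    cached_pct = 100.0 * req.cached_blocks / req.total_blocks
    return cached_pct > prefetch_threshold_pct


GIB = 1024 ** 3  # hypothetical parallel read threshold of 1 GiB
prefetch = ReadRequest(Trigger.MANUAL_PREFETCH, 2 * GIB, 0, 0)
foreground = ReadRequest(Trigger.FOREGROUND_READ, 2 * GIB, 100, 30)
```

With a 1 GiB parallel read threshold and a 25 percent prefetch threshold, both example requests take the parallel path. The sparse block and small read handling recited in the claim is not modeled here.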
6. A computer-implemented method, comprising:
determining a home node that corresponds to gateway (GW) nodes in a clustered file system, each GW node being eligible to process one or more read tasks;
determining a peer GW eligibility value for more than one of the GW nodes in the clustered file system eligible to process one or more read tasks;
determining a single GW node from amongst the GW nodes having a highest peer GW eligibility value for each home node;
assigning and defining a size for one or more read task items for the GW nodes having the highest peer GW eligibility value for multiple home nodes based on a current dynamic profile of the GW nodes;
distributing workload to the GW nodes according to the size for each of the one or more read task items assigned to the GW nodes;
receiving a task response from a slave GW node corresponding to the one or more read task items, the task response indicating at least a read start value and an end offset value of data corresponding to the one or more read task items;
retrieving a cached block bitmap of the data corresponding to the one or more read task items, the cached block bitmap being an indication of which portions of the data corresponding to the one or more read task items are stored to a cache storage;
determining whether, for each read request waiting for remote read corresponding to the one or more read task items, all blocks of the data corresponding to the one or more read task items are stored in the cache storage;
sending a response to the slave GW node indicating that caching has been performed for the data corresponding to the one or more read task items in response to a determination that all blocks of the data corresponding to the one or more read task items are stored in the cache storage; and
sending a request to cache portions of the data corresponding to the one or more read task items which are not currently in the cache storage in response to a determination that all blocks of the data corresponding to the one or more read task items are not stored in the cache storage.
Dependent claims: 7, 8, 9, 10
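The task-response handling in claim 6 can be pictured as a lookup in the cached block bitmap over the byte range reported by the GW node. The sketch below is illustrative only. It assumes a fixed block size, a bitmap held as a list of booleans, and an exclusive end offset, none of which the claim specifies. All names are hypothetical.

```python
from typing import List, Tuple


def blocks_in_range(start: int, end: int, block_size: int) -> range:
    """Block indices touched by the byte range [start, end)."""
    if end <= start:
        return range(0)
    return range(start // block_size, (end - 1) // block_size + 1)


def missing_blocks(bitmap: List[bool], start: int, end: int,
                   block_size: int) -> List[int]:
    """Blocks in the range that the bitmap marks as not yet cached."""
    return [i for i in blocks_in_range(start, end, block_size) if not bitmap[i]]


def handle_task_response(bitmap: List[bool], start: int, end: int,
                         block_size: int) -> Tuple[str, List[int]]:
    """Return ("cached", []) if every block is cached, otherwise
    ("cache_request", missing) listing the blocks still to be cached."""
    missing = missing_blocks(bitmap, start, end, block_size)
    if not missing:
        return ("cached", [])
    return ("cache_request", missing)


# Hypothetical bitmap for a 4-block file with 4-byte blocks; block 2 is not cached.
bitmap = [True, True, False, True]
full = handle_task_response(bitmap, 0, 16, 4)
partial = handle_task_response(bitmap, 0, 8, 4)
```

Here `full` reports a cache request for block 2, while `partial` covers only blocks 0 and 1 and reports that caching is complete.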
11. A system, comprising a processor and logic integrated with the processor, executable by the processor, or integrated with and executable by the processor, the logic being configured to cause the processor to:
determine a home node that corresponds to gateway (GW) nodes in a clustered file system, each GW node being eligible to process one or more read tasks;
determine a peer GW eligibility value for more than one of the GW nodes in the clustered file system eligible to process the one or more read tasks;
determine a single GW node from amongst the GW nodes having a highest peer GW eligibility value for each home node;
assign and define a size for one or more read task items for the GW nodes having the highest peer GW eligibility value for multiple home nodes based on a current dynamic profile of the GW nodes;
distribute workload to the GW nodes according to the size for each of the one or more read task items assigned to the GW nodes; and
receive a remote block read request for an uncached file from an application node;
determine whether the remote block read request is sent in response to a foreground read request or initiation of a manual file prefetch;
determine whether a size of the requested file exceeds a parallel read threshold in response to a determination that the remote block read request is sent in response to the manual file prefetch and invoke a parallel input/output (I/O) code path with handling of sparse blocks and small read scenarios in response to a determination that the size of the requested file exceeds the parallel read threshold; and
determine whether at least two blocks of the requested file have been cached in response to a determination that the remote block read request is sent in response to the foreground read request and invoke the parallel I/O code path with handling of sparse blocks and small read scenarios in response to a determination that at least two blocks of the requested file have been cached and a percentage of cached blocks of the requested file versus all blocks of the requested file exceeds a predetermined prefetch threshold.
Dependent claims: 12, 13, 14, 15
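The sizing and distribution steps recited in the independent claims can be sketched as a proportional, block-aligned split of a file's byte range. This is an assumption-laden illustration: the "current dynamic profile" is reduced here to one positive weight per GW node (for instance, a measure of spare capacity), and the function and variable names are hypothetical.

```python
from typing import Dict, List, Tuple


def split_read(file_size: int, block_size: int,
               weights: Dict[str, float]) -> List[Tuple[str, int, int]]:
    """Split [0, file_size) into block-aligned (node, start, end) task items.

    Each node's share of blocks is proportional to its weight; the last
    node in iteration order absorbs whatever rounding leaves over.
    """
    total_blocks = -(-file_size // block_size)  # ceiling division
    total_weight = sum(weights.values())
    if total_blocks == 0 or total_weight <= 0:
        return []
    tasks: List[Tuple[str, int, int]] = []
    next_block = 0
    nodes = list(weights.items())
    for idx, (node, weight) in enumerate(nodes):
        remaining = total_blocks - next_block
        if idx == len(nodes) - 1:
            count = remaining
        else:
            count = min(int(total_blocks * weight / total_weight), remaining)
        if count <= 0:
            continue  # node gets no task item
        start = next_block * block_size
        end = min((next_block + count) * block_size, file_size)
        tasks.append((node, start, end))
        next_block += count
    return tasks


# Hypothetical profile: gw1 is weighted three times as heavily as gw2.
tasks = split_read(file_size=100, block_size=10, weights={"gw1": 3, "gw2": 1})
```

In this example `gw1` receives bytes 0 to 70 and `gw2` receives bytes 70 to 100, so the task items cover the file without overlap.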
Specification