Data locality in data integration applications
First Claim
1. A computer-implemented method comprising:
accessing a configuration file;
identifying a logical node, said logical node being associated with one or more source stages;
identifying one or more file block components, said one or more file block components comprising a retrieval target for at least one of said one or more source stages and being stored on a distributed file system;
identifying one or more physical nodes;
determining, for each of said one or more physical nodes, a degree value;
identifying one or more qualified physical nodes from said one or more physical nodes having said degree value of one or more;
creating a preferred physical node table, said preferred physical node table comprising, for each of said one or more qualified physical nodes, an identifying indication and an indication of said degree value;
sorting said preferred physical node table based on said degree value associated with each of said one or more qualified physical nodes;
determining a candidate preferred physical node based on each said degree value;
determining whether said candidate preferred physical node is available for allocation to said logical node;
responsive to said candidate preferred physical node being available for allocation to said logical node, allocating said candidate preferred physical node to said logical node; and
responsive to said candidate preferred physical node not being available for allocation to said logical node;
marking said candidate preferred physical node as unavailable for allocation to said logical node; and
determining an alternative candidate preferred physical node based on each said degree value.
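The selection steps recited in the claim can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the patented implementation: the claim text above does not define how the degree value is computed, so the sketch assumes it is the number of the source stage's file block components stored on a physical node, and it assumes the table is sorted with the highest degree first. The names `PreferredNodeEntry`, `build_preferred_table`, `allocate`, and the `block_locations` input shape are hypothetical. Reading the configuration file and querying the distributed file system for block locations are left outside the sketch.

```python
from dataclasses import dataclass


@dataclass
class PreferredNodeEntry:
    node_id: str  # identifying indication of a qualified physical node
    degree: int   # indication of that node's degree value


def build_preferred_table(block_locations, physical_nodes):
    """Build the preferred physical node table, sorted by degree value.

    block_locations (hypothetical shape) maps each file block component
    to the physical nodes that hold a copy of it.
    """
    # Assumption: degree value = number of file block components on the node.
    degree = {node: 0 for node in physical_nodes}
    for holders in block_locations.values():
        for node in set(holders):  # count each block at most once per node
            if node in degree:
                degree[node] += 1
    # Qualified physical nodes have a degree value of one or more.
    table = [PreferredNodeEntry(n, d) for n, d in degree.items() if d >= 1]
    # Assumption: highest degree first; ties broken by node id.
    table.sort(key=lambda e: (-e.degree, e.node_id))
    return table


def allocate(table, is_available):
    """Walk the sorted table and return (allocated node or None, nodes marked unavailable)."""
    unavailable = set()
    for entry in table:
        if is_available(entry.node_id):
            return entry.node_id, unavailable
        # Mark the candidate unavailable and fall through to the next one.
        unavailable.add(entry.node_id)
    return None, unavailable


# Example: nodeA holds three blocks, nodeB two, nodeC one, nodeD none.
blocks = {
    "blk_1": ["nodeA", "nodeB"],
    "blk_2": ["nodeA", "nodeC"],
    "blk_3": ["nodeA", "nodeB"],
}
nodes = ["nodeA", "nodeB", "nodeC", "nodeD"]
table = build_preferred_table(blocks, nodes)
# Pretend nodeA is already taken, so the alternative candidate is chosen.
chosen, skipped = allocate(table, lambda n: n != "nodeA")
```

In this example nodeD is excluded from the table because its degree value is zero, and the allocation falls back from nodeA to nodeB, the node with the next-highest degree value.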
Abstract
A computer-implemented method includes identifying a logical node. The logical node is associated with one or more source stages. The computer-implemented method further includes identifying one or more file block components. The one or more file block components include a retrieval target for at least one of the one or more source stages. The computer-implemented method further includes identifying one or more physical nodes and determining, for each of the one or more physical nodes, a degree value. A corresponding computer program product and computer system are also disclosed.
1 Claim
Specification