Scalable method for optimizing information pathway
Abstract
A method, system, and process for receiving a request for task execution at a central processing node for worldwide data, wherein the central processing node is connected to sub-processing nodes; dividing the request for task execution to be distributed to a set of the sub-processing nodes, wherein the set of sub-processing nodes manages a portion of the worldwide data; and transmitting to each of the set of sub-processing nodes the respective portion of the divided task execution.
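The receive, divide, and transmit flow in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the `Cluster` class, the `divide_request` and `transmit` functions, and the dataset names are all hypothetical, and the sketch assumes a request can be described simply as the set of datasets it needs.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Cluster:
    """Hypothetical stand-in for one cluster of sub-processing nodes."""
    name: str
    datasets: set[str]                          # subset of the worldwide data held here
    inbox: list = field(default_factory=list)   # portions received from the central node

def divide_request(required: set[str], clusters: list[Cluster]) -> dict[str, set[str]]:
    """Split the datasets a request needs into one portion per cluster that holds them."""
    portions = {}
    for cluster in clusters:
        local = required & cluster.datasets
        if local:                               # skip clusters that hold none of the data
            portions[cluster.name] = local
    return portions

def transmit(task: str, portions: dict[str, set[str]], clusters: list[Cluster]) -> None:
    """Deliver each portion to the cluster that manages the corresponding data."""
    by_name = {c.name: c for c in clusters}
    for name, datasets in portions.items():
        by_name[name].inbox.append((task, sorted(datasets)))

clusters = [Cluster("eu", {"a", "b"}), Cluster("us", {"c"}), Cluster("asia", {"d"})]
portions = divide_request({"a", "c"}, clusters)
transmit("count", portions, clusters)
```

In this sketch only the task description travels to the clusters; the data itself stays where it is stored, which is the point of dividing the request by data location.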
20 Claims
1. An apparatus comprising:
a receiving module configured to receive a request for task execution at a central processing node for worldwide data;
wherein the central processing node is connected to sub-processing network nodes;
wherein the sub-processing network nodes are grouped into clusters;
wherein each cluster has a distributed file system mapping out network nodes for each respective cluster;
wherein each cluster stores a subset of the worldwide data; and
wherein each cluster is enabled to use the network nodes of the cluster to perform parallel processing;
wherein the central processing node is communicatively coupled to a global distributed file system that maps over each of the cluster's distributed file systems to enable orchestration between the clusters;
a dividing module configured to divide by a worldwide job tracker the request for task execution into worldwide task trackers to be distributed to sub-processing network nodes of the clusters;
wherein the network sub-nodes manages a portion of the worldwide data for each respective cluster;
wherein each worldwide task tracker maintains records of sub-activities executed as part of the worldwide job;
a transmitting module configured to transmit to each of the sub-processing network nodes for each respective cluster the respective portion of the divided task execution by assigning each worldwide task tracker corresponding to the respective portion to the respective each cluster; and
a leveraging module configured to generate a graph layout of data pathways, the pathways calculated based upon physical distance between the processing nodes and bandwidth constraints, the leveraging module further configured to distribute task execution based upon the processing power of the processing nodes, graph layout, and the size of data processed by the sub-processing network nodes to reduce data movement between the central processing node and the sub-processing nodes.
Dependent claims: 2, 3, 4
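One way to picture the leveraging module of claim 1 is as a weighted graph plus a placement rule. The Python sketch below is an illustration under stated assumptions, not the patented method: the function names, the link format, the cost model (propagation delay at roughly 200,000 km/s plus size divided by bandwidth on each hop), and the "work units per second" measure of processing power are all hypothetical.

```python
import heapq

def build_graph(links):
    """links: iterable of (node_a, node_b, distance_km, bandwidth_mbps)."""
    graph = {}
    for a, b, dist, bw in links:
        graph.setdefault(a, []).append((b, dist, bw))
        graph.setdefault(b, []).append((a, dist, bw))
    return graph

def transfer_seconds(graph, source, size_mb):
    """Dijkstra over the pathway graph: cheapest time to move size_mb from source."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best.get(node, float("inf")):
            continue                            # stale queue entry
        for nxt, dist, bw in graph.get(node, []):
            # Assumed hop cost: propagation delay plus serialization time.
            hop = dist / 200_000 + size_mb * 8 / bw
            if cost + hop < best.get(nxt, float("inf")):
                best[nxt] = cost + hop
                heapq.heappush(queue, (cost + hop, nxt))
    return best

def place_task(graph, power, data_node, size_mb, work_units):
    """Pick the node with the lowest estimated data-movement plus compute time."""
    moves = transfer_seconds(graph, data_node, size_mb)
    reachable = [n for n in power if n in moves]
    return min(reachable, key=lambda n: moves[n] + work_units / power[n])

graph = build_graph([
    ("central", "eu", 8000, 100),
    ("central", "us", 4000, 1000),
    ("eu", "us", 6000, 100),
])
power = {"central": 100, "eu": 10, "us": 20}    # work units per second (assumed)

big_data_choice = place_task(graph, power, "eu", size_mb=1000, work_units=100)
small_data_choice = place_task(graph, power, "eu", size_mb=1, work_units=10_000)
```

With these made-up numbers, the large dataset is processed where it is stored ("eu"), while the small, compute-heavy task is worth moving to the more powerful central node, which is how distance, bandwidth, processing power, and data size can pull the placement in different directions.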
5. A computer program product comprising:
a non-transitory computer readable medium encoded with computer executable program code, the code configured to enable the execution of:
receiving a request for task execution at a central processing node for worldwide data;
wherein the central processing node is connected to sub-processing network nodes;
wherein the sub-processing network nodes are grouped into clusters;
wherein each cluster has a distributed file system mapping out network nodes for each respective cluster;
wherein each cluster stores a subset of the worldwide data; and
wherein each cluster is enabled to use the network nodes of the cluster to perform parallel processing;
wherein the central processing node is communicatively coupled to a global distributed file system that maps over each of the cluster's distributed file systems to enable orchestration between the clusters;
dividing by a worldwide job tracker the request for task execution into worldwide task trackers to be distributed sub-processing network nodes of the clusters;
wherein the sub-processing network nodes manages a portion of the worldwide data for each respective cluster;
wherein each worldwide task tracker maintains records of sub-activities executed as part of the worldwide job;
transmitting to each of the sub-processing network nodes for each respective cluster the respective portion of the divided task execution by assigning each worldwide task tracker corresponding to the respective portion to the respective each cluster;
generating a graph layout of data pathways, the pathways calculated based upon physical distance between the processing nodes and bandwidth constraints; and
distributing task execution among processing nodes based upon the graph layout, the processing power of the processing nodes, and the size of data processed by the sub-processing network nodes to reduce the amount and duration of data movement between the central processing node and the sub-processing nodes.
Dependent claims: 6, 7, 8, 9, 10, 11
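The relationship between the worldwide job tracker and the worldwide task trackers recited above can be sketched as two small Python classes. This is an assumption-laden illustration rather than the claimed implementation: the class names mirror the claim terms, but the fields, the `record` and `progress` methods, and the activity labels are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class WorldwideTaskTracker:
    """One tracker per cluster; keeps records of the sub-activities it executed."""
    job_id: str
    cluster: str
    portion: list
    records: list = field(default_factory=list)

    def record(self, activity, status):
        self.records.append((activity, status))

class WorldwideJobTracker:
    """Divides a job into task trackers and assigns one to each cluster."""
    def __init__(self, job_id):
        self.job_id = job_id
        self.trackers = {}

    def divide(self, portions):
        """portions: mapping of cluster name -> list of data items for that cluster."""
        for cluster, portion in portions.items():
            self.trackers[cluster] = WorldwideTaskTracker(self.job_id, cluster, list(portion))
        return self.trackers

    def progress(self):
        """Number of recorded sub-activities per cluster."""
        return {c: len(t.records) for c, t in self.trackers.items()}

job = WorldwideJobTracker("job-1")
trackers = job.divide({"eu": ["a", "b"], "us": ["c"]})
trackers["eu"].record("map:a", "done")
```

Keeping the records on the per-cluster tracker lets the central node ask for a summary (here, `progress`) without pulling the underlying data out of the cluster.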
12. A computer method for processing execution of a task in a worldwide data environment comprising:
receiving a request for task execution at a central processing node for worldwide data;
wherein the central processing node is connected to sub-processing network nodes;
wherein the sub-processing network nodes are grouped into clusters;
wherein each cluster has a distributed file system mapping out network nodes for each respective cluster;
wherein each cluster stores a subset of the worldwide data; and
wherein each cluster is enabled to use the network nodes of the cluster to perform parallel processing;
wherein the central processing node is communicatively coupled to a global distributed file system that maps over each of the cluster's distributed file systems to enable orchestration between the clusters;
dividing by a worldwide job tracker the request for task execution to be distributed to the sub-processing network nodes of the clusters;
wherein the processing network nodes manages a portion of the worldwide data for each respective cluster;
wherein each worldwide task tracker maintains records of sub-activities executed as part of the worldwide job;
transmitting to each of the sub-processing network nodes for each respective cluster the respective portion of the divided task execution by assigning each worldwide task tracker corresponding to the respective portion to the respective each cluster;
generating a graph layout of data pathways, the pathways calculated based upon physical distance between the processing nodes and bandwidth constraints; and
distributing task execution based upon the graph layout, the processing power of the processing nodes, proximity between the central processing node and the sub-processing nodes, and the size of data processed by the sub-processing network nodes to reduce the amount and duration of data movement between the central processing node and the sub-processing nodes.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20
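Claim 12 names both the amount and the duration of data movement as the quantities to reduce. A rough way to compare two candidate distributions is sketched below in Python. The `movement_cost` function, the plan format, and the figures are hypothetical, and the sketch assumes transfers happen one after another over direct links with a fixed bandwidth.

```python
def movement_cost(plan, location, size_mb, bandwidth_mbps):
    """Return (megabytes moved, seconds spent moving) for a task-to-node plan.

    plan:           task -> node chosen to execute it
    location:       task -> node where the task's data is stored
    size_mb:        task -> size of that data in megabytes
    bandwidth_mbps: (source, destination) -> link bandwidth in megabits per second
    """
    moved = 0.0
    seconds = 0.0
    for task, node in plan.items():
        src = location[task]
        if src == node:
            continue                    # data is processed where it is stored
        moved += size_mb[task]
        seconds += size_mb[task] * 8 / bandwidth_mbps[(src, node)]
    return moved, seconds

location = {"t1": "eu", "t2": "us"}
size_mb = {"t1": 500, "t2": 100}
bandwidth = {("eu", "central"): 100, ("us", "central"): 1000}

central_plan = {"t1": "central", "t2": "central"}   # ship everything to the central node
local_plan = {"t1": "eu", "t2": "us"}               # run each task next to its data

central_cost = movement_cost(central_plan, location, size_mb, bandwidth)
local_cost = movement_cost(local_plan, location, size_mb, bandwidth)
```

Under these assumed figures the centralized plan moves 600 MB and spends about 40.8 seconds on transfers, while the local plan moves nothing, which is the kind of difference the distributing step is meant to exploit.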
Specification