Concurrent data processing in a distributed system

US 8,266,289 B2
Filed: 04/23/2009
Issued: 09/11/2012
Est. Priority Date: 04/23/2009
Status: Active Grant

Abstract

Systems, methods, and computer media for scheduling vertices in a distributed data processing network and allocating computing resources on a processing node in a distributed data processing network are provided. Vertices, subparts of a data job including both data and computer code that runs on the data, are assigned by a job manager to a distributed cluster of process nodes for processing. The process nodes run the vertices and transmit computing resource usage information, including memory and processing core usage, back to the job manager. The job manager uses this information to estimate computing resource usage information for other vertices in the data job that are either still running or waiting to be run. Using the estimated computing resource usage information, each process node can run multiple vertices concurrently.

59 Citations

18 Claims

1. One or more computer storage device having computer-useable instructions embodied thereon for performing a method for scheduling vertices in a cluster, the method comprising:
- receiving a data job;
  
  dividing the data job into a plurality of vertices;
  
  assigning the plurality of vertices to one or more process nodes that comprise the cluster;
  
  receiving resource usage information for one or more vertices, wherein the vertices have run to completion, and wherein resource usage information for the vertices has been determined; and
  
  for each of the plurality of vertices for which resource usage information has not been received:
  
  estimating resource usage of the vertex from the received resource usage information for the completed vertices, wherein estimating resource usage comprises:
  
  (A) estimating an input data size range;
  
  (B) dividing the input data size range into data size buckets, wherein the data size buckets are subsets of the data size range;
  
  (C) storing resource usage information for each completed vertex in the corresponding data size bucket; and
  
  (D) for each data size bucket, calculating estimated resource usage information for uncompleted vertices with an input data size within the data size bucket's range; and
  
  transmitting the estimated resource usage of the vertex to the process node in the cluster to which the vertex is assigned, wherein the process node allocates computing resources to the vertex based on the estimated resource usage.
- Dependent claims (2, 3, 4, 5, 6)
- - 2. The device of claim 1, wherein resource usage information comprises at least one of memory usage, processor core usage, disk usage, and network usage.
  - 3. The device of claim 1, wherein assigning the plurality of vertices to one or more process nodes that comprise the cluster further comprises assigning two or more vertices to the same process node.
  - 4. The device of claim 1, further comprising:
    - receiving resource usage information for additional completed vertices, wherein the additional completed vertices have run to completion, and wherein resource usage information for the additional completed vertices has been determined; and
      
      transmitting updated estimated resource usage for the vertices that have not completed to the process nodes to which the vertices that have not completed are assigned.
  - 5. The device of claim 1, wherein estimating resource usage further comprises creating estimate values based on a relationship between the size of the data processed by completed vertices and the resource usage information of the completed vertices.
  - 6. The media of claim 1, wherein estimating an input data size range comprises:
    - multiplying the size of the data processed by the first vertex to complete by one half to create a lower bound; and
      
      multiplying the size of the data processed by the first vertex to complete by two to create an upper bound.
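
The bucketed estimation recited in claims 1 and 6 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the class name `BucketEstimator`, equal-width buckets, the bucket count, clamping out-of-range sizes into the edge buckets, tracking memory only, and using the mean of observed values as the estimate are all hypothetical choices. The claims themselves specify only the range (one half and two times the input size of the first vertex to complete), the division into buckets, and a per-bucket estimate.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class BucketEstimator:
    """Hypothetical job-manager helper; assumes positive input sizes."""
    num_buckets: int = 4
    low: Optional[float] = None    # lower bound of the input data size range
    high: Optional[float] = None   # upper bound of the input data size range
    samples: Dict[int, List[float]] = field(default_factory=dict)

    def _bucket(self, input_size: float) -> int:
        # Equal-width buckets (assumption); sizes outside the range are
        # clamped into the first or last bucket.
        width = (self.high - self.low) / self.num_buckets
        index = int((input_size - self.low) // width)
        return max(0, min(self.num_buckets - 1, index))

    def record(self, input_size: float, memory_used: float) -> None:
        """Store usage reported for a completed vertex (step C)."""
        if self.low is None:
            # Claim 6: the first vertex to complete fixes the range (step A).
            self.low, self.high = input_size * 0.5, input_size * 2.0
        self.samples.setdefault(self._bucket(input_size), []).append(memory_used)

    def estimate(self, input_size: float) -> Optional[float]:
        """Estimate usage for an uncompleted vertex (step D), if data exists."""
        if self.low is None:
            return None
        observed = self.samples.get(self._bucket(input_size))
        return mean(observed) if observed else None


estimator = BucketEstimator(num_buckets=3)
estimator.record(input_size=100, memory_used=400)  # range becomes 50 to 200
estimator.record(input_size=120, memory_used=600)
estimator.record(input_size=60, memory_used=200)
```

Calling `estimator.estimate(110)` then averages the two completed vertices that fell in the same bucket, while a size whose bucket has no completed vertices yields `None`, leaving the caller to choose a fallback. Recording further completions and re-querying corresponds to the updated estimates of claim 4.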

7. A method for allocating computing resources to vertices on a process node, the method comprising:
- on a process node, receiving information about a plurality of vertices assigned to the process node;
  
  receiving estimated resource usage information for at least one of the plurality of vertices assigned to the process node;
  
  allocating computing resources to a first assigned vertex;
  
  running the first assigned vertex;
  
  allocating computing resources to a second assigned vertex based on the received estimated resource usage information;
  
  running the second assigned vertex concurrently with the first assigned vertex;
  
  allocating computing resources to additional assigned vertices until either all computing resources have been allocated or computing resources are allocated to all additional assigned vertices;
  
  reserving an amount of memory for running vertices;
  
  reserving an amount of memory for running processes that are not vertices; and
  
  reserving an amount of free memory, wherein free memory is not available to be allocated to assigned vertices or available for running processes that are not vertices.
- Dependent claims (8, 9, 10, 11, 12)
- - 8. The method of claim 7, further comprising:
    - upon determining that currently running vertices are exceeding available computing resources, terminating one or more running vertices that are using more computing resources than the vertices were allocated; and
      
      upon determining that after terminating one or more running vertices using more computing resources than the vertices were allocated that the currently running vertices continue to exceed available computing resources, terminating the most recent vertex to begin running.
  - 9. The method of claim 7, wherein computing resources are allocated to the additional assigned vertices according to the position of the additional assigned vertices in a queue.
  - 10. The method of claim 9, wherein if the next vertex in the queue awaiting allocation of computing resources requires computing resources in excess of the available computing resources, the next vertex is flagged and is not allocated computing resources.
  - 11. The method of claim 10, further comprising:
    - upon determining that the flagged vertex will begin to run no later than the sum of the time the currently running vertices will take to complete and an acceptable delay for the flagged vertex, allocating resources to and running an assigned vertex in a queue position behind the flagged vertex, wherein the acceptable delay for the flagged vertex is caused by running the assigned vertex in a queue position behind the flagged vertex before running the flagged vertex; and
      
      upon determining that the flagged vertex will not begin to run before the sum of the time the currently running vertices will take to complete and an acceptable delay, waiting until the currently running vertices have completed, allocating resources to the flagged vertex, and running the flagged vertex.
  - 12. The method of claim 11, wherein the acceptable delay for the flagged vertex is the greater of a specified percentage of the flagged vertex's expected run time or a specified minimum delay.
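
The memory reservations of claim 7 and the termination order of claim 8 might look something like the sketch below. Everything named here is hypothetical: the `Vertex` fields, the functions `vertex_budget`, `admit`, and `pick_victims`, the restriction to memory in megabytes, and the decision to terminate victims one at a time until usage fits. Claim 8 only says that over-allocation offenders are terminated first and that the most recently started vertex is terminated if the node is still over its limit.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Vertex:
    name: str
    estimated_mb: int     # estimate received from the job manager
    used_mb: int = 0      # usage observed while running
    start_order: int = 0  # larger means started more recently


def vertex_budget(total_mb: int, other_processes_mb: int, free_mb: int) -> int:
    """Memory left for vertices after the other two reservations in claim 7."""
    return total_mb - other_processes_mb - free_mb


def admit(queue: List[Vertex], budget_mb: int) -> List[Vertex]:
    """Allocate in queue order until the next vertex no longer fits."""
    running, allocated = [], 0
    for order, vertex in enumerate(queue):
        if allocated + vertex.estimated_mb > budget_mb:
            break  # claim 10 would flag this vertex; flagging is omitted here
        vertex.start_order = order
        allocated += vertex.estimated_mb
        running.append(vertex)
    return running


def pick_victims(running: List[Vertex], budget_mb: int) -> List[Vertex]:
    """Choose vertices to terminate, following the ordering in claim 8."""
    victims: List[Vertex] = []
    survivors = list(running)

    def over_budget() -> bool:
        return sum(v.used_mb for v in survivors) > budget_mb

    # Pass 1: vertices using more than they were allocated.
    for vertex in [v for v in survivors if v.used_mb > v.estimated_mb]:
        if not over_budget():
            break
        survivors.remove(vertex)
        victims.append(vertex)
    # Pass 2: if still over budget, the most recently started vertex goes first.
    while survivors and over_budget():
        newest = max(survivors, key=lambda v: v.start_order)
        survivors.remove(newest)
        victims.append(newest)
    return victims


budget = vertex_budget(total_mb=16000, other_processes_mb=2000, free_mb=1000)
queue = [Vertex("a", 6000), Vertex("b", 5000), Vertex("c", 4000), Vertex("d", 1000)]
running = admit(queue, budget)
```

With these illustrative numbers the budget is 13000 MB, so `a` and `b` are admitted and `c` blocks the queue. If `b` later grows past its allocation and pushes the node over budget, `pick_victims` selects `b` before considering any vertex that stayed within its estimate.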

13. A vertex assignment scheduling system for scheduling vertices in a cluster, the system comprising:
- a computing device associated with a job manager having one or more process nodes and one or more computer-readable storage media; and
  
  a data store coupled with the job manager, wherein the job manager:
  
  assigns a plurality of vertices to the one or more process nodes that comprise a cluster,
  
  receives resource usage information for assigned vertices that have completed, and
  
  calculates, based on the received resource usage information for the completed vertices, estimated resource usage for the assigned vertices for which resource usage information has not been received;
  
  and wherein at least one of the one or more process nodes:
  
  transmits resource usage information to the job manager for each vertex assigned to the process node that completes,
  
  allocates computing resources based upon estimated resource usage transmitted to the process node by the job manager such that multiple vertices run concurrently on the process node, and
  
  allocates computing resources according to the position of assigned vertices in a queue on the process node, and wherein if the next vertex awaiting allocation of computing resources in the queue requires computing resources in excess of the available computing resources of the process node, the next vertex is flagged and is not allocated computing resources by the process node.
- Dependent claims (14, 15, 16, 17, 18)
- - 14. The system of claim 13, wherein the job manager updates the estimated resource usage of the assigned vertices based on received resource usage information for additional assigned vertices that have completed.
  - 15. The system of claim 13, wherein estimating resource usage further comprises:
    - estimating an input data size range;
      
      dividing the input data size range into data size buckets, wherein the data size buckets are subsets of the data size range;
      
      storing resource usage information for each completed vertex in the corresponding data size bucket; and
      
      for each data size bucket, calculating estimated resource usage information for uncompleted vertices with an input data size within the data size bucket's range.
  - 16. The system of claim 15, wherein estimating an input data size range comprises:
    - multiplying the size of the data processed by the first vertex to complete by one half to create a lower bound; and
      
      multiplying the size of the data processed by the first vertex to complete by two to create an upper bound.
  - 17. The system of claim 13, wherein:
    - upon determining that the flagged vertex will begin to run no later than the sum of the time the currently running vertices will take to complete and an acceptable delay for the flagged vertex, running an assigned vertex in a queue position behind the flagged vertex, and wherein the acceptable delay for the flagged vertex is caused by running the assigned vertex in a queue position behind the flagged vertex before running the flagged vertex.
  - 18. The system of claim 17, wherein the acceptable delay for the flagged vertex is the greater of a specified percentage of the flagged vertex's expected run time or a specified minimum delay.
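
The flagged-vertex test in claims 11, 12, 17, and 18 reduces to a small comparison, sketched here under several assumptions. The function names, the use of seconds, and the model in which the flagged vertex can start only once every running vertex and the vertex run ahead of it have finished are hypothetical; the claims do not say how the projected start time is computed. The percentage and minimum delay are parameters because the claims call them "specified" without giving values.

```python
from typing import Sequence


def acceptable_delay(expected_runtime_s: float, fraction: float,
                     minimum_s: float) -> float:
    """Claims 12 and 18: the greater of a percentage of run time or a minimum."""
    return max(expected_runtime_s * fraction, minimum_s)


def may_run_behind(flagged_runtime_s: float,
                   running_remaining_s: Sequence[float],
                   candidate_runtime_s: float,
                   fraction: float,
                   minimum_s: float) -> bool:
    """Decide whether a vertex queued behind the flagged vertex may run first.

    Assumed model: without the candidate, the flagged vertex starts when the
    last running vertex completes; with it, the flagged vertex also waits for
    the candidate to finish.
    """
    baseline_start = max(running_remaining_s, default=0.0)
    projected_start = max(baseline_start, candidate_runtime_s)
    deadline = baseline_start + acceptable_delay(
        flagged_runtime_s, fraction, minimum_s)
    return projected_start <= deadline


# Illustrative values only: 25 percent of run time, 30 second minimum.
short_candidate_ok = may_run_behind(600, [40, 100], 120, 0.25, 30)
long_candidate_ok = may_run_behind(600, [40, 100], 300, 0.25, 30)
```

Here the flagged vertex tolerates a 150 second delay past the 100 seconds the running vertices still need, so a 120 second candidate may run ahead of it and a 300 second candidate may not. When the check fails, the second limb of claim 11 applies: the node waits for the running vertices to finish and then runs the flagged vertex.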

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Saha, Bikas; Chaiken, Ronnie; Ryseff, James David
Primary Examiner(s): Boutah, Alina N.
Application Number: US12/428,964
Publication Number: US 20100275212A1
Time in Patent Office: 1,237 Days
Field of Search: 709/226, 700/99, 704/E15.048, 705/7.12, 718/104
US Class Current: 709/226
CPC Class Codes:
G06F 2209/5014   Reservation
G06F 2209/5017   Task decomposition
G06F 9/5027   the resource being a machin...
