Management of intermediate data spills during the shuffle phase of a map-reduce job

US 9,424,274 B2
Filed: 06/03/2013
Issued: 08/23/2016
Est. Priority Date: 06/03/2013
Status: Active Grant

Abstract

A system and a method for spill management during the shuffle phase of a map-reduce job performed in a distributed computer system on distributed files. A spilling protocol is provided for handling the spilling of intermediate data based on at least one popularity attribute of key-value pairs of the input data on which the map-reduce job is performed. The spilling protocol includes an assignment order to storage resources belonging to the computer system based on the at least one popularity attribute. The protocol can be deployed in computer systems with heterogeneous storage resources. Additionally, pointers or tags can be assigned to improve shuffle phase performance. The distributed file systems that are most suitable are ones usable by Hadoop, e.g., Hadoop Distributed File System (HDFS).
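As a rough illustration of the assignment order described in the abstract, the sketch below ranks keys by a popularity attribute and fills storage tiers from fastest to slowest. It is not taken from the patent: using the occurrence count as the popularity attribute, the `Tier` class, the capacity unit (number of keys), and the function names are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Tier:
    """One storage resource; `capacity` is an assumed unit (keys it can hold)."""
    name: str
    capacity: int
    assigned: list = field(default_factory=list)


def popularity(pairs):
    """Count how often each key occurs (assumed stand-in for the popularity attribute)."""
    return Counter(key for key, _ in pairs)


def assign_spills(pairs, tiers):
    """Place keys on tiers, most popular key first, fastest tier first.

    `tiers` must be ordered from fastest to slowest.
    """
    counts = popularity(pairs)
    # Descending popularity; ties are broken by key so the result is deterministic.
    ordered = sorted(counts, key=lambda k: (-counts[k], k))
    placement = {}
    idx = 0
    for key in ordered:
        # Skip tiers that are already full.
        while idx < len(tiers) and len(tiers[idx].assigned) >= tiers[idx].capacity:
            idx += 1
        if idx == len(tiers):
            raise RuntimeError("no storage tier has room left")
        tiers[idx].assigned.append(key)
        placement[key] = tiers[idx].name
    return placement
```

Called with tiers ordered as, say, SSD, HDD, tape, the most frequent key lands on the SSD tier and less frequent keys fall through to the slower tiers once the faster ones are full.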

20 Claims

1. A distributed computer system configured for spill management during a shuffle phase of a map-reduce job performed by said distributed computer system on distributed files, said distributed computer system comprising:
- a) key-value pairs (ki,vi) belonging to said distributed files on which said map-reduce job is performed;
  
  b) a number of map nodes for performing a pre-shuffle phase of said map-reduce job on said key-value pairs (ki,vi) to generate keyed partitions (Ki,PRTj);
  
  c) storage resources for spilling said keyed partitions (Ki,PRTj) in accordance with a spilling protocol based on at least one popularity attribute of said key-value pairs (ki,vi);
  
  d) a number of reduce nodes provided with said spilling protocol to enable said reduce nodes to locate and access said keyed partitions (Ki,PRTj) during said shuffle phase by utilizing a path to said keyed partitions (Ki,PRTj), said path sent in the header of an empty HTTP message;
  
  e) said keyed partitions (Ki,PRTj) stored in a shared directory under a mount point, said shared directory accessible by said map nodes and said reduce nodes;
  
  wherein said distributed computer system executes a post-shuffle phase of said map-reduce job to produce an output list of said map-reduce job.
- 2. The distributed computer system of claim 1, wherein said storage resources comprise heterogeneous storage resources recognized as block storage devices by said distributed computer system.
- 3. The distributed computer system of claim 2, wherein said heterogeneous storage resources include at least two members of the group consisting of SATA, HDD, RAID, SSD, Optical drives, Cloud, tape and general block storage devices.
- 4. The distributed computer system of claim 2, further comprising at least one tag assigning a logic unit number (LUN) of said keyed partitions (Ki,PRTj) in said block storage devices to keyed partitions (Ki,PRTj) related to most popular key-value pairs (ki,vi).
- 5. The distributed computer system of claim 1, wherein said distributed file system comprises a distributed file system usable by Hadoop.
- 6. The distributed computer system of claim 1, further comprising a fast connection between at least some of said storage resources and said reduce nodes.
- 7. The distributed computer system of claim 6, wherein said fast connection is to said storage resources and said storage resources comprise block storage devices.

8. A method for spill management during a shuffle phase of a map-reduce job that is performed in a distributed computer system on distributed files, said method comprising:
- a) identifying as key-value pairs (ki,vi) input data associated with said map-reduce job;
  
  b) performing a pre-shuffle phase of said map-reduce job on said input data on a number of map nodes of said distributed computer system to generate intermediate data;
  
  c) providing a spilling protocol for said intermediate data based on at least one popularity attribute of said key-value pairs (ki,vi);
  
  d) spilling said intermediate data over storage resources of said distributed computer system in accordance with said spilling protocol by storing said intermediate data in a shared directory under a mount point, said shared directory accessible by said map nodes and a number of reduce nodes;
  
  e) providing a task tracker of said map-reduce job to send a Fully Qualified Domain Name (FQDN) path to said intermediate data in a header of an HTTP message;
  
  f) providing said reduce nodes with said spilling protocol and accessing said intermediate data utilizing said FQDN path during said shuffle phase; and
  
  g) performing a post-shuffle phase of said map-reduce job to produce an output list of said map-reduce job.
- 9. The method of claim 8, wherein said spilling protocol comprises an assignment order to said storage resources based on said at least one popularity attribute.
- 10. The method of claim 8, wherein said popularity attribute is assigned by a search-ranking algorithm.
- 11. The method of claim 10, wherein said key-value pairs (ki,vi) with the highest search ranking assigned by said search-ranking algorithm are spilled to fastest storage resources among said storage resources of said distributed computer system.
- 12. The method of claim 8, wherein said storage resources comprise heterogeneous storage resources recognized as block storage devices by said distributed computer system.
- 13. The method of claim 12, wherein said heterogeneous storage resources include at least two members of the group consisting of SATA, HDD, RAID, SSD, Optical drives, Cloud, tape and general block storage devices.
- 14. The method of claim 12, wherein said spilling protocol assigns intermediate data related to most popular key-value pairs (ki,vi) to said block storage devices and assigns a tag comprising a logic unit number (LUN) of said intermediate data in said block storage devices.
- 15. The method of claim 8, wherein said distributed file system comprises a distributed file system usable by Hadoop.
- 16. The method of claim 8, wherein said task tracker assigns tags comprising logic unit numbers (LUN) in said storage resources of intermediate data related to most popular key-value pairs (ki,vi).
- 17. The method of claim 8, wherein a fast connection is provided between at least some of said storage resources and said reduce nodes.
- 18. The method of claim 17, wherein said fast connection is to said storage resources and said storage resources comprise block storage devices.
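Steps d) through f) of claim 8 rely on a path that lets a reduce node find spilled data in a shared directory under a mount point. The sketch below shows one way such a host-qualified path could be built and taken apart again. The `host:/path` layout, the directory structure, the `part-00000` file naming, and both function names are assumptions for illustration; the claims do not specify them.

```python
import posixpath


def spill_path(fqdn, mount_point, shared_dir, job_id, map_id, partition):
    """Build a host-qualified path to one spilled partition (layout is assumed)."""
    local = posixpath.join(
        mount_point, shared_dir, job_id, map_id, f"part-{partition:05d}"
    )
    return f"{fqdn}:{local}"


def resolve(path):
    """Split a host-qualified path into (fqdn, path under the shared mount)."""
    # Only the first colon separates host from path.
    fqdn, _, local = path.partition(":")
    return fqdn, local
```

Because the directory is shared by map nodes and reduce nodes, a reduce node that receives such a path could in principle open the local part directly under the same mount point rather than copying the data over the network.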

19. A method for spill management during a shuffle phase of a map-reduce job that is performed in a distributed computer system on distributed files, said method comprising:
- a) identifying as key-value pairs (ki,vi) input data associated with said map-reduce job;
  
  b) performing a pre-shuffle phase of said map-reduce job on said input data on a number of map nodes of said distributed computer system to generate intermediate data;
  
  c) providing a spilling protocol for said intermediate data for assigning at least one popularity attribute of said key-value pairs (ki,vi);
  
  d) spilling said intermediate data over storage resources of said distributed computer system in accordance with said spilling protocol by storing said intermediate data in a shared directory under a mount point, said shared directory accessible by said map nodes and a plurality of reduce nodes;
  
  e) providing a task tracker of said map-reduce job to send a path to said intermediate data in a header of an empty HTTP message;
  
  f) locating and accessing said intermediate data for said reduce nodes by utilizing said path during said shuffle phase; and
  
  g) performing a post-shuffle phase of said map-reduce job to produce an output list of said map-reduce job.
- 20. The method of claim 19, wherein said HTTP message further comprises custom fields selected from the group consisting of Raw-Map-Output-Length, Map-Output-Length and for-reduce-task.
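Claims 19 and 20 describe an empty HTTP message whose header carries the path along with the custom fields Raw-Map-Output-Length, Map-Output-Length and for-reduce-task. A minimal sketch of building and reading such a message follows. The header name `Spill-Path`, the `200 OK` status line, and the function names are hypothetical; only the three custom field names come from claim 20.

```python
def build_empty_message(path, raw_length, output_length, reduce_task):
    """Return an HTTP response with no body; all information travels in headers."""
    headers = {
        "Spill-Path": path,  # hypothetical header name for the path
        "Raw-Map-Output-Length": str(raw_length),
        "Map-Output-Length": str(output_length),
        "for-reduce-task": str(reduce_task),
        "Content-Length": "0",  # the message body is empty
    }
    head = "".join(f"{name}: {value}\r\n" for name, value in headers.items())
    return "HTTP/1.1 200 OK\r\n" + head + "\r\n"


def parse_message(message):
    """Split a raw message into (status line, lower-cased header dict, body)."""
    head, _, body = message.partition("\r\n\r\n")
    lines = head.split("\r\n")
    headers = {}
    for line in lines[1:]:
        # Split at the first colon only, so colons inside the value survive.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers, body
```

A reduce node receiving this message would read the path and lengths from the headers and then fetch the data from shared storage, since the body itself contains nothing.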

Current Assignee: Zettaset Incorporated
Original Assignee: Zettaset Incorporated
Inventors: Cramer, Michael J.; Christian, Brian P.
Primary Examiner(s): Trujillo, James
Assistant Examiner(s): TESSEMA, AIDA Z
Application Number: US13/908,953
Publication Number: US 20140358977A1
Time in Patent Office: 1,177 Days
Field of Search: None
US Class Current: 1/1
CPC Class Codes:
- G06F 16/183: Provision of network file s...
- G06F 16/2386: Bulk updating operations da...
- G06F 16/24578: using ranking
- G06F 9/5061: Partitioning or combining o...
