Method and system for optimizing reduce-side join operation in a map-reduce framework

  • US 10,185,743 B2
  • Filed: 11/25/2014
  • Issued: 01/22/2019
  • Est. Priority Date: 11/26/2013
  • Status: Active Grant
First Claim

1. A computer system for optimizing reduce-side join operation in a Map-reduce framework between a first data structure and a second data structure, the first data structure being sorted and divided into one or more regions, the system comprising:

  • one or more processors; and

    a non-transitory memory that includes modules that are executable by said one or more processors, wherein the modules include:

    an executing module to execute one or more map operations by one or more processors, wherein to execute one or more map operations by one or more processors comprises to:

    fetch input data of the second data structure;

    partition the data of the second data structure according to key-value pair;

    project the key-value pairs of the second data structure to a partitioner;

    maintain one or more region key counters;

    wherein the region key counter being used for registering key count value of one or more regions of the second data structure; and

    emit the key count value of one or more regions and corresponding data, wherein the key count values are emitted prior to the corresponding data;

    a grouping module to group mapped data corresponding to a single region of the second data structure;

    an accumulating module to provide the grouped data to a reducer;

    a fetching module to retrieve descriptive metadata of one or more regions of the first data structure; and

    a selecting module to select one of a look-up approach and a scan approach to perform the join operation by one or more reducers based on associated key count value and predefined criteria by the reducer, to perform the join operation.
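The flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation; it only shows one way the pieces could fit together. Everything named here is an assumption made for illustration: the functions `region_of`, `map_phase`, `choose_strategy` and `reduce_region`, the `COUNT_TAG`/`DATA_TAG` markers, the use of the region's row count as the "descriptive metadata", and a simple ratio threshold standing in for the "predefined criteria". The key count is taken here to be the number of key-value pairs that fall in a region, and keys in the first data structure are assumed unique.

```python
from bisect import bisect_left, bisect_right
from collections import Counter

COUNT_TAG = 0  # hypothetical tag; sorts before DATA_TAG so counts arrive first
DATA_TAG = 1


def region_of(key, region_starts):
    """Index of the region whose start key is the greatest one <= key."""
    return max(bisect_right(region_starts, key) - 1, 0)


def map_phase(records, region_starts):
    """Tag each pair with its region and emit per-region counts before data."""
    counters = Counter()  # the "region key counters" of the claim
    tagged = []
    for key, value in records:
        region = region_of(key, region_starts)
        counters[region] += 1
        tagged.append((region, DATA_TAG, (key, value)))
    counts = [(region, COUNT_TAG, n) for region, n in counters.items()]
    # Sort on (region, tag) only: groups by region, count record first.
    return sorted(counts + tagged, key=lambda t: (t[0], t[1]))


def choose_strategy(key_count, region_row_count, threshold=0.1):
    """Assumed criterion: look up when few keys hit a large region."""
    if region_row_count == 0:
        return "scan"
    return "lookup" if key_count / region_row_count < threshold else "scan"


def reduce_region(grouped, region_rows, threshold=0.1):
    """Join one region's grouped records against its sorted (key, row) list."""
    key_count = grouped[0][2]  # the count record is emitted first
    data = [payload for _, tag, payload in grouped if tag == DATA_TAG]
    strategy = choose_strategy(key_count, len(region_rows), threshold)
    joined = []
    if strategy == "lookup":
        # Stand-in for a point get per key against the sorted store.
        keys = [k for k, _ in region_rows]
        for key, value in data:
            i = bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                joined.append((key, region_rows[i][1], value))
    else:
        # One sequential pass over the region, matching as it goes.
        wanted = {}
        for key, value in data:
            wanted.setdefault(key, []).append(value)
        for key, row in region_rows:
            for value in wanted.get(key, []):
                joined.append((key, row, value))
    return strategy, joined


if __name__ == "__main__":
    emitted = map_phase([("b", 1), ("n", 2), ("c", 3)], ["a", "m"])
    first_region = [t for t in emitted if t[0] == 0]
    rows = [("a", "A"), ("b", "B"), ("c", "C"), ("d", "D")]
    print(reduce_region(first_region, rows))
```

Because the count record for a region reaches the reducer before any of that region's data, the reducer can pick between lookup and scan before it touches the first data structure. In this sketch both strategies return the same joined rows and differ only in how the first data structure is accessed.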
