PREVIEW DATA AGGREGATION

  • US 20180336230A1
  • Filed: 05/16/2017
  • Published: 11/22/2018
  • Est. Priority Date: 05/16/2017
  • Status: Active Grant
First Claim

1. A computer implemented method, comprising:

    processing, at a first worker node, a first data chunk of a dataset to generate a first intermediate result, the processing of the first data chunk comprising inserting key-value pairs from the first data chunk into the first intermediate result, the dataset being partitioned into the first data chunk and a second data chunk;

    generating, at a merger node, a key map based at least on a determination that a quantity of the key-value pairs in the first intermediate result exceeds a threshold value, the key map being generated to include one or more keys of the key-value pairs in the first intermediate result;

    processing, at a second worker node, the second data chunk to generate a second intermediate result, the second data chunk being processed based at least on the key map, the processing of the second data chunk comprising omitting a key-value pair in the second data chunk from being inserted into the second intermediate result, the key-value pair being omitted based on a key associated with the key-value pair being absent from the key map; and

    generating a preview of the processing of the dataset, the preview being generated based at least on the first intermediate result and the second intermediate result.
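
The steps recited in claim 1 can be illustrated with a minimal single-process sketch. It is not the patented implementation: the function names (`process_chunk`, `build_key_map`, `generate_preview`), the use of summation as the aggregate, the choice to copy every key of the first intermediate result into the key map, and the simulation of worker and merger nodes as plain function calls are all assumptions made for illustration.

```python
from collections import defaultdict


def process_chunk(chunk, key_map=None):
    """Worker step (hypothetical): aggregate the (key, value) pairs of one chunk.

    When a key map is supplied, any pair whose key is absent from the map
    is omitted from the intermediate result.
    """
    result = defaultdict(int)
    for key, value in chunk:
        if key_map is not None and key not in key_map:
            continue  # key absent from the key map: omit this pair
        result[key] += value  # assumption: the aggregate is a sum
    return dict(result)


def build_key_map(intermediate, threshold):
    """Merger step (hypothetical): build a key map only when the number of
    key-value pairs in the intermediate result exceeds the threshold."""
    if len(intermediate) > threshold:
        return set(intermediate)  # assumption: include every key seen so far
    return None


def generate_preview(first_chunk, second_chunk, threshold):
    """Combine both intermediate results into a preview of the aggregation."""
    first = process_chunk(first_chunk)
    key_map = build_key_map(first, threshold)
    second = process_chunk(second_chunk, key_map)

    preview = dict(first)
    for key, value in second.items():
        preview[key] = preview.get(key, 0) + value
    return preview


first_chunk = [("a", 1), ("b", 2), ("a", 3)]
second_chunk = [("a", 10), ("c", 5)]

# Threshold exceeded: key "c" is absent from the key map and is omitted.
print(generate_preview(first_chunk, second_chunk, threshold=1))
# Threshold not exceeded: no key map is generated, so "c" is kept.
print(generate_preview(first_chunk, second_chunk, threshold=5))
```

With a threshold of 1 the sketch prints `{'a': 14, 'b': 2}`, and with a threshold of 5 it prints `{'a': 14, 'b': 2, 'c': 5}`. In this sketch, once the key map exists the preview cannot gain any keys beyond those already present in the first intermediate result.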
