
System and Method for Large-Scale Data Processing Using an Application-Independent Framework

  • US 20100122065A1
  • Filed: 01/12/2010
  • Published: 05/13/2010
  • Est. Priority Date: 06/18/2004
  • Status: Active Grant
First Claim

1. A system for large-scale processing of data in a distributed and parallel processing environment including a set of interconnected computing systems, comprising:

  • an application-independent framework for processing data, including:

    a plurality of application-independent map modules; and

    a plurality of application-independent reduce modules, wherein the application-independent map modules and application-independent reduce modules use application-independent operators to automatically handle parallelization of computations across the distributed and parallel processing environment when performing user-specified data processing operations; and

  • a plurality of user-specified, application-specific operators, for use with the application-independent framework to perform a user-specified data processing operation on a user-specified set of input files, the application-specific operators including:

    a map operator, wherein the map operator is applied by the application-independent map modules to input data in the user-specified set of input files to produce intermediate data values; and

    a reduce operator, wherein the reduce operator is applied by the application-independent reduce modules to process the intermediate data values to produce final output data for the user-specified data processing operation.
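
The split the claim draws between an application-independent framework and user-specified map and reduce operators can be illustrated with a minimal single-process sketch. This is not the patented implementation: the names `run_job`, `word_count_map`, and `word_count_reduce`, the in-memory dictionary standing in for the set of input files, and the hash-based partitioning are all assumptions chosen for illustration, and the sketch runs sequentially rather than across interconnected computing systems.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical operator signatures, assumed for this sketch only.
MapOperator = Callable[[str, str], Iterable[Tuple[str, int]]]
ReduceOperator = Callable[[str, List[int]], int]


def run_job(
    input_files: Dict[str, str],
    map_op: MapOperator,
    reduce_op: ReduceOperator,
    num_reduce_partitions: int = 2,
) -> Dict[str, int]:
    """Application-independent part: knows nothing about word counting."""
    # Map phase: apply the user's map operator to every input "file" and
    # group the intermediate values by key within a reduce partition.
    partitions: List[Dict[str, List[int]]] = [
        defaultdict(list) for _ in range(num_reduce_partitions)
    ]
    for name, contents in input_files.items():
        for key, value in map_op(name, contents):
            partitions[hash(key) % num_reduce_partitions][key].append(value)

    # Reduce phase: apply the user's reduce operator to each key's values.
    output: Dict[str, int] = {}
    for partition in partitions:
        for key in sorted(partition):
            output[key] = reduce_op(key, partition[key])
    return output


# Application-specific part: the only code a user would supply.
def word_count_map(name: str, contents: str) -> Iterable[Tuple[str, int]]:
    for word in contents.split():
        yield word.lower(), 1


def word_count_reduce(key: str, values: List[int]) -> int:
    return sum(values)


if __name__ == "__main__":
    files = {"a.txt": "the cat the dog", "b.txt": "The end"}
    print(run_job(files, word_count_map, word_count_reduce))
```

In this sketch, switching to a different data processing operation means supplying a different pair of operators; `run_job` itself stays unchanged, which mirrors the division of responsibility the claim describes.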
