Performing data analytics utilizing a user configurable group of reusable modules

  • US 10,459,767 B2
  • Filed: 03/05/2014
  • Issued: 10/29/2019
  • Est. Priority Date: 03/05/2014
  • Status: Active Grant
First Claim

1. A system for performing analytics on a large quantity of data accommodated by an external mass storage device comprising:

  • a computer system including at least one processor configured to:

    divide the analytics into a plurality of analytic modules, wherein each of the analytic modules is selectively executed and comprises a script for a parallel processing engine to perform a corresponding atomic operation of the analytics, the plurality of analytic modules including one or more pre-processing modules, one or more statistical analytic modules and one or more post-processing modules;

    receive an input from a user, the input including a user selection of one or more of the plurality of analytic modules to perform desired analytics on the large quantity of data from the external mass storage device;

    responsive to the receiving the input including the user selection, automatically generate a master script designating the one or more of the plurality of analytic modules that are to be present in a module chain and an order of performing the designated one or more of the plurality of analytic modules in the module chain, one or more pre-processing modules of the one or more of the plurality of analytic modules to be executed before one or more statistical analytic modules of the one or more of the plurality of analytic modules, and the one or more statistical analytic modules of the one or more of the plurality of analytic modules to be executed before one or more post-processing modules of the one or more of the plurality of analytic modules;

    execute pre-processing scripts associated with the one or more pre-processing modules of the one or more of the plurality of analytic modules in the module chain to produce one or more partial solutions, the one or more pre-processing modules of the one or more of the plurality of analytic modules preparing and cleaning raw data to produce the one or more partial solutions to be provided to the one or more statistical analytic modules in the module chain;

    accept one of the one or more partial solutions and automatically break down scripts associated with the one or more statistical modules of the one or more of the plurality of analytic modules in the module chain into map/reduce jobs and optimize execution of the map/reduce jobs;

    execute the map/reduce jobs;

    and automatically execute alternative statistical modules, based on scoring results of the one or more post-processing modules of the one or more of the plurality of analytic modules, the automatically executing reusing, as input, a partial solution of the one or more partial solutions produced by completing execution of at least one of the one or more pre-processing modules to avoid re-execution of the at least one of the one or more pre-processing modules.
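
The control flow recited in claim 1 can be illustrated with a small, single-process sketch. This is not the patented implementation: the `Module` class, `generate_master_script`, `run_chain`, the toy modules, and the score threshold are all hypothetical names and values chosen for illustration, and the map/reduce decomposition and optimization steps are not modeled. The sketch shows only the stage ordering of the module chain (pre-processing, then statistical, then post-processing) and the reuse of a pre-processing partial solution when an alternative statistical module is tried.

```python
from dataclasses import dataclass
from statistics import mean, median
from typing import Any, Callable, List, Optional, Tuple

# Stage order recited in the claim: pre-processing, statistical, post-processing.
STAGE_ORDER = {"pre": 0, "stat": 1, "post": 2}


@dataclass(frozen=True)
class Module:
    """Hypothetical stand-in for one analytic module (one atomic operation)."""
    name: str
    stage: str  # "pre", "stat", or "post"
    run: Callable[[Any], Any]


def generate_master_script(selection: List[Module]) -> List[Module]:
    """Order the user's selection into a module chain by stage.

    sorted() is stable, so the user's order is preserved within a stage.
    """
    return sorted(selection, key=lambda m: STAGE_ORDER[m.stage])


def run_chain(raw: Any, chain: List[Module], alternatives: List[Module],
              threshold: float) -> Tuple[Optional[Tuple[str, float]], List[str]]:
    """Run the chain, trying alternative statistical modules on a low score."""
    executed: List[str] = []

    # Pre-processing runs once; its output is the partial solution.
    partial = raw
    for m in (m for m in chain if m.stage == "pre"):
        partial = m.run(partial)
        executed.append(m.name)

    post = [m for m in chain if m.stage == "post"]
    stats = [m for m in chain if m.stage == "stat"]

    best: Optional[Tuple[str, float]] = None
    for stat in stats + alternatives:
        # Every statistical module reads the same cached partial solution,
        # so pre-processing is never re-executed.
        score = stat.run(partial)
        executed.append(stat.name)
        for m in post:
            score = m.run(score)
            executed.append(m.name)
        if best is None or score > best[1]:
            best = (stat.name, score)
        if score >= threshold:
            break  # scoring result is acceptable; stop trying alternatives
    return best, executed


# Toy modules (assumptions for illustration only).
drop_missing = Module("drop_missing", "pre",
                      lambda rows: [r for r in rows if r is not None])
mean_mod = Module("mean", "stat", mean)
median_mod = Module("median", "stat", median)
# Hypothetical scorer: an estimate above 10 is treated as implausible.
plausibility = Module("plausibility", "post",
                      lambda est: 1.0 if est <= 10 else 0.0)

chain = generate_master_script([plausibility, mean_mod, drop_missing])
best, executed = run_chain([1, None, 2, 100], chain, [median_mod], threshold=0.5)
```

In this toy run the mean of the cleaned data is pulled up by the outlier and scores 0.0, so the median is tried next against the same cleaned data; `drop_missing` appears only once in `executed`.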
