PROCESSING DATA FROM MULTIPLE SOURCES

  • US 20170220646A1
  • Filed: 02/14/2017
  • Published: 08/03/2017
  • Est. Priority Date: 04/17/2014
  • Status: Active Grant
First Claim

1. A method including:

  • at a node of a Hadoop cluster, the node storing a first portion of data in HDFS data storage;

    executing a first instance of a data processing engine capable of receiving data from a data source external to the Hadoop cluster;

    receiving a dataflow graph by the data processing engine, the dataflow graph including a) at least one component representing the Hadoop cluster, b) at least one component representing the data source external to the Hadoop cluster, and c) at least one link that represents at least one dataflow associated with a data processing operation;

    executing at least part of the dataflow graph by the first instance of the data processing engine;

    receiving, by the data processing engine, a second portion of data from the external data source; and

    performing, by the data processing engine, the data processing operation using at least the first portion of data and the second portion of data.
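
The steps recited in claim 1 can be pictured with a small in-memory sketch. It is illustrative only and is not the patented implementation: the names `Component`, `DataflowGraph`, `EngineInstance`, `run_join` and `build_example` are hypothetical, the HDFS-resident first portion and the external source are stood in for by plain Python lists, and a key-based join is assumed as the data processing operation (the claim does not name a specific operation).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, object]


@dataclass
class Component:
    """A graph node: the cluster, an external source, or an operation."""
    name: str
    kind: str  # "hadoop_cluster" | "external_source" | "operation"
    read: Optional[Callable[[], List[Record]]] = None


@dataclass
class DataflowGraph:
    """Components plus directed links, each link standing for one dataflow."""
    components: Dict[str, Component] = field(default_factory=dict)
    links: List[Tuple[str, str]] = field(default_factory=list)

    def add(self, component: Component) -> None:
        self.components[component.name] = component

    def link(self, src: str, dst: str) -> None:
        if src not in self.components or dst not in self.components:
            raise KeyError("both ends of a link must be components in the graph")
        self.links.append((src, dst))

    def inputs_of(self, name: str) -> List[Component]:
        return [self.components[s] for s, d in self.links if d == name]


class EngineInstance:
    """Hypothetical engine instance running on one cluster node."""

    def __init__(self) -> None:
        self.graph: Optional[DataflowGraph] = None

    def receive_graph(self, graph: DataflowGraph) -> None:
        self.graph = graph

    def run_join(self, operation: str, key: str) -> List[Record]:
        """Assumed operation: join the local and external portions on `key`."""
        if self.graph is None:
            raise RuntimeError("no dataflow graph received")
        first: List[Record] = []   # portion stored on this node
        second: List[Record] = []  # portion received from the external source
        for comp in self.graph.inputs_of(operation):
            if comp.read is None:
                continue
            rows = comp.read()
            if comp.kind == "hadoop_cluster":
                first.extend(rows)
            elif comp.kind == "external_source":
                second.extend(rows)
        lookup = {row[key]: row for row in second}
        return [{**row, **lookup[row[key]]} for row in first if row[key] in lookup]


def build_example() -> Tuple[EngineInstance, DataflowGraph]:
    # Stand-ins for the node-local data and the external data source.
    local_rows = [{"id": 1, "clicks": 10}, {"id": 2, "clicks": 7}]
    external_rows = [{"id": 1, "region": "EU"}, {"id": 3, "region": "US"}]

    graph = DataflowGraph()
    graph.add(Component("cluster", "hadoop_cluster", lambda: local_rows))
    graph.add(Component("warehouse", "external_source", lambda: external_rows))
    graph.add(Component("join", "operation"))
    graph.link("cluster", "join")
    graph.link("warehouse", "join")

    engine = EngineInstance()
    engine.receive_graph(graph)
    return engine, graph


if __name__ == "__main__":
    engine, _ = build_example()
    print(engine.run_join("join", key="id"))
```

In this sketch the graph carries the three elements the claim lists (a cluster component, an external-source component, and links between them), and the engine combines both portions of data in a single operation. A real deployment would read the first portion from HDFS and the second over a connection to the external system rather than from lists.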
