×

Low latency query engine for apache hadoop

  • US 9,990,399 B2
  • Filed: 05/13/2016
  • Issued: 06/05/2018
  • Est. Priority Date: 03/13/2013
  • Status: Active Grant
First Claim
Patent Images

1. A system for performing queries in a HADOOP™

  • distributed computing cluster having a plurality of data nodes storing data, the data nodes having processing circuitry, the system comprising;

    a plurality of data nodes forming a peer network, a respective data node functioning as a peer in the peer network and being capable of interacting with components of the HADOOP™

    cluster, the respective data node operating an instance of a low latency query engine that queries data directly from the respective data node, the query engine is configured to;

    receive a query from a client;

    obtain, from the components of the HADOOP™

    cluster, location information regarding where one or more data blocks relevant to the received query are distributed among nodes in the cluster;

    create query fragments based on the obtained location information;

    construct a query plan based on the obtained location information;

    distribute the query fragments among the nodes in the cluster according to the query plan;

    receive intermediate results from the nodes in the cluster that receive the query fragments; and

    aggregate the intermediate results for the client.

View all claims
  • 5 Assignments
Timeline View
Assignment View
    ×
    ×