
Background format optimization for enhanced SQL-like queries in Hadoop

  • US 9,477,731 B2
  • Filed: 10/01/2013
  • Issued: 10/25/2016
  • Est. Priority Date: 10/01/2013
  • Status: Active Grant
First Claim

1. A system for performing queries on stored data in a Hadoop™ distributed computing cluster, the system comprising:

    a plurality of data nodes forming a peer-to-peer network for the queries received from a client, each data node of the plurality of data nodes functioning as a peer in the peer-to-peer network and being capable of interacting with components of the Hadoop™ cluster, each peer having an instance of a query engine running in memory, each instance of the query engine having:

    a query planner configured to parse a query from the client and selectively creates query fragments based on an availability of converted data at the data node, the converted data corresponding to data associated with the query, wherein the converted data is the data associated with the query converted from an original format into a target format that is specified by a schema, and wherein the query is processed by whichever data node that receives the query;

    a query coordinator configured to distribute the query fragments among the plurality of data nodes; and

    a query execution engine comprising:

    a transformation module configured to transform whichever local data that corresponds to a format for which the query fragments are created into in-memory tuples based on the schema; and

    an execution module configured to execute the query fragments on the in-memory tuples to obtain intermediate results from other data nodes that receive the query fragments and to aggregate the intermediate results for the client.
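
The division of roles recited in the claim can be illustrated with a toy, single-process sketch. This is not the patented implementation: the names `SCHEMA`, `QueryFragment`, `DataNode`, `plan`, and `run_query` are hypothetical, the "original format" is assumed to be comma-separated text, the "target format" is assumed to be a list of dictionaries, and the query is assumed to be a simple filtered sum.

```python
from dataclasses import dataclass, field

# Hypothetical schema: (column name, type constructor) pairs.
SCHEMA = (("region", str), ("amount", int))


@dataclass
class QueryFragment:
    node_id: str
    source_format: str  # "converted" or "original"
    region: str         # filter value for this toy query


@dataclass
class DataNode:
    node_id: str
    original: list                                 # raw "region,amount" text lines
    converted: list = field(default_factory=list)  # rows already in the target format

    def to_tuples(self, source_format):
        """Transformation role: build in-memory tuples from local data per SCHEMA."""
        if source_format == "converted":
            return [tuple(row[name] for name, _ in SCHEMA) for row in self.converted]
        rows = (line.split(",") for line in self.original)
        return [tuple(cast(v) for (_, cast), v in zip(SCHEMA, row)) for row in rows]

    def execute(self, fragment):
        """Execution role: run one fragment locally and return an intermediate sum."""
        tuples = self.to_tuples(fragment.source_format)
        return sum(amount for region, amount in tuples if region == fragment.region)


def plan(region, nodes):
    """Planner role: choose each fragment's source format by what the node holds."""
    return [
        QueryFragment(n.node_id, "converted" if n.converted else "original", region)
        for n in nodes
    ]


def run_query(region, nodes):
    """Coordinator role: hand out fragments, then aggregate intermediate results."""
    by_id = {n.node_id: n for n in nodes}
    fragments = plan(region, nodes)
    partials = [by_id[f.node_id].execute(f) for f in fragments]
    return fragments, sum(partials)


nodes = [
    DataNode("dn1", ["east,10", "west,5"]),  # only original-format data available
    DataNode("dn2", ["east,7"], converted=[{"region": "east", "amount": 7}]),
]
fragments, total = run_query("east", nodes)
print([f.source_format for f in fragments], total)  # ['original', 'converted'] 17
```

In this sketch the planner falls back to the original format whenever a node has no converted copy, so the same query is answered either way; only the transformation step differs between nodes.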

  • 5 Assignments