Parallel streaming of external data

  • US 9,684,666 B1
  • Filed: 02/28/2014
  • Issued: 06/20/2017
  • Est. Priority Date: 02/28/2014
  • Status: Active Grant
First Claim

1. A computer-implemented method comprising:

  • receiving, by a first distributed system that comprises multiple segment nodes, a query that requests rows of an external table representing data stored in multiple data fragments on multiple respective data nodes in a second distributed system, wherein each of the data nodes of the second distributed system are distinct from the segment nodes of the first distributed system, wherein the first distributed system operates under control of a first master node and the second distributed system operates under control of a second master node that is distinct from the first master node, and wherein the query includes a predicate that specifies a condition on an attribute of the requested rows of the external table;

    providing the predicate to a plurality of segment nodes of the first distributed system;

    initiating communication by a plurality of extension services between nodes of the first distributed system and nodes of the second distributed system, including a first extension service communicating with the master nodes of the first and second distributed system and a plurality of second extension services communicating between data nodes of the second distributed system and segment nodes of the first distributed system;

    receiving, by each of the plurality of segment nodes of the first distributed system, filtered data corresponding to the rows of the external table, wherein each segment node of the multiple segment nodes of the first distributed system provides the predicate to a second extension service that is between the segment node of the first distributed system and a data node of the second distributed system, wherein the second extension service:

      obtains data fragments from one or more data nodes of the second distributed system,

      determines whether each data fragment has an attribute that satisfies the predicate, and

      provides, to the segment node of the first distributed system, filtered data comprising the data fragments from the data node of the second distributed system having the attribute satisfied by the predicate; and

    computing a result for the received query using the filtered data corresponding to the rows of the external table.
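
The data flow recited in claim 1 can be illustrated with a small, single-process Python sketch. It is not the patented implementation. Every name in it (DataNode, SecondExtensionService, SegmentNode, plan_assignments, run_query) is hypothetical, each data fragment is assumed to be a single dict of attributes, the pairing of data nodes with segment nodes is assumed to be round-robin, and threads stand in for network communication between the two distributed systems.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Assumption: a data fragment is modeled as one dict of attributes.
Fragment = Dict[str, int]
Predicate = Callable[[Fragment], bool]


class DataNode:
    """Hypothetical data node of the second (external) distributed system."""

    def __init__(self, fragments: List[Fragment]) -> None:
        self.fragments = fragments


class SecondExtensionService:
    """Hypothetical service between one segment node and its data nodes."""

    def __init__(self, data_nodes: List[DataNode]) -> None:
        self.data_nodes = data_nodes

    def fetch_filtered(self, predicate: Predicate) -> List[Fragment]:
        # Obtain fragments, test each against the predicate, and return
        # only the matching ones, so filtering happens before the data
        # reaches the segment node.
        return [
            fragment
            for node in self.data_nodes
            for fragment in node.fragments
            if predicate(fragment)
        ]


class SegmentNode:
    """Hypothetical segment node of the first distributed system."""

    def __init__(self, service: SecondExtensionService) -> None:
        self.service = service

    def scan(self, predicate: Predicate) -> List[Fragment]:
        # The segment node hands the predicate to its extension service.
        return self.service.fetch_filtered(predicate)


def plan_assignments(data_nodes: List[DataNode], n_segments: int) -> List[List[DataNode]]:
    # Stand-in for the first extension service, which talks to both master
    # nodes. The round-robin pairing is an assumption, not part of the claim.
    plan: List[List[DataNode]] = [[] for _ in range(n_segments)]
    for i, node in enumerate(data_nodes):
        plan[i % n_segments].append(node)
    return plan


def run_query(segments: List[SegmentNode], predicate: Predicate) -> int:
    # Provide the predicate to every segment node, scan in parallel, then
    # compute the result (here, a sum over the hypothetical "amount" field).
    with ThreadPoolExecutor(max_workers=len(segments)) as pool:
        parts = list(pool.map(lambda seg: seg.scan(predicate), segments))
    return sum(fragment["amount"] for part in parts for fragment in part)


data_nodes = [
    DataNode([{"year": 2013, "amount": 5}, {"year": 2014, "amount": 7}]),
    DataNode([{"year": 2014, "amount": 11}]),
    DataNode([{"year": 2012, "amount": 3}, {"year": 2014, "amount": 2}]),
]
segments = [
    SegmentNode(SecondExtensionService(assigned))
    for assigned in plan_assignments(data_nodes, n_segments=2)
]
result = run_query(segments, lambda fragment: fragment["year"] == 2014)
print(result)
```

In this sketch the fragments that fail the predicate never leave the extension service, which is the point of providing the predicate to the service rather than filtering on the segment node.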
