System and method for dynamic database split generation in a massively parallel or distributed database environment

  • US 10,394,818 B2
  • Filed: 09/24/2015
  • Issued: 08/27/2019
  • Est. Priority Date: 09/26/2014
  • Status: Active Grant
First Claim

1. A method for dynamic database split generation in a massively parallel or other distributed database environment including a plurality of databases and a data warehouse layer providing querying of the plurality of databases and data summarization of the plurality of databases in a table, the method comprising:

  • obtaining by a database table accessor executing on one or more microprocessors, from an associated client application, a query for data in the table of the data warehouse layer, the query comprising query data representative of a user query and user splitter kind preference data representative of a user split preference specifying how an associated user would prefer the table to be split for performing the query for data;

    obtaining table data representative of one or more properties of the table, the table data comprising table size data representative of a total size of the table;

    selecting a splits generator from among an enumeration of splitter kinds in accordance with:

      the user split preference when it is determined by the database table accessor that splitting the table using the user split preference would improve a performance of the query for data relative to splitting the table based on the one or more properties of the table, or

      the one or more properties of the table when it is determined by the database table accessor that splitting the table based on the one or more properties of the table would improve the performance of the query for data relative to splitting the table based on the user split preference;

    generating, by the selected splits generator, table splits dividing the user query into a plurality of query splits; and

    outputting the plurality of query splits to an associated plurality of mappers for execution by the associated plurality of mappers of each of the plurality of query splits as tasks of a selected data processing framework against the table.
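
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is one possible reading of the claim language, not the patented implementation. Every name in it (`SplitterKind` and its three members, `Query`, `TableInfo`, `QuerySplit`, `select_kind`, `generate_splits`, `run_on_mappers`) is hypothetical, as are the constants and the cost heuristic used to decide whether the user split preference "would improve a performance of the query"; the claim does not say how that determination is made.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

MB = 1024 * 1024
TARGET_SPLIT_BYTES = 64 * MB  # assumed target amount of data per split
PER_SPLIT_OVERHEAD = 8 * MB   # assumed fixed cost of one mapper task
FIXED_SPLIT_COUNT = 4         # assumed split count for the FIXED_COUNT kind


class SplitterKind(Enum):
    """Hypothetical enumeration of splitter kinds."""
    SINGLE = "single"            # whole table as one split
    FIXED_COUNT = "fixed_count"  # always FIXED_SPLIT_COUNT splits
    SIZE_BASED = "size_based"    # split count derived from the table size


@dataclass
class Query:
    sql: str                                       # query data
    preferred_kind: Optional[SplitterKind] = None  # user split preference


@dataclass
class TableInfo:
    total_bytes: int  # table size data
    total_rows: int


@dataclass
class QuerySplit:
    sql: str
    first_row: int  # inclusive
    end_row: int    # exclusive


def split_count(kind, table):
    """Number of splits a given splitter kind would produce for the table."""
    if kind is SplitterKind.SINGLE:
        return 1
    if kind is SplitterKind.FIXED_COUNT:
        return FIXED_SPLIT_COUNT
    return max(1, -(-table.total_bytes // TARGET_SPLIT_BYTES))  # ceiling division


def estimated_cost(kind, table):
    """Assumed heuristic: data handled per split plus per-task overhead."""
    n = split_count(kind, table)
    return table.total_bytes / n + PER_SPLIT_OVERHEAD * n


def kind_from_properties(table):
    """Choice based only on table properties (here: total size)."""
    if table.total_bytes <= TARGET_SPLIT_BYTES:
        return SplitterKind.SINGLE
    return SplitterKind.SIZE_BASED


def select_kind(query, table):
    """Use the user preference only when it is estimated to be cheaper."""
    by_properties = kind_from_properties(table)
    preferred = query.preferred_kind
    if preferred is not None and (
        estimated_cost(preferred, table) < estimated_cost(by_properties, table)
    ):
        return preferred
    return by_properties


def generate_splits(kind, query, table):
    """Divide the query into contiguous row-range splits."""
    n = min(split_count(kind, table), max(1, table.total_rows))
    bounds = [table.total_rows * i // n for i in range(n + 1)]
    return [QuerySplit(query.sql, lo, hi) for lo, hi in zip(bounds, bounds[1:])]


def run_on_mappers(splits, mapper):
    """Stand-in for handing each split to a mapper task; runs sequentially."""
    return [mapper(split) for split in splits]
```

In this sketch, `select_kind` plays the role of the database table accessor's decision, and `run_on_mappers` only mimics the hand-off to a data processing framework by calling a function once per split. Under the assumed heuristic, a 100 MB table with a `FIXED_COUNT` preference keeps the user's choice, while a 1 GB table falls back to the size-based splitter.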
