
Multisource semantic partitioning

  • US 11,157,473 B2
  • Filed: 11/21/2014
  • Issued: 10/26/2021
  • Est. Priority Date: 11/21/2014
  • Status: Active Grant
First Claim

1. A computer-implemented method for federated query processing comprising:

  • receiving one or more source queries associated with a data set;

    storing the one or more source queries as one or more historical queries;

    storing at least one statistic for each of the one or more historical queries, the at least one statistic including a size for each of the one or more historical queries;

    determining one or more column constant pairs associated with the one or more historical queries, each column constant pair identifying a column and a corresponding value;

    based on the one or more column constant pairs, determining a partitioning column constant pair, wherein the one or more column constant pairs corresponding to a first subset of the one or more historical queries have a first pre-defined relation to the partitioning column constant pair, the first pre-defined relation having a size difference between +10% or −10%, wherein the one or more column constant pairs corresponding to a second subset of the one or more historical queries have a second pre-defined relation to the partitioning column constant pair, the second pre-defined relation having a size difference of less than 10%, and wherein the first subset of the one or more historical queries is within a pre-determined size corresponding to the second subset of the one or more historical queries;

    based on the determined partitioning column constant pair, partitioning the data set into a first subset of the data set and a second subset of the data set;

    storing, in a data store, associations between the determined partitioning column constant pair and both of the first subset of the data set and the second subset of the data set;

    after the partitioning, determining a source column constant pair associated with a received source query;

    comparing the source column constant pair to the partitioning column constant pair;

    based on the comparing, generating a result corresponding to the received source query from at least one of the following: a view, the first subset of the data set, and the second subset of the data set; and

    joining the first subset of the data set and the second subset of the data set when the one or more historical queries including an “or” operator are determined to have a greater size than the one or more historical queries including an “and” operator.
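
The core of the claimed flow (mining column constant pairs from stored queries, choosing a partitioning pair, splitting the data set, and routing a later query) might be sketched in Python as follows. This is an illustrative reading, not the patented implementation. All names here (`HistoricalQuery`, `choose_partitioning_pair`, `partition`, `answer`) are hypothetical. The sketch assumes the two "pre-defined relations" are "less than" and "greater than or equal to" the partitioning constant, and it interprets the 10% condition as the two query subsets having total stored sizes within 10% of each other. The view and the "or"/"and" join condition are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HistoricalQuery:
    pairs: tuple  # column constant pairs, e.g. (("year", 2015),)
    size: int     # stored statistic: the size recorded for this query


def choose_partitioning_pair(history, tolerance=0.10):
    """Return the pair whose two query subsets are closest in total size.

    Assumption: 'first relation' means constant < candidate and
    'second relation' means constant >= candidate, on the same column.
    Returns None when no candidate balances within the tolerance.
    """
    candidates = sorted({pair for q in history for pair in q.pairs})
    best, best_gap = None, None
    for column, constant in candidates:
        low = sum(q.size for q in history
                  for c, v in q.pairs if c == column and v < constant)
        high = sum(q.size for q in history
                   for c, v in q.pairs if c == column and v >= constant)
        if low == 0 or high == 0:
            continue  # this constant does not split the history at all
        gap = abs(low - high) / max(low, high)
        if gap <= tolerance and (best_gap is None or gap < best_gap):
            best, best_gap = (column, constant), gap
    return best


def partition(rows, pair):
    """Split the data set into the two subsets defined by the pair."""
    column, constant = pair
    first = [r for r in rows if r[column] < constant]
    second = [r for r in rows if r[column] >= constant]
    return first, second


def answer(source_pair, pair, first, second):
    """Route an equality query (column == constant) to the subset holding it."""
    column, constant = source_pair
    if column != pair[0]:
        rows = first + second  # different column: both subsets are needed
    elif constant < pair[1]:
        rows = first
    else:
        rows = second
    return [r for r in rows if r[column] == constant]
```

With a history whose stored sizes are 100 and 60 for constants below `("year", 2016)` and 150 at it, the low and high totals (160 and 150) differ by less than 10%, so that pair would be selected, and a later query on `year == 2014` would touch only the first subset.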
