Statistical representation of skewed data

  • US 7,386,536 B1
  • Filed: 12/31/2003
  • Issued: 06/10/2008
  • Est. Priority Date: 12/31/2003
  • Status: Active Grant
First Claim

1. A method for representing statistics about a table including one or more rows, each row including a respective value, the method including:

    creating one or more histogram buckets, each histogram bucket including a width representing a respective range of values and a height representing a count of rows in the table having values in the range of values;

    creating one or more high-bias buckets, each high-bias bucket including one or more high-bias values up to a maximum number of high-bias values (F) that appear in a minimum percentage of rows in the table and for each high-bias value a number of rows that contain the high-bias value;

    where the minimum percentage of rows is computed using F and B, where B is the total number of buckets;

    repeating the following;

    (a) determining an average height of the histogram buckets;

    (b) determining a reclassification threshold based on the average height of the histogram buckets; and

    (c) concluding that a value associated with one of the one or more histogram buckets occurs in more rows of the table than the reclassification threshold, and, in response, concluding that the number of high-bias values associated with at least one of the one or more high-bias buckets has not reached the maximum number of high-bias values, and, in response, including the value in one of the high-bias buckets for which the number of high-bias values has not reached the maximum number of high-bias values;

    until no values included in any of the ranges of values associated with the histogram buckets occur in more than the reclassification threshold number of rows in the table; and

    saving in a memory the width and the height of each of the one or more histogram buckets and the one or more high-bias values and numbers of rows for each of the one or more high-bias buckets.
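
The steps recited in the claim can be illustrated with a short Python sketch. It is only one possible reading, not the patented implementation, and several details are assumptions: the claim says the minimum percentage is computed from F and B but gives no formula here, so the sketch takes it as a caller-supplied `min_fraction`; the reclassification threshold is assumed to be `factor` times the average bucket height; histogram buckets are assumed to be contiguous groups of sorted distinct values; and the loop is assumed to stop once every high-bias bucket is full. All function and parameter names (`build_histogram`, `place`, `build_statistics`, `factor`) are hypothetical.

```python
from collections import Counter


def build_histogram(counts, num_buckets):
    """Group sorted distinct values into contiguous (low, high, height) buckets.

    The grouping rule is an assumption; the claim only requires a width
    (range of values) and a height (row count) for each bucket.
    """
    values = sorted(counts)
    if not values:
        return []
    n = min(num_buckets, len(values))
    size, extra = divmod(len(values), n)
    buckets, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        group = values[start:end]
        buckets.append((group[0], group[-1], sum(counts[v] for v in group)))
        start = end
    return buckets


def place(high_bias, value, count, max_values):
    """Store value in the first high-bias bucket holding fewer than F values."""
    for bucket in high_bias:
        if len(bucket) < max_values:
            bucket[value] = count
            return True
    return False  # every high-bias bucket is full


def build_statistics(rows, hist_buckets, bias_buckets, max_values,
                     min_fraction, factor=1.0):
    counts = Counter(rows)
    total = len(rows)
    high_bias = [dict() for _ in range(bias_buckets)]

    # Initial high-bias buckets: values appearing in at least min_fraction
    # of the rows (min_fraction is supplied by the caller, an assumption).
    for value, count in counts.most_common():
        if count / total < min_fraction:
            break
        if not place(high_bias, value, count, max_values):
            break
        del counts[value]

    # Repeat steps (a) to (c) until no remaining value exceeds the threshold.
    while True:
        histogram = build_histogram(counts, hist_buckets)
        if not histogram:
            break
        average = sum(height for _, _, height in histogram) / len(histogram)  # (a)
        threshold = factor * average  # (b), assumed formula
        value, count = counts.most_common(1)[0]
        if count <= threshold:
            break
        if not place(high_bias, value, count, max_values):  # (c)
            break  # assumed stop condition when no bucket has room
        del counts[value]

    # "Saving in a memory" is represented here by returning both structures.
    return histogram, high_bias
```

For example, with 100 rows in which the value 1 appears 50 times, the value 2 appears 30 times, and the values 3 through 22 appear once each, calling `build_statistics(rows, 4, 1, 3, 0.4)` places 1 in the high-bias bucket during the initial pass, reclassifies 2 on the first loop iteration (30 rows against an average height of 12.5), and then stops because no remaining value exceeds the new average height of 5.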
