
Adaptive aggregation: improving the performance of grouping and duplicate elimination by avoiding unnecessary disk access

  • US 8,352,470 B2
  • Filed: 05/23/2008
  • Issued: 01/08/2013
  • Est. Priority Date: 05/23/2008
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method for use with information stored in blocks on a storage medium, comprising:

    using an aggregation method, reading blocks of the information from the storage medium into a memory of the computer until the memory is full or until all the information has been read into the memory;

    removing duplicates of the blocks as each respective disk block is read into the memory of the computer;

    determining a number, k, of blocks to be written back to the storage medium from the memory, wherein k is an estimate of how many blocks need to be written back to the storage medium, based on an estimate of the extent of duplication within the information;

    selecting k blocks from the memory, sorting the selected blocks, ranking a group of the selected k blocks based on a number of early aggregation operations done for a group of the selected k blocks with a same value, and writing the sorted blocks with lowest rank as a new sublist to the storage medium;

    iterating the steps of reading, determining, selecting, sorting, and writing sublists;

    merging the sublists to form an aggregation result; and

    outputting partial results of the forming of the aggregation result without writing the partial results to disk.
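
The steps recited above can be illustrated with a minimal sketch. It is not the patented implementation: it treats each record as a bare key, uses a count as the aggregate, stands in for the storage medium with in-memory lists, and estimates duplication as the fraction of recently read records that matched a group already in memory. The names `adaptive_aggregate`, `capacity`, `table`, and `sublists`, and the formula for k, are assumptions introduced for illustration only.

```python
import heapq
import math
from itertools import groupby
from operator import itemgetter


def adaptive_aggregate(records, capacity):
    """Yield (key, count) pairs in key order; capacity (>= 1) is the number
    of groups that fit in memory. Hypothetical sketch, not the claimed method."""
    table = {}         # key -> [count, hits]; hits = early aggregation operations
    sublists = []      # stand-in for sorted sublists written to the storage medium
    reads = hits = 0   # statistics gathered since the last write-back

    for key in records:
        group = table.get(key)
        if group is None and len(table) >= capacity:
            # Memory is full: estimate the extent of duplication, derive k.
            dup = hits / reads if reads else 0.0
            k = max(1, math.ceil(capacity * (1.0 - dup)))
            # Rank groups by early aggregation operations; evict the lowest.
            victims = heapq.nsmallest(k, table, key=lambda g: table[g][1])
            # Sort the evicted groups and "write" them as a new sublist.
            sublists.append(sorted((g, table.pop(g)[0]) for g in victims))
            reads = hits = 0
        reads += 1
        if group is None:
            table[key] = [1, 0]
        else:
            group[0] += 1   # early aggregation: fold the duplicate in memory
            group[1] += 1
            hits += 1

    # Groups still in memory form the last sublist and are never written out.
    sublists.append(sorted((g, c) for g, (c, _) in table.items()))

    # Merge the sublists; each result is yielded as soon as it is complete.
    merged = heapq.merge(*sublists, key=itemgetter(0))
    for key, run in groupby(merged, key=itemgetter(0)):
        yield key, sum(count for _, count in run)
```

For example, `list(adaptive_aggregate("babcadbea", 3))` returns `[('a', 3), ('b', 3), ('c', 1), ('d', 1), ('e', 1)]`. In this sketch, heavy duplication drives k toward 1, so frequently matched groups stay in memory, while an input with no duplicates drives k up to `capacity`. Because the function is a generator, merged results reach the caller one at a time rather than being staged in an intermediate output file.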
