×

METHOD FOR MAINTAINING A SAMPLE SYNOPSIS UNDER ARBITRARY INSERTIONS AND DELETIONS

  • US 20080154541A1
  • Filed: 12/22/2006
  • Published: 06/26/2008
  • Est. Priority Date: 12/22/2006
  • Status: Active Grant
First Claim
Patent Images

1. A method of incrementally maintaining a stable, bounded, uniform random sample S from a dataset R, in the presence of arbitrary insertions and deletions of an item t to the dataset R, and without accesses to the dataset R, the sample S having a maximum size of M items, the method comprising the steps of:

  • initializing to zero a counter c1 for uncompensated deletions that have been applied to sample S;

    initializing to zero a counter c2 for uncompensated deletions that have not been applied to sample S;

    while transactions comprising insertions and deletions are applied to dataset R, each transaction having a transaction type selected from a group consisting of an insertion transaction and a deletion transaction, performing steps A through C;

    A. receiving a transaction applied to dataset R for an item t;

    B. if the received transaction is an insertion transaction, thenif the total number c1+c2 of uncompensated deletions is zero, then performing the steps ofincluding the item t into the sample S with probability equal to the value of M divided by the maximum of size |R| and M, wherein |R| is the size of the dataset R just after insertion of item t,adding item t to sample S if |S|<

    M, andreplacing a randomly selected item in sample S by item t if |S|<

    =Motherwise if the total number c1+c2 of uncompensated deletions is non-zero, then performing the steps ofinserting item t into the sample S with probability equal to the value of c1/(c1+c2),decrementing by 1 the value of counter c1 if the item t is inserted into sample S; and

    decrementing by 1 the value of counter c2 if the item t is not inserted into sample S; and

    C. if the received transaction is a deletion transaction, thenif the deleted item t is in the sample S, then performing the steps ofremoving the deleted item t from S andincreasing the value of the counter c1 by one, otherwise, performing the step ofincreasing the value of the counter c2 by one.

View all claims
  • 3 Assignments
Timeline View
Assignment View
    ×
    ×