×

Efficient top-K query evaluation on probabilistic data

  • US 7,814,113 B2
  • Filed: 11/05/2007
  • Issued: 10/12/2010
  • Est. Priority Date: 11/07/2006
  • Status: Expired due to Fees
First Claim
Patent Images

1. A computer-implemented method for efficiently automatically determining a number of top-rated probabilistic entities selected from a group of probabilistic entities to satisfy a condition, wherein the top-rated probabilistic entities are rated on a criteria that is computed for a set of probabilistic entities that satisfy the condition, the method comprising the steps of:

  • (a) determining an initial range of criteria for each probabilistic entity in the set of probabilistic entities, wherein determining the initial range includes performing a probabilistic, simulation-based computation on each probabilistic entity in the set of entities;

    (b) computing a current critical range of criteria, based upon the ranges of criteria that were determined for each probabilistic entity;

    (c) selecting a subset of probabilistic entities from the set on which to run further iterative computations to determine a refined range of criteria for each probabilistic entity of the subset of probabilistic entities, wherein selection of probabilistic entities to be included in the subset is based upon the range of criteria previously determined for the probabilistic entities, and wherein the subset of probabilistic entities includes an entity with first data and another entity with second data that has a lower computed probability of being accurate than a computed probability of the first data being accurate;

    (d) repeating steps (b) and (c) until a current critical range does not include any portion of a refined range of criteria for any of the probabilistic entities in the subset, the number of probabilistic entities that are above the current critical range then comprising the number of top-rated probabilistic entities; and

    (e) presenting the number of top-rated probabilistic entities to a user, wherein the step of computing the current critical range of criteria comprises the steps of;

    setting a lower critical bound for the current critical range of criteria based upon a topk refined lower bound, determined by running the computations on the probabilistic entities, where the topk refined lower bound is a kth largest refined lower bound of the probabilistic entities; and

    setting an upper critical bound for the current critical range based upon a topk+1 refined upper bound for the probabilistic entities, determined by running the computations on the probabilistic entities, where the topk+1 refined upper bound is a k +1th largest refined upper bound of the probabilistic entities.

View all claims
  • 3 Assignments
Timeline View
Assignment View
    ×
    ×