×

Methods and apparatus for ranking uncertain data in a probabilistic database

  • US 8,825,640 B2
  • Filed: 03/16/2009
  • Issued: 09/02/2014
  • Est. Priority Date: 03/16/2009
  • Status: Active Grant
First Claim
Patent Images

1. A method implemented by a database server to rank non-deterministic data stored in a database, the method comprising:

  • storing a set of data tuples representing a plurality of possible instantiations of the non-deterministic data in a database memory, a first data tuple in the set of data tuples taking on a first one of a set of possible data tuple instantiations selectable according to a first probability representing a likelihood of occurrence of the first one of the set of possible data tuple instantiations, respective ones of the possible instantiations of the non-deterministic data being realized when the database server selects a respective combination of possible data tuple instantiations for a combination of data tuples in the set of data tuples, the respective ones of the possible instantiations of the non-deterministic data being associated with respective instantiation probabilities;

    in response to a query requesting a first number of data tuples collectively exhibiting a highest ranking among the set of data tuples, determining, using a processor, a ranking of the set of data tuples for output by the database server by determining an expected rank for the first data tuple, the expected rank for the first data tuple corresponding to a combination of multiple weighted component ranks for the first data tuple across the plurality of possible instantiations of the non-deterministic data, respective ones of the weighted component ranks for the first data tuple comprising a component rank, which represents a respective rank of the first data tuple in a respective one of the possible instantiations of the non-deterministic data, weighted by the respective instantiation probability for the respective one of the possible instantiations of the non-deterministic data; and

    outputting a first ranked subset of the data tuples containing the first number of data tuples determined to exhibit the highest ranking among the set of data tuples, wherein the first ranked subset of the data tuples exhibits all of an exactness property, a containment property, a unique ranking property, a value invariance property and a stability property.

View all claims
  • 1 Assignment
Timeline View
Assignment View
    ×
    ×