Computer automated discovery of interestingness in faceted search

  • US 7,493,319 B1
  • Filed: 05/09/2008
  • Issued: 02/17/2009
  • Est. Priority Date: 10/22/2007
  • Status: Active Grant
First Claim

1. A computer program product that includes a computer readable storage media, the media having stored thereon a sequence of instructions, when executed by the processor, causes the processor to perform an enhanced on-line analytical processing faceted search, by:

  • receiving at least one keyword as a constraint value for a query;

    receiving input in regards to a selection of a probability baseline distribution value, wherein the probability baseline distribution value is determined as a product of an absolute baseline or a relative baseline;

    receiving input in regard to a selection of a metric to determine a distance between a normalized probability distribution of search results on a facet set and a baseline distribution value;

    determining a set of candidate facet, wherein encoding said set of candidate facet by pre-pending said set of candidate facet with a path from root to said set of candidate facet in a facet hierarchy, the candidate facet sets being based upon the keyword constraint value;

    determining a probability distribution of the search results on a facet set and a baseline distribution value utilizing bit-set trees, wherein the utilization of bit-set trees contributes to the increased speed in determining the probability distribution of the search results;

    eliminating uninteresting candidate facet combinations in an instance that a number of values within the probability distribution of the search results exceed a predetermined threshold, wherein said uninteresting candidate facet combinations are not within the same said facet hierarchy;

    determining most interesting facet combinations;

    returning for each interesting facet combination, a small number of most interesting values in it, the small number of most interesting values being the values whose associated probability differs the most between the query distribution and the baseline distribution; and

    approximating the distance between the normalized probability distribution of search results on a facet set and a baseline distribution utilizing a random sample from the probability distribution.
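
The core of the claimed steps (computing a facet's probability distribution over the search results, measuring its distance from a baseline, and returning the values that differ most) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the function names, the toy `color` facet, the choice of L1 distance as the selected metric, and the use of flat integer bitmasks in place of the claimed bit-set trees. Random-sample approximation of the distance is not shown.

```python
def to_bits(doc_ids):
    """Encode a set of document ids as an integer bitmask (hypothetical helper)."""
    return sum(1 << i for i in doc_ids)


def popcount(bits):
    """Count the set bits, i.e. the number of documents in the bit-set."""
    return bin(bits).count("1")


def distribution(facet_bitsets, result_bits):
    """Normalized distribution of a facet's values over the documents in result_bits."""
    counts = {v: popcount(b & result_bits) for v, b in facet_bitsets.items()}
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()} if total else {}


def l1_distance(p, q):
    """L1 distance between two distributions (one possible metric, assumed here)."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


def most_interesting_values(p, q, k=2):
    """The k values whose probability differs most between query and baseline."""
    keys = set(p) | set(q)
    return sorted(keys, key=lambda v: (-abs(p.get(v, 0.0) - q.get(v, 0.0)), v))[:k]


# Toy corpus of 10 documents with a single facet, "color".
color = {
    "red": to_bits([0, 1, 2, 3]),
    "blue": to_bits([4, 5, 6, 7]),
    "green": to_bits([8, 9]),
}
all_docs = to_bits(range(10))              # absolute baseline: the whole corpus
query_hits = to_bits([0, 1, 2, 3, 4, 8])   # documents matching the keyword

baseline = distribution(color, all_docs)
query = distribution(color, query_hits)
distance = l1_distance(query, baseline)
top_values = most_interesting_values(query, baseline, k=2)
```

In this toy data, "red" is over-represented in the query results relative to the corpus and "blue" is under-represented, so those two values rank ahead of "green". A facet whose distance from the baseline is larger would be treated as more interesting under this sketch.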
