SYSTEM AND METHOD FOR MEASURING THE QUALITY OF DOCUMENT SETS
Abstract
Systems and methods are described that calculate the interestingness of a set of one or more records in a database, either absolutely (i.e., compared to an overall collection of records) or relative to some other set of records. In one embodiment, the measure is a relative entropy value that has been normalized. Various applications of the measure are described in the context of an information retrieval system. These applications include, for example, guiding query interpretation, guiding view selection and summarization, intelligent ranges, event detection, concept triggers and interpreting user actions, hierarchy discovery, and adaptive data mining.
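The relative entropy idea in the abstract can be illustrated with a short sketch. It builds the empirical distribution of one characteristic over a result set and over the whole collection, then computes the relative entropy (Kullback-Leibler divergence) of the result against that baseline. The function names, the made-up sample data, and in particular the `squash` normalization are assumptions for illustration only; the abstract says the value is normalized but does not say how.

```python
import math
from collections import Counter


def distribution(values):
    """Empirical distribution of one characteristic over a set of records."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def relative_entropy(result_dist, baseline_dist):
    """Relative entropy D(result || baseline), in bits.

    Assumes every value seen in the result also occurs in the baseline,
    which holds when the result is drawn from the baseline collection.
    """
    return sum(
        p * math.log2(p / baseline_dist[value])
        for value, p in result_dist.items()
        if p > 0
    )


def squash(divergence):
    """Hypothetical normalization into [0, 1); not taken from the source."""
    return 1.0 - 2.0 ** (-divergence)


# Made-up data: one characteristic ("type") per record.
collection = ["red"] * 50 + ["white"] * 30 + ["sparkling"] * 20
result = ["sparkling"] * 8 + ["red"] * 2

baseline = distribution(collection)
score = relative_entropy(distribution(result), baseline)
normalized = squash(score)
```

A result whose distribution matches the baseline scores zero, and the score grows as the result concentrates on values that are rare in the collection as a whole.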
54 Claims
1. A method for measuring the distinctiveness of a result generated from a collection of information, wherein the result is comprised of elements associated with the collection of information, the method comprising:
analyzing the result to obtain a statistical distribution of at least one identifying characteristic within the result;
generating a measurement of distinctiveness for the result based on the statistical distribution of the at least one identifying characteristic; and
comparing the measured statistical distribution against a baseline statistical distribution.
Dependent claims: 2-26.
27. A method, operative in an information retrieval system, for improving a user's interaction with data stored and accessible from the system, comprising:
relating a first and second sets of items in the information retrieval system to generate a salience measure; and
using the salience measure to guide a subsequent user interaction with the information retrieval system.
Dependent claims: 28-31.
32. A computer-readable medium having computer-readable instructions stored thereon that define instructions that, as a result of being executed by a computer, instruct the computer to perform a method measuring the distinctiveness of a result generated from a collection of information, wherein the result is comprised of elements associated with the collection of information, the method comprising the acts of:
analyzing the result to obtain a statistical distribution of at least one identifying characteristic within the result;
generating a measurement of distinctiveness for the result based on the statistical distribution of the at least one identifying characteristic; and
comparing the measured statistical distribution against a baseline statistical distribution.
33. A system for measuring the distinctiveness of a result generated from a collection of information, wherein the result is comprised of elements associated with the collection of information, the system comprising:
an analysis component adapted to obtain a statistical distribution of at least one identifying characteristic;
a measurement component adapted to generate a measurement of distinctiveness for the result based on the statistical distribution of the at least one identifying characteristic; and
a comparison component adapted to compare the measured statistical distribution against a baseline statistical distribution.
Dependent claims: 34-54.
Specification