SYSTEM AND METHOD FOR MEASURING THE QUALITY OF DOCUMENT SETS
Abstract
Systems and methods are described that calculate the interestingness of a set of one or more records in a database, either absolutely (i.e., compared to an overall collection of records) or relative to some other set of records. In one embodiment, the measure is a relative entropy value that has been normalized. Various applications of the measure are described in the context of an information retrieval system. These applications include, for example, guiding query interpretation, guiding view selection and summarization, intelligent ranges, event detection, concept triggers and interpreting user actions, hierarchy discovery, and adaptive data mining.
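The abstract does not say how the relative entropy value is normalized. The following is a minimal sketch, not the patented method. It assumes that each document is a list of terms, that the set is drawn from the collection, and that the score is the Kullback-Leibler divergence of the set's term distribution from the collection's, divided by its upper bound log(1 / min q). That normalization and the function names are illustrative assumptions.

```python
import math
from collections import Counter


def term_distribution(docs):
    """Relative frequency of each term across a list of tokenized documents."""
    counts = Counter(term for doc in docs for term in doc)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def normalized_relative_entropy(subset, collection):
    """Score in [0, 1]: 0 when the subset mirrors the collection's term mix.

    Assumes every term in `subset` also occurs in `collection`
    (the subset is drawn from the collection); otherwise a KeyError is raised.
    """
    p = term_distribution(subset)
    q = term_distribution(collection)
    if not p or not q:
        return 0.0
    # Relative entropy KL(P || Q) over the terms present in the subset.
    kl = sum(p_t * math.log(p_t / q[t]) for t, p_t in p.items())
    # Assumed normalization: KL(P || Q) <= log(1 / min_t Q(t)).
    bound = math.log(1.0 / min(q.values()))
    return kl / bound if bound > 0 else 0.0
```

Under these assumptions, a set whose term distribution matches the whole collection scores 0, and a set concentrated on terms that are rare in the collection scores closer to 1.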
52 Claims
1. A method for optimizing results returned from interaction with a collection of information, the method comprising the acts of:
establishing criteria associated with at least one operation on a collection of information, wherein the criteria is based, at least in part, on a measurement of the distinctiveness of a set of results;
determining the set of results from interaction with a collection of information;
modifying the set of results according to the at least one operation in response to a determination that the set of results matches the criteria; and
outputting a modified result.
Dependent claims: 2-24.
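The four acts of claim 1 can be read as a small pipeline. The sketch below is one hypothetical reading, not the claimed implementation: `Rule`, `run_query`, the substring match used for retrieval, and the caller-supplied `measure` function are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

Results = List[str]


@dataclass
class Rule:
    """Criteria on a distinctiveness score, paired with an operation."""
    matches: Callable[[float], bool]
    operation: Callable[[Results], Results]


def run_query(collection: Results, query: str,
              measure: Callable[[Results], float],
              rules: List[Rule]) -> Results:
    # Determine the set of results (toy retrieval: substring match).
    results = [doc for doc in collection if query in doc]
    # Measure the distinctiveness of the result set once, up front.
    score = measure(results)
    # Modify the results for every rule whose criteria the score matches.
    for rule in rules:
        if rule.matches(score):
            results = rule.operation(results)
    # Output the (possibly modified) result.
    return results
```

Here the criteria are established before the query runs, as `Rule` objects; a rule might, for example, truncate the results whenever the score falls below a threshold. The distinctiveness measure is passed in rather than fixed, since the claim does not tie the method to one particular measure.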
25. A method for information processing, comprising:
generating a salience measure that measures a quality of a set of documents based on their ambiguity relative to similarly sized sets of documents; and
using the distinctiveness value to take a given action in an information retrieval system.
Dependent claims: 26, 27.
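Claim 25 compares a set of documents against similarly sized sets. One way to illustrate that idea, offered only as a sketch under stated assumptions, is to score the candidate set and report the fraction of random same-size samples from the collection that score lower. The function name, the caller-supplied `score` callable, the random-sampling baseline, and the `trials` and `seed` parameters are all hypothetical.

```python
import random


def size_adjusted_salience(candidate, collection, score, trials=200, seed=0):
    """Fraction of random same-size sets that score below the candidate.

    `score` is any function mapping a list of documents to a number.
    Requires len(candidate) <= len(collection); random.sample raises
    ValueError otherwise.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same value
    size = len(candidate)
    observed = score(candidate)
    baseline = [score(rng.sample(collection, size)) for _ in range(trials)]
    return sum(1 for value in baseline if value < observed) / trials
```

A value near 1 means the candidate set scores higher than almost every random set of the same size; a value near 0 means it scores lower than almost all of them. Comparing against same-size samples keeps small sets from looking unusual merely because they are small.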
28. A computer-readable medium having computer-readable instructions stored thereon that define instructions that, as a result of being executed by a computer, instruct the computer to perform a method for optimizing results returned from interaction with a collection of information, the method comprising:
establishing criteria associated with at least one operation on a collection of information, wherein the criteria is based, at least in part, on a measurement of the distinctiveness of a set of results;
determining the set of results from interaction with a collection of information;
modifying the set of results according to the at least one operation in response to a determination that the set of results matches the criteria; and
outputting a modified result.
29. A system for optimizing results returned from interaction with a collection of information, the system comprising:
a rules engine adapted to establish criteria associated with at least one operation on a collection of information, wherein execution of the operation is based on a measurement of the distinctiveness of the set of results;
a measurement engine adapted to measure the distinctiveness of a set of results;
a retrieval engine adapted to return a set of results from a collection of information in response to interaction with the collection of information;
a modification engine adapted to modify the set of results according to the at least one operation in response to a determination that the set of results matches the established criteria; and
an output engine adapted to output the modified result.
Dependent claims: 30-52.
Specification