SYSTEM AND METHOD FOR MEASURING THE QUALITY OF DOCUMENT SETS
Abstract
Systems and methods are described that calculate the interestingness of a set of one or more records in a database, either absolutely (i.e., compared to an overall collection of records) or relative to some other set of records. In one embodiment, the measure is a relative entropy value that has been normalized. Various applications of the measure are described in the context of an information retrieval system. These applications include, for example, guiding query interpretation, guiding view selection and summarization, intelligent ranges, event detection, concept triggers and interpreting user actions, hierarchy discovery, and adaptive data mining.
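The abstract names a normalized relative entropy as one embodiment of the measure but does not give a formula here. The following is a minimal sketch of the idea, not the patented method: it computes the relative entropy (Kullback-Leibler divergence) of one attribute's distribution in a record subset against the same attribute in the whole collection. The function names and the normalization (dividing by log2(N/n), the largest value the divergence can take when the n-record subset is drawn from the N-record collection) are assumptions made for illustration.

```python
import math
from collections import Counter


def relative_entropy(subset_values, collection_values):
    """Relative entropy D(p || q) in bits.

    p is the distribution of an attribute within the subset,
    q is the distribution of the same attribute in the whole collection.
    """
    p_counts = Counter(subset_values)
    q_counts = Counter(collection_values)
    n_p = len(subset_values)
    n_q = len(collection_values)
    total = 0.0
    for value, count in p_counts.items():
        if q_counts[value] == 0:
            # The sketch assumes the subset is drawn from the collection.
            raise ValueError(f"value {value!r} does not occur in the collection")
        p = count / n_p
        q = q_counts[value] / n_q
        total += p * math.log2(p / q)
    return total


def normalized_distinctiveness(subset_values, collection_values):
    """Hypothetical normalization to [0, 1] (an assumption, not from the source).

    For a subset of n records taken from N records, D(p || q) <= log2(N / n),
    so dividing by that bound yields a score between 0 and 1.
    """
    n_p = len(subset_values)
    n_q = len(collection_values)
    if n_p == 0 or n_p >= n_q:
        return 0.0  # an empty subset or the whole collection is not distinctive
    return relative_entropy(subset_values, collection_values) / math.log2(n_q / n_p)
```

Under these assumptions, `normalized_distinctiveness(["a", "a"], ["a", "a", "b", "b"])` returns 1.0 because the subset is as concentrated as a two-record subset can be, while a subset that mirrors the collection's distribution scores 0.0.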
43 Claims
1. A method for organizing a database, the method comprising:
analyzing the database for a statistical distribution of at least one identifying characteristic;
generating a measurement of distinctiveness based on the statistical distribution of the at least one identifying characteristic;
identifying at least one similar group of elements within the database based on the measurement of distinctiveness;
generating a descriptor associated with the identified at least one similar group of elements; and
organizing the database based on the descriptor.
Dependent claims: 2-22
23. A computer-readable medium having computer-readable instructions stored thereon that define instructions that, as a result of being executed by a computer, instruct the computer to perform a method for organizing a database, the method comprising the acts of:
analyzing the database for a statistical distribution of at least one identifying characteristic;
generating a measurement of distinctiveness based on the statistical distribution of the at least one identifying characteristic;
identifying at least one similar group of elements within the database based on the measurement of distinctiveness;
generating a descriptor associated with the identified at least one similar group of elements; and
organizing the database based on the descriptor.
24. A system for organizing a database, the system comprising:
an analysis component adapted to determine a measurement of distinctiveness based on a statistical distribution of at least one identifying characteristic;
a generation component adapted to generate a descriptor for at least one element of the database based on the measurement of distinctiveness; and
an organization component adapted to group a plurality of elements within the database based on the at least one description.
Dependent claims: 25-43
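The five steps recited in claim 1 can be illustrated with a small sketch. It is one possible reading under stated assumptions, not the claimed implementation: candidate groups are formed from records sharing a field (`group_key`), distinctiveness is an unnormalized relative entropy in bits, the descriptor is the most over-represented value in the group, and "organizing" means building a mapping from descriptor to records. All function and parameter names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def distribution(values):
    """Relative frequency of each value (the statistical distribution)."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def distinctiveness(group_dist, overall_dist):
    """Relative entropy (bits) of a group's distribution against the database."""
    return sum(p * math.log2(p / overall_dist[v]) for v, p in group_dist.items())


def organize(records, group_key, characteristic, threshold):
    """Group records under descriptors; every name here is an assumption."""
    # Step 1: analyze the database for the characteristic's distribution.
    overall = distribution(r[characteristic] for r in records)

    # Candidate groups: records that share the same group_key value (assumed).
    candidates = defaultdict(list)
    for record in records:
        candidates[record[group_key]].append(record)

    organized = {}
    for members in candidates.values():
        group_dist = distribution(m[characteristic] for m in members)
        # Step 2: generate a measurement of distinctiveness.
        score = distinctiveness(group_dist, overall)
        # Step 3: keep only groups that are distinctive enough.
        if score < threshold:
            continue
        # Step 4: descriptor = the value most over-represented in the group.
        descriptor = max(group_dist, key=lambda v: group_dist[v] / overall[v])
        # Step 5: organize the database by descriptor.
        organized.setdefault(descriptor, []).extend(members)
    return organized


records = [
    {"source": "A", "topic": "wine"},
    {"source": "A", "topic": "wine"},
    {"source": "B", "topic": "wine"},
    {"source": "B", "topic": "beer"},
    {"source": "C", "topic": "beer"},
    {"source": "C", "topic": "beer"},
]
organized = organize(records, group_key="source", characteristic="topic", threshold=0.5)
```

In this toy data, sources A and C each concentrate on a single topic and are kept, while source B mirrors the overall topic distribution and is filtered out by the threshold.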
Specification