Method and apparatus for profile score threshold setting and updating
First Claim
1. A system for filtering documents, comprising:
means for selecting a document profile and an expected document delivery ratio;
means for scoring a reference set of documents according to said document profile;
means for determining an assigned score threshold corresponding to said expected document delivery ratio;
means for determining a utility function by calculating a utility for each of said documents in said reference set;
means for determining a first utility threshold based on said utility function, wherein said first utility threshold (θ_opt) is the highest utility over said reference set;
means for determining a second utility threshold based on said utility function, wherein said second utility threshold (θ_zero) is the highest utility below said first utility threshold that has a non-positive utility over said reference set;
means for interpolating between said first utility threshold and said second utility threshold to determine an updated score threshold; and
means for filtering incoming documents based on said updated score threshold.
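The first three elements of the claim (scoring a reference set and assigning a score threshold that corresponds to the expected delivery ratio) can be illustrated with a minimal sketch. The function name, the rounding rule, and the sample scores are assumptions made for illustration only; the claim does not specify how the threshold is read off the reference scores.

```python
def delivery_ratio_threshold(reference_scores, delivery_ratio):
    """Pick the score threshold that would deliver roughly `delivery_ratio`
    of the reference documents (hypothetical helper, not from the patent)."""
    if not reference_scores:
        raise ValueError("reference set is empty")
    if not 0 < delivery_ratio <= 1:
        raise ValueError("delivery_ratio must be in (0, 1]")
    # Rank reference documents from highest to lowest profile score.
    ranked = sorted(reference_scores, reverse=True)
    # Number of reference documents that should pass the filter (at least one).
    # Rounding to the nearest integer is an assumption of this sketch.
    k = max(1, round(delivery_ratio * len(ranked)))
    # The score of the k-th ranked document becomes the assigned threshold.
    return ranked[k - 1]


# Made-up profile scores for ten reference documents.
scores = [0.91, 0.15, 0.62, 0.40, 0.33, 0.78, 0.05, 0.27, 0.55, 0.10]
threshold = delivery_ratio_threshold(scores, 0.2)
accepted = [s for s in scores if s >= threshold]
print(threshold, accepted)
```

With a delivery ratio of 0.2 over ten reference documents, the sketch keeps the two highest-scoring documents and uses the lower of their two scores as the threshold.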
Abstract
A novel approach for filtering documents uses a delivery ratio threshold setting technique to set an initial profile score threshold and beta-gamma regulation to update the threshold dynamically. A group of documents is scored against a user profile. The score for each document indicates the relevance of that document to the user profile and can be compared with a profile score threshold to decide whether the document should be accepted or rejected. According to one aspect of the invention, the initial threshold is set to a score threshold that approximates an expected ratio of acceptable documents, calibrated with respect to a set of reference documents. According to another aspect of the invention, the score threshold can be updated based on the accumulated example documents, the user's relevance judgments, and the user's utility function. The accumulated example documents are first scored against a profile to obtain a ranked list of scored documents. Each position in the ranked list corresponds to a candidate score threshold as well as a utility value computed from the relevance status of the example documents. From these candidate threshold points, an optimal utility threshold and a zero utility threshold are determined. A new utility threshold is then calculated by interpolating between estimates of the optimal utility threshold and the zero utility threshold. This new utility threshold is used for subsequent information retrieval and filtering.
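The threshold update described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the linear utility (a `credit` for each relevant document and a `penalty` for each non-relevant one), the fallback to the lowest score when no non-positive utility is found, and the interpolation weight `alpha` are all hypothetical. The beta-gamma formula that sets the weight is not reproduced in this excerpt, so `alpha` is taken as a plain input.

```python
def utility_thresholds(examples, credit=3.0, penalty=2.0):
    """examples: list of (score, is_relevant) pairs.
    Returns (theta_opt, theta_zero) as score thresholds.
    `credit` and `penalty` define an assumed linear utility."""
    if not examples:
        raise ValueError("no example documents")
    # Rank the accumulated example documents by profile score, best first.
    ranked = sorted(examples, key=lambda e: e[0], reverse=True)
    # Utility of accepting every document down to each rank position.
    utilities = []
    running = 0.0
    for _score, relevant in ranked:
        running += credit if relevant else -penalty
        utilities.append(running)
    # Optimal utility threshold: score at the position of highest utility.
    best = max(range(len(ranked)), key=lambda i: utilities[i])
    theta_opt = ranked[best][0]
    # Zero utility threshold: first position below the optimum whose
    # utility is non-positive. Falling back to the lowest score when no
    # such position exists is an assumption of this sketch.
    theta_zero = ranked[-1][0]
    for i in range(best + 1, len(ranked)):
        if utilities[i] <= 0:
            theta_zero = ranked[i][0]
            break
    return theta_opt, theta_zero


def interpolate_threshold(theta_opt, theta_zero, alpha):
    """Linear interpolation: alpha = 0 gives theta_opt, alpha = 1 gives
    theta_zero (the direction of the weight is an assumed convention)."""
    return alpha * theta_zero + (1 - alpha) * theta_opt


# Made-up example documents: (profile score, user judged it relevant).
examples = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.5, False),
            (0.4, False), (0.3, False), (0.2, False), (0.1, False)]
theta_opt, theta_zero = utility_thresholds(examples)
new_threshold = interpolate_threshold(theta_opt, theta_zero, alpha=0.5)
print(theta_opt, theta_zero, new_threshold)
```

In this example the cumulative utility peaks at the document scored 0.6 and first becomes non-positive at the document scored 0.2, so the updated threshold lands between those two scores.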
3 Claims
3. The system, as in claim 2, wherein said means for interpolating calculates a according to the following formula:
Specification