Fast feature selection method and system for maximum entropy modeling
Abstract
A method to select features for maximum entropy modeling in which the gains for all candidate features are determined during an initialization stage and gains for only top-ranked features are determined during each feature selection stage. The candidate features are ranked in an ordered list based on the determined gains, a top-ranked feature in the ordered list with a highest gain is selected, and the model is adjusted using the selected top-ranked feature.
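This section refers to the "gain of log likelihood" of a feature without defining it. One common formulation from the maximum entropy literature is sketched below as an assumption, not as the patent's own definition. The symbols are illustrative: $p_S$ is the model built from the already selected feature set $S$, $f$ is a candidate feature, $\alpha$ is its weight, and $\tilde{p}$ is the empirical distribution of the training data.

```latex
\begin{align*}
p_{S \cup f}^{\alpha}(y \mid x) &= \frac{p_S(y \mid x)\, e^{\alpha f(x, y)}}{Z_{\alpha}(x)},
\qquad
Z_{\alpha}(x) = \sum_{y} p_S(y \mid x)\, e^{\alpha f(x, y)} \\
L(p) &= \sum_{x, y} \tilde{p}(x, y) \log p(y \mid x) \\
G_S(f) &= \max_{\alpha} \left[ L\!\left(p_{S \cup f}^{\alpha}\right) - L(p_S) \right]
\end{align*}
```

Under this reading, the initialization stage evaluates $G_S(f)$ for every candidate with $S$ empty, and later stages re-evaluate it only for the top-ranked candidates.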
17 Claims
1. A computer implemented method to select features for maximum entropy modeling for language and statistical processing, the method comprising:
(a) determining gains of log likelihood for candidate features during an initialization stage;
(b) ranking the candidate features in an ordered list based on the determined gains;
(c) selecting a top-ranked feature in the ordered list with a highest gain;
(d) adjusting a maximum entropy model using the selected top-ranked feature;
(e) determining gains of log likelihood for only a first predefined number of top-ranked features;
(f) repeating steps (b) through (e) until a number of selected features equals a second predefined number;
(g) storing the second predefined number of selected top-ranked features and the adjusted model in a file.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
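Steps (a) through (g) of claim 1 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: `select_features`, `save_selection`, `gain_fn`, `adjust_fn`, `top_n`, and `max_selected` are hypothetical names, and the toy gain function only imitates a log-likelihood gain that shrinks when a related feature is already in the model.

```python
import json


def select_features(candidates, gain_fn, adjust_fn, model, top_n, max_selected):
    """Greedy selection that re-scores only the top_n candidates per round."""
    # (a) Initialization: score every candidate once under the starting model.
    ranked = [(gain_fn(model, f), f) for f in candidates]
    selected = []
    # (f) Repeat until max_selected features are chosen (or none remain).
    while ranked and len(selected) < max_selected:
        # (b) Rank by gain, highest first; the stable sort keeps ties in order.
        ranked.sort(key=lambda pair: pair[0], reverse=True)
        # (c) Take the top-ranked feature.
        _, best = ranked.pop(0)
        selected.append(best)
        # (d) Adjust the model with the selected feature.
        model = adjust_fn(model, best)
        # (e) Refresh gains for only the first top_n remaining candidates.
        for i in range(min(top_n, len(ranked))):
            feature = ranked[i][1]
            ranked[i] = (gain_fn(model, feature), feature)
    return selected, model


def save_selection(path, selected, model):
    # (g) Store the selected features and the adjusted model in a file.
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"features": selected, "model": model}, handle)


# Toy stand-ins (assumptions for illustration only, not a real maxent model).
BASE = {"a": 10.0, "b": 9.0, "c": 6.0, "d": 1.0}
GROUP = {"a": "x", "b": "x", "c": "y", "d": "y"}


def toy_gain(model, feature):
    # The gain halves for each already-selected feature in the same group.
    overlap = sum(1 for s in model if GROUP[s] == GROUP[feature])
    return BASE[feature] * 0.5 ** overlap


def toy_adjust(model, feature):
    # The "model" here is just the list of selected features.
    return model + [feature]


selected, model = select_features(
    list(BASE), toy_gain, toy_adjust, [], top_n=2, max_selected=3
)
print(selected)  # ['a', 'c', 'b']
```

In this toy run, feature `b` starts with the second-highest gain but drops below `c` once `a` is in the model, so re-scoring only the top two candidates is enough to reorder them.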
9. A computer implemented method to select features for maximum entropy modeling for language and statistical processing, the method comprising:
(a) computing gains of log likelihood of candidate features using a uniform distribution;
(b) ordering the candidate features in an ordered list based on the computed gains;
(c) selecting a top-ranked feature with a highest gain in the ordered list;
(d) adjusting a maximum entropy model using the selected top-ranked feature;
(e) removing the top-ranked feature from the ordered list so that a next-ranked feature in the ordered list becomes the top-ranked feature and marking all features as not ranked;
(f) computing a gain of the top-ranked feature using the adjusted model;
(g) comparing the gain of the top-ranked feature with a gain of the next-ranked feature in the ordered list;
(h) if the gain of the top-ranked feature equals or is more than the gain of the next-ranked feature marking it as ranked and selecting the next-ranked feature that is not marked as ranked as the top-ranked feature;
(i) if the gain of the top-ranked feature is less than the gain of the next-ranked feature, repositioning the top-ranked feature in the ordered list so that the next-ranked feature becomes the top-ranked feature;
(j) repeating steps (f) through (i) until number of top-ranked features that are marked ranked equals a first predefined number;
(k) repeating steps (c) through (j) until one of a number of selected features equals a second predefined number and a gain of a last-selected feature falls below a predefined value; and
(l) storing the second predefined number of selected top-ranked features and the adjusted model in a file.
Dependent claims: 10
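The re-ranking loop in steps (e) through (j) of claim 9 is easier to follow in code. The sketch below is one possible reading, under stated assumptions: `lazy_select`, `rerank_top`, `gain_fn`, `adjust_fn`, `top_n`, `max_selected`, and `min_gain` are hypothetical names, the empty starting model stands in for the uniform distribution, the toy gain is not a real log-likelihood gain, and the file-storing step (l) is omitted.

```python
def rerank_top(order, gain_fn, model, top_n):
    """Re-score from the head of `order` until top_n features are marked ranked."""
    ranked = 0      # (e) nothing is marked as ranked yet
    pos = 0         # index of the current top-ranked, not-yet-ranked feature
    fresh = set()   # features already re-scored under the adjusted model
    limit = min(top_n, len(order))
    while ranked < limit:                      # (j)
        gain, feature = order[pos]
        if feature not in fresh:
            gain = gain_fn(model, feature)     # (f) gain under the adjusted model
            order[pos] = (gain, feature)
            fresh.add(feature)
        nxt = pos + 1
        if nxt == len(order) or gain >= order[nxt][0]:
            # (g)-(h) still at least as good as its successor: mark it ranked.
            ranked += 1
            pos += 1
        else:
            # (i) reposition it lower so the next-ranked feature moves up.
            item = order.pop(pos)
            j = pos
            while j < len(order) and order[j][0] > gain:
                j += 1
            order.insert(j, item)


def lazy_select(candidates, gain_fn, adjust_fn, model, top_n, max_selected,
                min_gain=0.0):
    # (a)-(b) Score all candidates under the initial model and order them.
    order = sorted(((gain_fn(model, f), f) for f in candidates),
                   key=lambda pair: pair[0], reverse=True)
    selected = []
    while order and len(selected) < max_selected:   # (k) count-based stop
        gain, best = order.pop(0)                   # (c), (e) select and remove
        selected.append(best)
        model = adjust_fn(model, best)              # (d)
        if gain < min_gain:                         # (k) gain-based stop
            break
        rerank_top(order, gain_fn, model, top_n)
    return selected, model


# Toy stand-ins (assumptions for illustration only, not a real maxent model).
BASE = {"a": 10.0, "b": 9.0, "c": 6.0, "d": 1.0}
GROUP = {"a": "x", "b": "x", "c": "y", "d": "y"}


def toy_gain(model, feature):
    # The gain halves for each already-selected feature in the same group.
    overlap = sum(1 for s in model if GROUP[s] == GROUP[feature])
    return BASE[feature] * 0.5 ** overlap


def toy_adjust(model, feature):
    return model + [feature]


selected, _ = lazy_select(list(BASE), toy_gain, toy_adjust, [],
                          top_n=1, max_selected=3)
print(selected)  # ['a', 'c', 'b']
```

The sketch stops after selecting a feature whose gain is below `min_gain`, which follows the wording of step (k); stopping before that selection would be an equally plausible design. Candidates that sit far down the ordered list keep their stale gains and are never re-scored unless they rise to the top.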
11. A processing system to perform maximum entropy modeling in which one or more candidate features derived from a corpus of data are incorporated into a maximum entropy model that predicts linguistic behavior, the system comprising:
- a computer with at least one processor, a memory storing a program of instructions and a display device;
- a gain computation logic to determine gains of log likelihood for the candidate features during an initialization stage and to determine gains for only a first predefined number of top-ranked features during a feature selection stage;
- a feature ranking logic to rank features based on the determined gains;
- a feature selection logic to select a feature with a highest gain as a top-ranked feature; and
- a model adjustment logic to adjust the maximum entropy model using the selected top-ranked feature;
- wherein when the program is executed on the processor, a second predefined number of features with the highest gains are selected as the top-ranked features and included in the maximum entropy model; and
- the second predefined number of selected top-ranked features and the adjusted model are stored in a file.
Dependent claims: 12, 13, 14, 15, 16
17. A computer storage medium having a set of instructions executable by a processor to perform maximum entropy modeling in which one or more candidate features derived from a corpus of data are incorporated into a maximum entropy model that predicts linguistic behavior comprising instructions for:
(a) ordering candidate features based on gains of log likelihood computed using a uniform distribution to form an ordered list of candidate features;
(b) selecting a top-ranked feature with a largest gain and adjusting the maximum entropy model for a next stage;
(c) removing the top-ranked feature from the ordered list of the candidate features so that a next-ranked feature in the ordered list becomes the top-ranked feature and marking all features as not ranked;
(d) computing a gain of the top-ranked feature using the adjusted model;
(e) comparing the gain of the top-ranked feature with a gain of the next-ranked feature in the ordered list;
(f) if the gain of the top-ranked feature equals or is more than the gain of the next-ranked feature marking it as ranked and selecting the next-ranked feature that is not marked as ranked as the top-ranked feature;
(g) if the gain of the top-ranked feature is less than the gain of the next-ranked feature, repositioning the top-ranked feature in the ordered list so that the next-ranked feature becomes the top-ranked feature;
(h) repeating steps (d) through (g) until number of top-ranked features that are marked ranked equals a first predefined number;
(i) repeating steps (b) through (h) until one of a number of selected features reaches a second predefined number and a gain of a last-selected feature falls below a predefined value; and
(j) storing the second predefined number of selected top-ranked features and the model in a file.
Specification