Reclassification of Training Data to Improve Classifier Accuracy
First Claim
1. A method of creating a statistical classification model for use with a natural language understanding system, the method comprising:
processing training data using an existing statistical classification model;
selecting sentences of the training data correctly classified into a selected class of the existing statistical classification model;
assigning each selected sentence of the training data to a fringe group or a core group according to confidence score;
updating the training data by associating the fringe group with a fringe subclass of the selected class and the core group with a core subclass of the selected class;
building a new statistical classification model from the updated training data; and
outputting the new statistical classification model.
Abstract
A method of creating a statistical classification model for a classifier within a natural language understanding system can include processing training data using an existing statistical classification model. Sentences of the training data correctly classified into a selected class of the statistical classification model can be selected. The selected sentences of the training data can be assigned to a fringe group or a core group according to confidence score. The training data can be updated by associating the fringe group with a fringe subclass of the selected class and the core group with a core subclass of the selected class. A new statistical classification model can be built from the updated training data. The new statistical classification model can be output.
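The relabeling steps described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the existing model is represented by a hypothetical callable that returns a label and a confidence score, the core/fringe split uses a single threshold (the text says only "according to confidence score"), the subclass names are formed with ".core" and ".fringe" suffixes, and misclassified sentences are assumed to keep their original labels. The final step, building the new model from the updated data, is left to whatever trainer is in use and is not shown.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]                        # (sentence, class label)
Classifier = Callable[[str], Tuple[str, float]]  # sentence -> (label, confidence)


def relabel_core_and_fringe(
    training_data: List[Example],
    classify: Classifier,
    selected_class: str,
    threshold: float,
) -> List[Example]:
    """Return training data with the selected class split into two subclasses."""
    updated = []
    for sentence, label in training_data:
        predicted, confidence = classify(sentence)
        # Only sentences that the existing model classifies correctly
        # into the selected class are reassigned.
        if label == selected_class and predicted == label:
            # Assumption: one threshold separates core from fringe.
            suffix = "core" if confidence >= threshold else "fringe"
            updated.append((sentence, f"{label}.{suffix}"))
        else:
            # Assumption: all other sentences keep their original label.
            updated.append((sentence, label))
    return updated


def toy_classifier(sentence: str) -> Tuple[str, float]:
    """Hypothetical stand-in for the existing statistical classification model."""
    words = sentence.lower().split()
    hits = sum(w in {"balance", "account"} for w in words)
    if hits == 0:
        return ("OTHER", 0.6)
    return ("BALANCE", min(1.0, 0.5 + 0.25 * hits))


training_data = [
    ("what is my account balance", "BALANCE"),
    ("how much is my balance", "BALANCE"),
    ("how much money do I have", "BALANCE"),
    ("talk to an agent", "OTHER"),
]

updated = relabel_core_and_fringe(
    training_data, toy_classifier, "BALANCE", threshold=0.9
)
for sentence, label in updated:
    print(f"{label}\t{sentence}")
```

In this toy run the first sentence lands in the core subclass, the second in the fringe subclass, and the third (which the stand-in model misclassifies) keeps its original label. A new model would then be trained on `updated` in place of the original training data.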
57 Citations
20 Claims
1. A method of creating a statistical classification model for use with a natural language understanding system, the method comprising:
processing training data using an existing statistical classification model;
selecting sentences of the training data correctly classified into a selected class of the existing statistical classification model;
assigning each selected sentence of the training data to a fringe group or a core group according to confidence score;
updating the training data by associating the fringe group with a fringe subclass of the selected class and the core group with a core subclass of the selected class;
building a new statistical classification model from the updated training data; and
outputting the new statistical classification model.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method of creating a statistical classification model for use with a natural language understanding system, the method comprising:
processing training data using an existing statistical classification model;
receiving a user input specifying at least one parameter for assigning sentences of the training data correctly classified into a selected class to a fringe group or a core group;
updating the training data by associating each group with a different subclass;
building a new statistical classification model from the updated training data; and
outputting the new statistical classification model.
Dependent claims: 12, 13
14. A computer program product comprising:
a computer-usable medium comprising computer-usable program code that creates a statistical classification model for a classifier within a natural language understanding system, the computer-usable medium comprising;
computer-usable program code that processes training data using an existing statistical classification model;
computer-usable program code that selects sentences of the training data correctly classified into a selected class of the existing statistical classification model;
computer-usable program code that assigns each selected sentence of the training data to a fringe group or a core group according to confidence score;
computer-usable program code that updates the training data by associating the fringe group with a fringe subclass of the selected class and the core group with a core subclass of the selected class;
computer-usable program code that builds a new statistical classification model from the updated training data; and
computer-usable program code that outputs the new statistical classification model.
Dependent claims: 15, 16, 17, 18, 19, 20
Specification