Reclassification of training data to improve classifier accuracy
Abstract
A method of creating a statistical classification model for a classifier within a natural language understanding system can include processing training data using an existing statistical classification model. Sentences of the training data correctly classified into a selected class of the statistical classification model can be selected. The selected sentences of the training data can be assigned to a fringe group or a core group according to confidence score. The training data can be updated by associating the fringe group with a fringe subclass of the selected class and the core group with a core subclass of the selected class. A new statistical classification model can be built from the updated training data. The new statistical classification model can be output.
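The steps in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the tiny Naive Bayes classifier, the function names (`train_naive_bayes`, `classify`, `relabel_core_fringe`), the `.core` / `.fringe` label suffixes, the fixed confidence threshold, and the choice to leave misclassified sentences under their original label are all hypothetical. The source says only that sentences are assigned "according to confidence score". Outputting (serializing) the new model is left out.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Build a tiny multinomial Naive Bayes model from (sentence, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for sentence, label in examples:
        tokens = sentence.lower().split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return {"word_counts": word_counts, "label_counts": label_counts, "vocab": vocab}


def classify(model, sentence):
    """Return (best_label, confidence); confidence is the normalized posterior."""
    tokens = sentence.lower().split()
    total = sum(model["label_counts"].values())
    vocab_size = len(model["vocab"])
    log_scores = {}
    for label, n in model["label_counts"].items():
        counts = model["word_counts"][label]
        denom = sum(counts.values()) + vocab_size
        score = math.log(n / total)
        for tok in tokens:
            score += math.log((counts[tok] + 1) / denom)  # Laplace smoothing
        log_scores[label] = score
    top = max(log_scores.values())
    exp_scores = {k: math.exp(s - top) for k, s in log_scores.items()}
    best = max(exp_scores, key=exp_scores.get)
    return best, exp_scores[best] / sum(exp_scores.values())


def relabel_core_fringe(model, training_data, selected_class, threshold):
    """Relabel correctly classified sentences of one class as core or fringe."""
    updated = []
    for sentence, label in training_data:
        predicted, confidence = classify(model, sentence)
        if label == selected_class and predicted == label:
            # Assumption: a single threshold separates core from fringe.
            suffix = "core" if confidence >= threshold else "fringe"
            updated.append((sentence, f"{label}.{suffix}"))
        else:
            # Assumption: other sentences keep their original label.
            updated.append((sentence, label))
    return updated


# Hypothetical toy training data for two classes.
training_data = [
    ("pay my bill", "billing"),
    ("pay the bill now", "billing"),
    ("question about my bill and my account", "billing"),
    ("reset my password", "account"),
    ("change my account password", "account"),
]

existing_model = train_naive_bayes(training_data)
updated_data = relabel_core_fringe(existing_model, training_data, "billing", threshold=0.9)
new_model = train_naive_bayes(updated_data)  # model built from the updated training data
```

In this sketch the new model simply sees the core and fringe subclasses as separate labels. How the threshold is chosen is not specified in the text above and would need to be tuned for a real system.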
17 Claims
1. A method of creating a statistical classification model for use with a natural language understanding system, the method comprising:
via a processor, processing training data using an existing statistical classification model;
via the processor, selecting sentences of the training data correctly classified into a selected class of the existing statistical classification model;
via the processor, assigning each selected sentence of the training data to a fringe group or a core group according to confidence score;
via the processor, updating the training data by associating the fringe group with a fringe subclass of the selected class and the core group with a core subclass of the selected class;
via the processor, building a new statistical classification model from the updated training data; and
via the processor, outputting the new statistical classification model.
11. A computer-readable storage comprising computer-usable program code that creates a statistical classification model for a classifier within a natural language understanding system, the computer-readable storage comprising:
computer-usable program code that processes training data using an existing statistical classification model;
computer-usable program code that selects sentences of the training data correctly classified into a selected class of the existing statistical classification model;
computer-usable program code that assigns each selected sentence of the training data to a fringe group or a core group according to confidence score;
computer-usable program code that updates the training data by associating the fringe group with a fringe subclass of the selected class and the core group with a core subclass of the selected class;
computer-usable program code that builds a new statistical classification model from the updated training data; and
computer-usable program code that outputs the new statistical classification model,
wherein the computer-readable storage is not a transitory, propagating signal per se.
Specification