Method of refining statistical pattern recognition models and statistical pattern recognizers
Abstract
A device (800) performs statistical pattern recognition using model parameters that are refined by optimizing an objective function. The objective function includes a term for each of many items of training data for which recognition errors occur. Each term depends on the relative magnitude of a first score for the recognition result for an item of training data and a second score calculated by evaluating the statistical pattern recognition model identified by the transcribed identity of the training data item with feature vectors extracted from that item. The objective function does not include terms for items of training data for which there is a gross discrepancy between the transcribed identity and the recognized identity. Gross discrepancies can be detected by probability score or pattern identity comparisons. Terms of the objective function are weighted based on the type of recognition error, and weights can be increased for high-priority patterns.
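The abstract names two ways of detecting a gross discrepancy but does not give a rule for either. A minimal sketch of both follows, assuming log-domain scores. The threshold `max_log_gap`, the `plausible_confusions` set, and both function names are hypothetical and are not taken from the patent.

```python
def gross_by_score(log_p_recognized, log_p_transcribed, max_log_gap=20.0):
    """Score comparison (assumed rule): the recognized model beats the
    transcribed model by more than max_log_gap in the log domain."""
    return (log_p_recognized - log_p_transcribed) > max_log_gap


def gross_by_identity(transcribed, recognized, plausible_confusions):
    """Identity comparison (assumed rule): the pair is gross unless it is
    listed as a plausible confusion. Intended for misrecognized items only."""
    return frozenset((transcribed, recognized)) not in plausible_confusions
```

Under these assumptions, an item flagged by either check would be left out of the objective function, on the reasoning that such an item is more likely mislabeled than informative.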
22 Claims
1. A method of operating a pattern recognition system for refining a plurality of statistical pattern recognition models that are used for statistical pattern recognition, the method including:
reading in initial values of a set of parameters for said plurality of statistical pattern recognition models;
reading a training data set that includes a plurality of training data items including training data items for each of said plurality of statistical pattern recognition models, along with a transcribed identity for each of said plurality of training data items;
obtaining feature vectors from each of the plurality of training data items;
using a processor to perform an optimization routine for optimizing an objective function in order to find refined values of said set of parameters for said plurality of said statistical pattern recognition models corresponding to an extremum of said objective function, wherein said objective function is dynamically defined for each of a succession of iterations of said optimization routine to include a subexpression for each kth item of training data in, at least, a subset of said plurality of training data items that is defined by, at least, a first criterion that requires that said transcribed identity does not match a recognized identity for said kth item of training data, and a second criterion that requires that there is not a gross discrepancy between said transcribed identity and said recognized identity, wherein each subexpression depends on a relative magnitude of a first probability score compared to a second probability score, wherein said first probability score is based on a value of a first statistical pattern recognition model corresponding to said recognized identity of said kth item of training data evaluated with said one or more feature vectors obtained from said kth item of training data and said second probability score is based on a value of a second statistical pattern recognition model corresponding to said transcribed identity of said kth item of training data evaluated with said one or more feature vectors obtained from said kth item of training data; and
using the refined statistical pattern recognition models to recognize a pattern.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13.
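The dynamically defined objective function of claim 1 can be illustrated with a short sketch. The claim requires only that each subexpression depend on the relative magnitude of the two scores, so the sigmoid of the log-score difference used here is an assumption, as are the names `sigmoid_term`, `objective`, `log_score`, and `is_gross`.

```python
import math


def sigmoid_term(log_p_recognized, log_p_transcribed, slope=1.0):
    """Assumed subexpression: a sigmoid of the log-score difference."""
    d = log_p_recognized - log_p_transcribed
    return 1.0 / (1.0 + math.exp(-slope * d))


def objective(items, model_ids, log_score, is_gross, slope=1.0):
    """Sum one term per training item that is misrecognized (first criterion)
    and not grossly discrepant (second criterion).

    items     -- iterable of (features, transcribed_id)
    log_score -- hypothetical callable (model_id, features) -> log score
    is_gross  -- hypothetical callable (log_p_rec, log_p_trans) -> bool
    """
    total = 0.0
    for features, transcribed in items:
        # Recognition: the model with the highest score under current parameters.
        recognized = max(model_ids, key=lambda m: log_score(m, features))
        if recognized == transcribed:
            continue  # first criterion not met: no recognition error
        lr = log_score(recognized, features)
        lt = log_score(transcribed, features)
        if is_gross(lr, lt):
            continue  # second criterion not met: gross discrepancy
        total += sigmoid_term(lr, lt, slope)
    return total
```

Because recognition is rerun with the current parameter values on every call, the set of contributing items can change from one iteration of the optimizer to the next, which is one reading of "dynamically defined".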
14. A method of operating a pattern recognition system for refining a plurality of statistical pattern recognition models that are used for statistical pattern recognition, the method including:
reading in initial values of a set of parameters for said plurality of statistical pattern recognition models;
reading a training data set that includes a plurality of training data items including training data items for each of said plurality of statistical pattern recognition models, along with a transcribed identity for each of said plurality of training data items;
obtaining feature vectors from each of the plurality of training data items;
using a processor to perform an optimization routine for optimizing an objective function in order to find refined values of said set of parameters for said plurality of said statistical pattern recognition models corresponding to an extremum of said objective function, wherein said objective function is dynamically defined for each of a succession of iterations of said optimization routine to include a subexpression for each kth item of training data in, at least, a subset of said plurality of training data items that is defined by, at least, a first criterion that requires that said transcribed identity does not match a recognized identity for said kth item of training data, wherein each subexpression depends on a relative magnitude of a first probability score compared to a second probability score, wherein said first probability score is based on a value of a first statistical pattern recognition model corresponding to said recognized identity of said kth item of training data evaluated with said one or more feature vectors obtained from said kth item of training data and said second probability score is based on a value of a second statistical pattern recognition model corresponding to said transcribed identity of said kth item of training data evaluated with said one or more feature vectors obtained from said kth item of training data, wherein a value of said subexpression of said objective function for each kth item of training data is increased by applying an emphasis function that is determined based, at least in part, on said transcribed identity; and
using the refined statistical pattern recognition models to recognize a pattern.
Dependent claims: 15, 16, 17, 18, 19.
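The emphasis function of claim 14 is stated only as depending, at least in part, on the transcribed identity. One simple possibility, sketched here as an assumption, is a multiplicative weight looked up by transcribed identity. The names `emphasis`, `emphasized_term`, and `priority_weights` are hypothetical.

```python
def emphasis(transcribed, priority_weights, default=1.0):
    """Assumed emphasis function: a per-identity multiplier, with a neutral
    default of 1.0 for identities that have no assigned priority."""
    return priority_weights.get(transcribed, default)


def emphasized_term(base_term, transcribed, priority_weights):
    """Scale an item's subexpression by the emphasis for its transcribed
    identity. Weights above 1.0 increase the term's contribution."""
    return emphasis(transcribed, priority_weights) * base_term
```

With a weight greater than 1.0 for a given identity, errors on training items transcribed as that identity contribute more to the objective, so the optimizer works harder to correct them.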
20. A method of performing statistical pattern recognition comprising a method of operating a pattern recognition system for refining a plurality of statistical pattern recognition models that are used for statistical pattern recognition, the method including:
reading in initial values of a set of parameters for said plurality of statistical pattern recognition models;
reading a set of training data including a plurality of items of training data each of which is identified by a transcribed identity;
obtaining one or more feature vectors from each kth item of training data in said set of training data;
using a processor while one or more stopping criteria is not met to;
for each kth item of training data in said set of training data;
performing pattern recognition on said kth item of training data using a current set of values of said set of parameters for said plurality of statistical pattern recognition models in order to determine a recognized identity for said kth item of training data by finding a first statistical pattern recognition model, corresponding to said recognized identity, among said plurality of statistical pattern recognition models that yields a highest score when evaluated with said one or more feature vectors obtained from said kth item of training data;
if a training data qualification criteria, that requires that said recognized identity does not match said transcribed identity, and that there is not a gross discrepancy between said transcribed identity and said recognized identity, is met for said kth item of training data;
evaluating a gradient of a function that depends on a relative magnitude of said first statistical pattern recognition model compared to a second statistical pattern recognition model corresponding to said transcribed identity of said kth item of training data with respect to model parameters for said first statistical pattern recognition model and of said second statistical pattern recognition model at said one or more feature vectors obtained from said kth item of training data to obtain a plurality of gradient component summands;
taking a plurality of component-by-component sums of said plurality of gradient component summands by summing over a subset of said set of training data for which said training data qualification criteria is met;
using said plurality of component-by-component sums of said plurality of gradient component summands in a gradient-based optimization method to update said current set of values of said set of parameters for said plurality of statistical pattern recognition models;
when said one or more stopping criteria is met, storing said current set of values of said set of parameters for said plurality of statistical pattern recognition models; and
using the refined statistical pattern recognition models to recognize a pattern.
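The loop recited in claim 20 can be sketched on a toy problem. Everything concrete here is an assumption made for illustration: each model is a one-dimensional unit-variance Gaussian whose only parameter is its mean, the per-item function is a sigmoid of the log-score difference, the gross-discrepancy test is a log-gap threshold, and the optimizer is plain gradient descent. The names `log_score` and `refine` are hypothetical.

```python
import math


def log_score(mean, x):
    """Log-likelihood of x under a unit-variance Gaussian, up to a constant."""
    return -0.5 * (x - mean) ** 2


def refine(means, items, max_gap=5.0, step=0.5, max_iters=100):
    """Refine per-model means from (feature, transcribed_id) items."""
    means = dict(means)  # work on a copy of the initial parameter values
    for _ in range(max_iters):  # stopping criterion 1: iteration cap
        grad = {m: 0.0 for m in means}
        qualified = 0
        for x, transcribed in items:
            # Recognition with the current parameter values.
            recognized = max(means, key=lambda m: log_score(means[m], x))
            if recognized == transcribed:
                continue  # no recognition error
            d = log_score(means[recognized], x) - log_score(means[transcribed], x)
            if d > max_gap:
                continue  # gross discrepancy: item does not qualify
            s = 1.0 / (1.0 + math.exp(-d))
            g = s * (1.0 - s)  # derivative of the sigmoid with respect to d
            # Gradient component summands, accumulated per parameter.
            grad[recognized] += g * (x - means[recognized])
            grad[transcribed] -= g * (x - means[transcribed])
            qualified += 1
        if qualified == 0:
            break  # stopping criterion 2: no qualifying errors remain
        for m in means:
            means[m] -= step * grad[m]  # gradient descent update
    return means
```

Each update lowers the recognized model's score and raises the transcribed model's score on the qualifying items, so a near-miss error can be corrected while a grossly discrepant item is ignored throughout.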
21. An apparatus for performing statistical pattern recognition, the apparatus comprising:
a data input for inputting data including one or more unknown patterns;
a processor coupled to said data input for receiving said data;
a memory including programming instructions coupled to said processor, wherein said processor is programmed by said programming instructions to perform statistical pattern recognition on said data using a plurality of statistical pattern recognition models that include model parameters that have values resulting from a process comprising;
reading in initial values of a set of parameters for said plurality of statistical pattern recognition models;
reading a training data set that includes a plurality of training data items including training data items for each of said plurality of statistical pattern recognition models, along with a transcribed identity for each of said plurality of training data items;
obtaining feature vectors from each of the plurality of training data items;
invoking an optimization routine for optimizing an objective function in order to find refined values of said set of parameters for said plurality of said statistical pattern recognition models corresponding to an extremum of said objective function, wherein said objective function is dynamically defined for each of a succession of iterations of said optimization routine to include a subexpression for each kth item of training data in, at least, a subset of said plurality of training data items that is defined by, at least, a first criteria that requires that said transcribed identity does not match a recognized identity for said kth item of training data, and a second criteria that requires that there is not a gross discrepancy between said transcribed identity and said recognized identity, wherein each subexpression depends on a relative magnitude of a first probability score compared to a second probability score, wherein said first probability score is based on a value of a first statistical pattern recognition model corresponding to said recognized identity of said kth item of training data evaluated with said one or more feature vectors obtained from said kth item of training data and said second probability score is based on a value of a second statistical pattern recognition model corresponding to said transcribed identity of said kth item of training data evaluated with said one or more feature vectors obtained from said kth item of training data; and
using the refined statistical pattern recognition models to recognize a pattern.
22. A computer readable medium storing programming instructions for operating a pattern recognition system for refining a plurality of statistical pattern recognition models that are used for statistical pattern recognition including programming instructions for:
reading in initial values of a set of parameters for said plurality of statistical pattern recognition models;
reading a training data set that includes a plurality of training data items including training data items for each of said plurality of statistical pattern recognition models, along with a transcribed identity for each of said plurality of training data items;
obtaining feature vectors from each of the plurality of training data items;
using a processor to perform an optimization routine for optimizing an objective function in order to find a plurality of values of parameters for said plurality of said statistical pattern recognition models corresponding to an extremum of said objective function, wherein said objective function is dynamically defined for each of a succession of iterations of said optimization routine to include a subexpression for each kth item of training data in, at least, a subset of said plurality of training data items that is defined by, at least, a first criteria that requires that said transcribed identity does not match a recognized identity for said kth item of training data, wherein each subexpression depends on a relative magnitude of a first probability score compared to a second probability score, wherein said first probability score is based on a value of a first statistical pattern recognition model corresponding to said recognized identity of said kth item of training data evaluated with one or more feature vectors obtained from said kth item of training data and said second probability score is based on a value of a second statistical pattern recognition model corresponding to said transcribed identity of said kth item of training data evaluated with said one or more feature vectors obtained from said kth item of training data, wherein a value of said subexpression of said objective function for each kth item of training data is increased by applying an emphasis function that is determined based, at least in part, on said transcribed identity; and
using the refined statistical pattern recognition models to recognize a pattern.
Specification