Method and apparatus for training an automated speech recognition-based system
First Claim
1. A method for building a training set of token-response pairings for an automated speech-recognition-based system, comprising the steps of:
(a) for each response in a plurality of possible responses, calculating, based on an expected phrase coverage for said each response and a probability of occurrence for said each response, a benefit that would be achieved by adding to the training set a token-response pairing for said each response;
(b) identifying a maximum benefit response, said maximum benefit response being equal to the response from the plurality of possible responses having the maximum benefit;
(c) adding to the training set, a token-response pairing containing the maximum benefit response;
(d) incrementing a current phrase coverage for the training set by an amount equal to the product of the expected phrase coverage for the number of token-response pairings in the training set that contain the maximum benefit response, and the probability of occurrence of the maximum benefit response; and
(e) repeating steps (a) to (d) until the current phrase coverage is greater than or equal to a target phrase coverage.
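Steps (a) to (e) describe a greedy selection loop, which can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the claim does not say how the expected phrase coverage is modeled, so `marginal_coverage` is a hypothetical geometric model in which the n-th token for a response covers a fraction q(1-q)^(n-1) of the ways that response can be phrased. The function names, the `max_tokens` guard, and the sample probabilities are likewise illustrative.

```python
def marginal_coverage(n, q=0.5):
    """Hypothetical model (not from the patent): extra fraction of a
    response's phrasings expected to be covered by its n-th token."""
    return q * (1.0 - q) ** (n - 1)


def build_training_set(probabilities, target_coverage, max_tokens=10_000):
    """Greedy loop following steps (a) to (e) of claim 1.

    probabilities maps each response to its probability of occurrence.
    Returns (order of selected responses, tokens per response, coverage).
    """
    counts = {response: 0 for response in probabilities}
    selections = []
    coverage = 0.0
    # (e) repeat until the target phrase coverage is reached; max_tokens is
    # an assumed safety guard in case the target cannot be reached.
    while coverage < target_coverage and len(selections) < max_tokens:
        # (a) benefit of adding one more pairing for each response
        benefits = {
            response: p * marginal_coverage(counts[response] + 1)
            for response, p in probabilities.items()
        }
        # (b) response with the maximum benefit
        best = max(benefits, key=benefits.get)
        # (c) add a token-response pairing for that response
        counts[best] += 1
        selections.append(best)
        # (d) increment coverage by (expected coverage for the current
        # number of pairings) x (probability of occurrence)
        coverage += probabilities[best] * marginal_coverage(counts[best])
    return selections, counts, coverage


# Illustrative probabilities, not taken from the patent.
selections, counts, coverage = build_training_set(
    {"hours": 0.5, "directions": 0.3, "billing": 0.2}, target_coverage=0.6
)
print(selections)  # ['hours', 'directions', 'hours', 'billing']
print(round(coverage, 3))  # 0.625
```

With these assumed inputs the loop stops after four pairings. Because each additional token for the same response yields a smaller gain, the selection alternates between frequent responses and not-yet-covered ones instead of spending every token on the single most frequent response.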
Abstract
A method and apparatus for building a training set for an automated speech recognition-based system, which determines the statistically optimal number of frequently requested responses to automate in order to achieve a desired automation rate. The invention may be used to select the appropriate tokens and responses to train the system and to achieve a desired “phrase coverage” for all of the many different ways human beings may phrase a request that calls for one of a plurality of frequently-requested responses. The invention also determines the statistically optimal number of tokens (spoken requests) required to train a speech recognition-based system to achieve the desired phrase coverage and optimal allocation of tokens over the set of responses that are to be automated.
40 Claims
1. A method for building a training set of token-response pairings for an automated speech-recognition-based system, comprising the steps of:
(a) for each response in a plurality of possible responses, calculating, based on an expected phrase coverage for said each response and a probability of occurrence for said each response, a benefit that would be achieved by adding to the training set a token-response pairing for said each response;
(b) identifying a maximum benefit response, said maximum benefit response being equal to the response from the plurality of possible responses having the maximum benefit;
(c) adding to the training set, a token-response pairing containing the maximum benefit response;
(d) incrementing a current phrase coverage for the training set by an amount equal to the product of the expected phrase coverage for the number of token-response pairings in the training set that contain the maximum benefit response, and the probability of occurrence of the maximum benefit response; and
(e) repeating steps (a) to (d) until the current phrase coverage is greater than or equal to a target phrase coverage.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
19. A system for building a training set of token-response pairings for an automated speech-recognition-based system, comprising:
means for calculating, for each response in a plurality of possible responses, a benefit that would be achieved by adding to the training set a token-response pairing for said each response;
means for identifying a maximum benefit response, said maximum benefit response being equal to the response from the plurality of possible responses having the maximum benefit;
means for adding to the training set, a token-response pairing containing the maximum benefit response; and
means for incrementing a current phrase coverage for the training set by an amount equal to the product of the expected phrase coverage for the number of token-response pairings in the training set that contain the maximum benefit response, and the probability of occurrence of the maximum benefit response.
Dependent claims: 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34
35. A system for generating a training set of token-response pairings for an automated speech-recognition-based system, comprising:
a phrase coverage processor module configured to calculate a phrase coverage associated with a response out of a plurality of possible responses;
a probability of occurrence module configured to compute, responsive to a prior collection of token-response pairings, a statistical probability that said response will occur in a predetermined number of responses;
a benefit processor configured to determine, responsive to the phrase coverage processor module and the probability of occurrence module, a benefit that would be achieved by adding a token-response pairing to the training set, and a maximum benefit response, said maximum benefit response being equal to the response from the plurality of responses having maximum benefit; and
a training set generation module configured to add to the training set a token response pairing from the supply set containing the maximum benefit response.
Dependent claims: 36, 37, 38, 39, 40
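The probability of occurrence module of claim 35 derives its estimate from a prior collection of token-response pairings. One simple way to do that, shown here only as an assumed sketch, is to use the relative frequency of each response in the prior collection. The claim does not specify the estimator, and the function name and sample data below are hypothetical.

```python
from collections import Counter


def occurrence_probabilities(prior_pairings):
    """Estimate each response's probability of occurrence as its relative
    frequency in a prior collection of (token, response) pairs.
    Relative frequency is an assumed estimator, not stated in the claim."""
    counts = Counter(response for _token, response in prior_pairings)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {response: n / total for response, n in counts.items()}


# Hypothetical prior collection: (spoken request, response given).
prior = [
    ("when do you open", "hours"),
    ("what time do you close", "hours"),
    ("how do I get there", "directions"),
    ("question about my bill", "billing"),
]
print(occurrence_probabilities(prior))
# {'hours': 0.5, 'directions': 0.25, 'billing': 0.25}
```

The resulting mapping has the shape a benefit processor would need: one probability per candidate response, summing to 1 over the responses seen in the prior collection.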
Specification