Testing and tuning of automatic speech recognition systems using synthetic inputs generated from its acoustic models
First Claim
1. A speech recognition testing system comprising:
a speech recognizer configured to provide an output text based upon feature vectors;
a pronunciation tool configured to provide a pronunciation for a provided text having at least one word; and
a vector generator configured to generate a sequence of feature vectors from the provided pronunciation for the text.
Abstract
A system and method for testing and tuning a speech recognition system by providing pronunciations to the speech recognizer. First, a text document is provided to the system and converted into a sequence of phonemes representing the words in the text. The phonemes are then converted to model units, such as Hidden Markov Models. From the models, a probability is obtained for each model or state, and feature vectors are determined. For each model, the feature vector matching the most probable vector for each state is selected. These ideal feature vectors are provided to the speech recognizer and processed. The recognizer's output text is compared with the original text, and the system can be modified based on that output.
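The generation step described in the abstract can be illustrated with a minimal Python sketch. It assumes each model state has a single diagonal Gaussian output density, in which case the most probable vector for a state is simply its mean; the abstract does not specify the density, and a mixture model would need a different selection rule. The lexicon, the parameter values, and every name below (`LEXICON`, `GaussianState`, `toy_acoustic_model`, `synthesize_vectors`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon: word -> phoneme sequence (not taken from the patent).
LEXICON = {
    "hi": ["HH", "AY"],
    "there": ["DH", "EH", "R"],
}


@dataclass(frozen=True)
class GaussianState:
    """One HMM state with an assumed diagonal-covariance Gaussian output density."""
    mean: tuple
    var: tuple


def most_probable_vector(state):
    # A Gaussian density peaks at its mean, so the mean serves as the "ideal" vector.
    return list(state.mean)


def toy_acoustic_model(phonemes, states_per_phoneme=3, dim=2):
    """Build made-up state parameters per phoneme (illustrative values only)."""
    model = {}
    for i, ph in enumerate(sorted(phonemes)):
        model[ph] = [
            GaussianState(
                mean=tuple(float(i * 10 + s + d) for d in range(dim)),
                var=(1.0,) * dim,
            )
            for s in range(states_per_phoneme)
        ]
    return model


def text_to_phonemes(text):
    """Look up each word in the lexicon and concatenate the pronunciations."""
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(LEXICON[word])  # raises KeyError for out-of-lexicon words
    return phonemes


def synthesize_vectors(text, model, frames_per_state=1):
    """Emit the most probable vector of every state, in phoneme order."""
    vectors = []
    for ph in text_to_phonemes(text):
        for state in model[ph]:
            vectors.extend(most_probable_vector(state) for _ in range(frames_per_state))
    return vectors


ALL_PHONEMES = {p for pron in LEXICON.values() for p in pron}
MODEL = toy_acoustic_model(ALL_PHONEMES)
VECTORS = synthesize_vectors("hi there", MODEL)
```

Here `VECTORS` holds one vector per state visit (five phonemes times three states). The `frames_per_state` parameter is an added assumption that stands in for state duration, which the abstract does not address.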
25 Claims
1. A speech recognition testing system comprising:
a speech recognizer configured to provide an output text based upon feature vectors;
a pronunciation tool configured to provide a pronunciation for a provided text having at least one word; and
a vector generator configured to generate a sequence of feature vectors from the provided pronunciation for the text.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method of testing a speech recognition system, comprising:
receiving a text containing at least one word;
generating a pronunciation for the text with a pronunciation tool;
generating a sequence of vectors for the pronunciation;
providing the sequence of vectors to the speech recognition system;
outputting text from the speech recognition system in response to the provided sequence of vectors.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25
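The steps of the method claim above can be sketched as a small test harness in Python. This is an illustrative outline, not the patented implementation: the three components are passed in as plain callables, the toy stand-ins (`toy_pronounce`, `toy_vectorize`, `toy_recognize`) are hypothetical, and the word-level edit distance is one assumed way to carry out the comparison with the original text that the abstract mentions.

```python
def word_errors(reference, hypothesis):
    """Word-level Levenshtein distance between two whitespace-separated strings."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # reference word missing from hypothesis
                curr[j - 1] + 1,         # extra word in hypothesis
                prev[j - 1] + (r != h),  # substitution, or free match
            ))
        prev = curr
    return prev[-1]


def run_recognition_test(text, pronounce, vectorize, recognize):
    """Receive text, pronounce it, vectorize it, recognize it, and score the result."""
    pronunciation = pronounce(text)
    vectors = vectorize(pronunciation)
    output_text = recognize(vectors)
    return output_text, word_errors(text, output_text)


# Hypothetical toy components, for demonstration only.
TOY_LEXICON = {"go": ["G", "OW"], "now": ["N", "AW"]}
TOY_MEANS = {"G": [0.0], "OW": [1.0], "N": [2.0], "AW": [3.0]}


def toy_pronounce(text):
    return [p for word in text.split() for p in TOY_LEXICON[word]]


def toy_vectorize(pronunciation):
    return [TOY_MEANS[p] for p in pronunciation]


def toy_recognize(vectors):
    # Decode each vector to the phoneme with the nearest mean,
    # then greedily match lexicon pronunciations from left to right.
    phones = [min(TOY_MEANS, key=lambda p: abs(TOY_MEANS[p][0] - v[0])) for v in vectors]
    words, i = [], 0
    while i < len(phones):
        for word, pron in TOY_LEXICON.items():
            if phones[i:i + len(pron)] == pron:
                words.append(word)
                i += len(pron)
                break
        else:
            i += 1  # skip a phoneme that starts no known word
    return " ".join(words)


RESULT = run_recognition_test("go now", toy_pronounce, toy_vectorize, toy_recognize)
```

Because the synthetic vectors sit exactly on the toy model means, the toy recognizer reproduces the input text with zero word errors. In a real system, a nonzero error count on such ideal input would presumably point at the lexicon, language model, or search settings rather than at acoustic variability.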
Specification