Efficient empirical determination, computation, and use of acoustic confusability measures
First Claim
1. A method for generating an acoustic confusability measure, said method comprising the steps of:
receiving as input a corpus, comprising a set of utterances with corresponding reliable transcriptions;
recognizing, via an automatic speech recognition system, at least one utterance among the set of utterances to yield a recognized utterance, wherein said recognized utterance includes at least one decoded frame sequence;
coalescing identical sequential phonemes of said at least one decoded frame sequence to yield at least one decoded phoneme sequence;
determining, for each corresponding reliable transcription, at least one pronunciation, wherein said at least one pronunciation includes at least one true phoneme sequence;
generating as output said recognized corpus comprising for each said recognized utterance at least said at least one decoded phoneme sequence and said at least one true phoneme sequence; and
generating an empirically derived acoustic confusability measure from said recognized corpus, said empirically derived acoustic confusability measure comprising a family of probability models Π = {p(d|t)} wherein each of d and t are phonemes drawn from an augmented phoneme alphabet Φ′.
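The corpus-preparation steps of the claim (coalescing repeated frame labels, looking up a pronunciation for the transcription, and pairing the two sequences) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the per-frame label list, and the one-pronunciation-per-word `lexicon` dictionary are hypothetical and do not come from the patent text, and the speech recognizer itself is not modeled.

```python
from itertools import groupby


def coalesce(decoded_frames):
    """Collapse each run of identical consecutive phoneme labels into one label."""
    return [label for label, _run in groupby(decoded_frames)]


def true_phonemes(transcription, lexicon):
    """Concatenate one pronunciation per word (hypothetical lexicon lookup)."""
    phones = []
    for word in transcription.split():
        phones.extend(lexicon[word])
    return phones


def recognized_entry(decoded_frames, transcription, lexicon):
    """Build one record of the recognized corpus: decoded and true phoneme sequences."""
    return {
        "decoded": coalesce(decoded_frames),
        "true": true_phonemes(transcription, lexicon),
    }


# Assumed example data: per-frame phoneme labels emitted by some recognizer.
frames = ["s", "s", "s", "eh", "eh", "t", "t", "t", "t"]
lexicon = {"sit": ["s", "ih", "t"]}  # hypothetical pronunciation dictionary
entry = recognized_entry(frames, "sit", lexicon)
```

Here `entry` pairs a decoded sequence with the true sequence for the same utterance; a collection of such pairs is the raw material from which the family p(d|t) would be estimated.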
Abstract
Efficient empirical determination, computation, and use of an acoustic confusability measure comprises:
(1) an empirically derived acoustic confusability measure, comprising a means for determining the acoustic confusability between any two textual phrases in a given language, where the measure of acoustic confusability is empirically derived from examples of the application of a specific speech recognition technology, where the procedure does not require access to the internal computational models of the speech recognition technology, and does not depend upon any particular internal structure or modeling technique, and where the procedure is based upon iterative improvement from an initial estimate;
(2) techniques for efficient computation of an empirically derived acoustic confusability measure, comprising means for efficient application of an acoustic confusability score, allowing practical application to very large-scale problems; and
(3) a method for using acoustic confusability measures to make principled choices about which specific phrases to make recognizable by a speech recognition application.
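One way to picture "iterative improvement from an initial estimate" is a hard-alignment re-estimation loop over pairs of decoded and true phoneme sequences. The sketch below is an assumption-laden illustration, not the procedure the patent specifies: it assumes the augmented alphabet is the phoneme set plus an empty symbol (written `EPS` here), starts from a unit edit cost, and uses additive smoothing. All names (`align`, `estimate`, `EPS`, `smoothing`) are hypothetical.

```python
import math
from collections import Counter, defaultdict

EPS = "<eps>"  # assumed empty symbol augmenting the phoneme alphabet


def align(decoded, true, cost):
    """Minimum-cost alignment by dynamic programming; returns (d, t) pairs."""
    n, m = len(decoded), len(true)
    best = [[float("inf")] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            moves = []
            if i > 0 and j > 0:  # substitution or match
                moves.append((best[i - 1][j - 1] + cost(decoded[i - 1], true[j - 1]), (i - 1, j - 1)))
            if i > 0:  # decoded phoneme with no true counterpart
                moves.append((best[i - 1][j] + cost(decoded[i - 1], EPS), (i - 1, j)))
            if j > 0:  # true phoneme that was not decoded
                moves.append((best[i][j - 1] + cost(EPS, true[j - 1]), (i, j - 1)))
            for c, prev in moves:
                if c < best[i][j]:
                    best[i][j], back[i][j] = c, prev
    pairs, i, j = [], n, m
    while (i, j) != (0, 0):  # trace the best path back to the origin
        pi, pj = back[i][j]
        pairs.append((decoded[pi] if pi < i else EPS, true[pj] if pj < j else EPS))
        i, j = pi, pj
    return pairs[::-1]


def estimate(corpus, alphabet, iterations=5, smoothing=0.01):
    """Re-estimate p(d|t) from alignments, starting from a unit-cost guess."""
    symbols = list(alphabet) + [EPS]
    cost = lambda d, t: 0.0 if d == t else 1.0  # initial estimate (assumed)
    model = {}
    for _ in range(iterations):
        counts = defaultdict(Counter)
        for decoded, true in corpus:
            for d, t in align(decoded, true, cost):
                counts[t][d] += 1
        model = {}
        for t in symbols:
            total = sum(counts[t].values()) + smoothing * len(symbols)
            # p(EPS|EPS) receives smoothing mass but is never used by align().
            model[t] = {d: (counts[t][d] + smoothing) / total for d in symbols}
        cost = lambda d, t, model=model: -math.log(model[t][d])
    return model


corpus = [(["s", "ih", "t"], ["s", "ih", "t"]), (["s", "eh", "t"], ["s", "ih", "t"])]
model = estimate(corpus, ["s", "ih", "eh", "t"])
```

After the loop, `model[t][d]` plays the role of p(d|t). Because the only input is paired decoded and true sequences, nothing in the sketch touches the recognizer's internal models, which is consistent with the black-box property the abstract describes.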
7 Claims
Claim 1 (independent) appears in full under First Claim above. Dependent claims: 2, 3, 4, 5, 6, 7.
Specification