Efficient empirical determination, computation, and use of acoustic confusability measures
First Claim
1. A method for determining an empirically derived acoustic confusability measure, comprising the steps of:
- using a computer for performing corpus processing by initially processing an original corpus, comprising both audio information and a true transcription thereof, with an automatic speech recognition system of interest once, one utterance at a time to produce a recognized corpus comprising a machine transcription of audio information; and
developing a family of phoneme confusability models by repeatedly processing said recognized corpus with said computer, after the corpus is initially processed by said automatic speech recognition system once, wherein each repetition comprises the steps of:
setting all phoneme pair counts to zero; and
analyzing each pair of phoneme sequences in said recognized corpus to collect information regarding the confusability of any two phonemes, wherein said information is collected by:
constructing a lattice from each said pair of phoneme sequences;
labeling each arc of the lattice with the appropriate value from the current family of decoding costs;
computing the minimum cost path through this lattice; and
traversing said minimum cost path and incrementing the phoneme pair count for each arc that is traversed; and
upon completion for each said pair of phoneme sequences of said minimum cost path traversal and associated incrementing of phoneme pair counts, using said accumulated phoneme pair counts to deliver a family of phoneme confusability models.
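One repetition of the counting steps recited above can be pictured with a short sketch. It is illustrative only and rests on several assumptions that the claim does not state: each pair consists of the phoneme sequence of the true transcription and that of the machine transcription; the lattice is a standard alignment grid with substitution, deletion, and insertion arcs; and arcs missing from the cost table fall back to a 0/1 initial estimate. The names `EPS`, `arc_cost`, `min_cost_path`, and `count_pass` are hypothetical.

```python
from collections import Counter

EPS = "<eps>"  # hypothetical symbol for "no phoneme" on insertion/deletion arcs


def arc_cost(costs, a, b):
    """Decoding cost of arc (a, b); assumed fallback: 0 for a match, 1 otherwise."""
    return costs.get((a, b), 0.0 if a == b else 1.0)


def min_cost_path(true_seq, reco_seq, costs):
    """Build the alignment lattice implicitly and return the arc labels
    (true phoneme, recognized phoneme) along its minimum cost path."""
    m, n = len(true_seq), len(reco_seq)
    inf = float("inf")
    best = [[inf] * (n + 1) for _ in range(m + 1)]   # best cost to reach node (i, j)
    back = [[None] * (n + 1) for _ in range(m + 1)]  # (previous node, arc label)
    best[0][0] = 0.0
    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0 and j == 0:
                continue
            candidates = []
            if i > 0 and j > 0:  # substitution or match arc
                label = (true_seq[i - 1], reco_seq[j - 1])
                candidates.append((best[i - 1][j - 1] + arc_cost(costs, *label), (i - 1, j - 1), label))
            if i > 0:  # deletion arc: true phoneme with no recognized counterpart
                label = (true_seq[i - 1], EPS)
                candidates.append((best[i - 1][j] + arc_cost(costs, *label), (i - 1, j), label))
            if j > 0:  # insertion arc: recognized phoneme with no true counterpart
                label = (EPS, reco_seq[j - 1])
                candidates.append((best[i][j - 1] + arc_cost(costs, *label), (i, j - 1), label))
            cost, prev, label = min(candidates, key=lambda c: c[0])
            best[i][j] = cost
            back[i][j] = (prev, label)
    # Trace the minimum cost path back from the final node to the start node.
    path, node = [], (m, n)
    while node != (0, 0):
        prev, label = back[node[0]][node[1]]
        path.append(label)
        node = prev
    path.reverse()
    return path


def count_pass(corpus, costs):
    """One repetition: zero the counts, align every pair, count traversed arcs."""
    counts = Counter()  # all phoneme pair counts start at zero
    for true_seq, reco_seq in corpus:
        for label in min_cost_path(true_seq, reco_seq, costs):
            counts[label] += 1
    return counts
```

Calling `count_pass(corpus, {})` on a list of `(true_seq, reco_seq)` pairs gives the accumulated counts for a first pass under the assumed initial costs. A later pass would supply a cost table derived from those counts.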
Abstract
Efficient empirical determination, computation, and use of an acoustic confusability measure comprises: (1) an empirically derived acoustic confusability measure, comprising a means for determining the acoustic confusability between any two textual phrases in a given language, where the measure of acoustic confusability is empirically derived from examples of the application of a specific speech recognition technology, where the procedure does not require access to the internal computational models of the speech recognition technology, and does not depend upon any particular internal structure or modeling technique, and where the procedure is based upon iterative improvement from an initial estimate; (2) techniques for efficient computation of an empirically derived acoustic confusability measure, comprising means for efficient application of an acoustic confusability score, allowing practical application to very large-scale problems; and (3) a method for using acoustic confusability measures to make principled choices about which specific phrases to make recognizable by a speech recognition application.
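The "iterative improvement from an initial estimate" mentioned in the abstract implies that each pass's phoneme pair counts feed the costs used by the next pass. The text here does not say how counts are turned into costs. The sketch below assumes one common choice: the negative logarithm of a conditional relative frequency with additive smoothing. The function name `costs_from_counts` and the `smoothing` parameter are hypothetical.

```python
import math
from collections import Counter


def costs_from_counts(counts, smoothing=1.0):
    """Map phoneme pair counts to decoding costs.

    Assumption (not from the source): cost(a, b) = -log p(b | a), where
    p(b | a) is the additively smoothed relative frequency of true phoneme a
    being recognized as phoneme b.
    """
    totals = Counter()  # how often each true phoneme was seen
    outcomes = set()    # every recognized-side symbol seen
    for (true_ph, reco_ph), c in counts.items():
        totals[true_ph] += c
        outcomes.add(reco_ph)
    k = len(outcomes)
    costs = {}
    for true_ph in totals:
        for reco_ph in outcomes:
            p = (counts.get((true_ph, reco_ph), 0) + smoothing) / (totals[true_ph] + smoothing * k)
            costs[(true_ph, reco_ph)] = -math.log(p)
    return costs
```

Under this assumption, a pair that is counted often receives a low cost, so the next pass's minimum cost path tends to favor alignments that earlier passes already found common.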
5 Claims
Claim 1 is set out in full under First Claim above. Claims 2, 3, 4, and 5 depend on claim 1.
Specification