Efficient Empirical Determination, Computation, and Use of Acoustic Confusability Measures
Abstract
Efficient empirical determination, computation, and use of an acoustic confusability measure comprises:
(1) an empirically derived acoustic confusability measure, comprising a means for determining the acoustic confusability between any two textual phrases in a given language, where the measure is empirically derived from examples of the application of a specific speech recognition technology, where the procedure does not require access to the internal computational models of the speech recognition technology and does not depend upon any particular internal structure or modeling technique, and where the procedure is based upon iterative improvement from an initial estimate;
(2) techniques for efficient computation of the empirically derived acoustic confusability measure, comprising means for efficient application of an acoustic confusability score, allowing practical application to very large-scale problems; and
(3) a method for using acoustic confusability measures to make principled choices about which specific phrases to make recognizable by a speech recognition application.
18 Claims

1. A computer implemented method for efficient empirical determination, computation, and use of an acoustic confusability measure, comprising the steps of:
empirically deriving an acoustic confusability measure by determining acoustic confusability between at least any two textual phrases in a given language;
wherein said measure of acoustic confusability is empirically derived from examples of application of utterances to a specific speech recognition application;
iterating from an initial estimate of said acoustic confusability measure to improve said measure; and
using said acoustic confusability measure to make principled choices about which specific phrases to make recognizable by said speech recognition application.

2. A computer implemented method for determining an empirically derived acoustic confusability measure, comprising the steps of:
performing corpus processing by passing an original corpus through an automatic speech recognition system of interest once, one utterance at a time; and
developing a family of phoneme confusability models by repeatedly passing over said recognized corpus, analyzing each pair of phoneme sequences to collect information regarding the confusability of any two phonemes, at each step delivering an improved family of confusability models.
Dependent claims: 3, 4, 5, 6.

7. In a computer implemented method for determining an empirically derived acoustic confusability measure, a corpus processing method comprising the steps of:
receiving an input utterance;
performing corpus processing by passing an original corpus through an automatic speech recognition system of interest once, one utterance at a time;
wherein said corpus comprises pairs of utterances and transcriptions, and for each pair in said corpus:
applying a recognizer to the input utterance, yielding as an output a decoded frame sequence and a confidence score;
coalescing identical sequential phonemes in said decoded frame sequence to obtain a decoded phoneme sequence by replacing each subsequence of identical contiguous phonemes that appear in said sequence by a single phoneme of a same type;
generating a pronunciation of a transcription of an utterance by lookup in a dictionary of the automatic recognition system, or by use of an automatic pronunciation generation system; and
applying said steps sequentially to each element of said corpus to obtain a recognized corpus.
Dependent claims: 8, 9, 13, 14.

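The coalescing step recited in claim 7 can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `coalesce_frames` and the sample frame labels are hypothetical, and the sketch assumes the recognizer emits one phoneme label per frame as a plain list of strings.

```python
from itertools import groupby


def coalesce_frames(frame_labels):
    """Replace each run of identical contiguous phoneme labels with one label.

    `frame_labels` is assumed to be a per-frame list of phoneme labels
    (a hypothetical representation of the decoded frame sequence).
    """
    # groupby yields one (label, run) pair per maximal run of equal labels.
    return [label for label, _run in groupby(frame_labels)]


# Hypothetical decoded frame sequence for an utterance of "cat".
frames = ["k", "k", "ae", "ae", "ae", "t", "t"]
decoded_phonemes = coalesce_frames(frames)
```

Only contiguous repeats are merged, so a phoneme that recurs later in the sequence after a different phoneme is kept as a separate entry.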
10. In a computer implemented method for determining an empirically derived acoustic confusability measure, an iterative method for development of a probability model family, comprising the steps of:
providing a recognized corpus;
establishing a termination condition which depends on one or more of: a number of iterations executed; closeness of match between a previous and current probability family models; or another consideration;
defining a family of decoding costs;
setting an iteration count to 0;
setting a phoneme pair count to 0;
for each entry in the recognized corpus, performing the following steps:
constructing a lattice;
populating said lattice arcs with values drawn from a current family of decoding costs;
applying a Bellman-Ford dynamic programming algorithm, or a Dijkstra's shortest path first algorithm, to find a shortest path through said lattice, from a source node to a terminal node; and
traversing said determined shortest path, wherein for each arc that is traversed, the phoneme pair count is incremented by 1;
for each transcription, computing a confidence score which is the sum of a phoneme pair value over all transcriptions paired with an utterance;
estimating a family of probability models;
if the iteration count > 0, testing a termination condition;
if said termination condition is satisfied, returning a desired probability model family and stopping;
if said termination condition is not satisfied, defining a new family of decoding costs; incrementing said iteration count and repeating.
Dependent claims: 11, 12.

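The iterative procedure of claim 10 can be sketched in Python under several stated assumptions. The lattice here is assumed to be an alignment grid between the decoded and true phoneme sequences with substitution, insertion, and deletion arcs; the empty-phoneme symbol `EPS`, the 0/1 initial costs, the additive smoothing, the negative-log cost update, and all function names are illustrative choices, not details confirmed by the claim text. The per-transcription confidence score step is omitted.

```python
import heapq
import math
from collections import Counter

EPS = "<eps>"  # hypothetical symbol for an empty phoneme (insertion or deletion)


def shortest_path_pairs(decoded, truth, cost):
    """Dijkstra over the alignment lattice; returns (decoded, true) pairs on the best path."""
    m, n = len(decoded), len(truth)
    start, goal = (0, 0), (m, n)
    dist, back, heap = {start: 0.0}, {}, [(0.0, start)]
    while heap:
        d, (i, j) = heapq.heappop(heap)
        if (i, j) == goal:
            break
        if d > dist[(i, j)]:
            continue  # stale heap entry
        arcs = []
        if i < m and j < n:
            arcs.append(((i + 1, j + 1), (decoded[i], truth[j])))  # substitution or match
        if i < m:
            arcs.append(((i + 1, j), (decoded[i], EPS)))  # insertion
        if j < n:
            arcs.append(((i, j + 1), (EPS, truth[j])))  # deletion
        for node, pair in arcs:
            nd = d + cost(pair)
            if nd < dist.get(node, math.inf):
                dist[node] = nd
                back[node] = ((i, j), pair)
                heapq.heappush(heap, (nd, node))
    pairs, node = [], goal
    while node != start:
        node, pair = back[node]
        pairs.append(pair)
    return pairs[::-1]


def initial_cost(pair):
    """Assumed initial decoding costs: 0 for a match, 1 otherwise."""
    return 0.0 if pair[0] == pair[1] else 1.0


def estimate_models(counts, smoothing=0.5):
    """Estimate p(decoded | true) from pair counts with additive smoothing (an assumption)."""
    decoded_syms = {d for d, _ in counts}
    models = {}
    for t in {t for _, t in counts}:
        total = sum(counts[(d, t)] for d in decoded_syms) + smoothing * len(decoded_syms)
        for d in decoded_syms:
            models[(d, t)] = (counts[(d, t)] + smoothing) / total
    return models


def train(recognized_corpus, max_iters=10, tol=1e-4, floor=1e-6):
    """Iterate alignment and re-estimation until the models stop changing."""
    cost, prev = initial_cost, None
    for _iteration in range(max_iters):
        counts = Counter()  # phoneme pair counts reset to 0 each pass
        for decoded, truth in recognized_corpus:
            counts.update(shortest_path_pairs(decoded, truth, cost))
        models = estimate_models(counts)
        if prev is not None:  # termination is tested only after the first pass
            keys = set(models) | set(prev)
            change = max((abs(models.get(k, 0.0) - prev.get(k, 0.0)) for k in keys), default=0.0)
            if change < tol:
                return models
        prev = models
        # New decoding costs: negative log of the current probability estimates.
        cost = lambda pair, m=models: -math.log(m.get(pair, floor))
    return prev
```

As a usage example, `train([(["k", "ae", "p"], ["k", "ae", "t"]), (["k", "ae", "t"], ["k", "ae", "t"])])` returns a dictionary keyed by (decoded phoneme, true phoneme) pairs. Dijkstra's algorithm applies here because negative-log costs are non-negative; a Bellman-Ford pass over the same lattice would find a path of the same cost.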
15. A method for computing an empirically derived acoustic confusability of two phrases, comprising the steps of:
determining a desired probability model family □;
using □ to compute acoustic confusability of two arbitrary phrases w and v by:
computing a raw phrase acoustic confusability measure, which is a measure of the acoustic similarity of phrases v and w; and
computing a grammar-relative confusion probability measure, which is an estimate of the probability that a grammar-constrained recognizer returns the phrase v as a decoding, when a true phrase is w.
Dependent claims: 16, 17, 18.

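The two measures named in claim 15 can be illustrated with a short Python sketch. The claim text does not give formulas, so everything below is an assumption made for illustration: the raw score is taken to be the probability of the best phoneme alignment between the two phrases, the grammar-relative probability is taken to be the raw score normalized over all phrases in the grammar, and the names `raw_confusability`, `grammar_relative_confusion`, `EPS`, and the toy probability table are hypothetical.

```python
import math

EPS = "<eps>"  # hypothetical symbol for an empty phoneme (insertion or deletion)


def raw_confusability(v, w, prob, floor=1e-6):
    """Best-alignment probability of decoding phoneme sequence v when w was spoken.

    `prob` maps (decoded phoneme, true phoneme) pairs to probabilities; pairs
    missing from the table fall back to `floor` (an assumed default).
    """
    def cost(d, t):
        return -math.log(prob.get((d, t), floor))

    m, n = len(v), len(w)
    best = [[math.inf] * (n + 1) for _ in range(m + 1)]
    best[0][0] = 0.0
    for i in range(m + 1):
        for j in range(n + 1):
            if i and j:  # substitution or match
                best[i][j] = min(best[i][j], best[i - 1][j - 1] + cost(v[i - 1], w[j - 1]))
            if i:  # insertion of a decoded phoneme
                best[i][j] = min(best[i][j], best[i - 1][j] + cost(v[i - 1], EPS))
            if j:  # deletion of a true phoneme
                best[i][j] = min(best[i][j], best[i][j - 1] + cost(EPS, w[j - 1]))
    return math.exp(-best[m][n])


def grammar_relative_confusion(v, w, grammar, prob):
    """Raw score of v given w, normalized over every phrase in the grammar (assumed form)."""
    z = sum(raw_confusability(x, w, prob) for x in grammar.values())
    return raw_confusability(grammar[v], w, prob) / z


# Toy probability table and grammar, invented for this example.
prob = {
    ("k", "k"): 0.9, ("ae", "ae"): 0.9, ("t", "t"): 0.7,
    ("p", "t"): 0.2, (EPS, "k"): 0.05,
}
grammar = {"cat": ("k", "ae", "t"), "cap": ("k", "ae", "p"), "at": ("ae", "t")}
scores = {v: grammar_relative_confusion(v, grammar["cat"], grammar, prob) for v in grammar}
```

With this normalization the scores for a fixed true phrase sum to 1 across the grammar, so adding a similar-sounding phrase to the grammar lowers the share assigned to every other phrase.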
Specification