Determining and using acoustic confusability, acoustic perplexity and synthetic acoustic word error rate
Abstract
Two statistics are disclosed for determining the quality of language models. These statistics are called acoustic perplexity and the synthetic acoustic word error rate (SAWER), and they depend upon methods for computing the acoustic confusability of words. It is possible to substitute models of acoustic data in place of real acoustic data in order to determine acoustic confusability. An evaluation model is created, a synthesizer model is created, and a matrix is determined from the evaluation and synthesizer models. Each of the evaluation and synthesizer models is a hidden Markov model. Once the matrix is determined, a confusability calculation may be performed. Different methods are used to determine synthetic likelihoods. The confusability may be normalized and smoothed, and methods are disclosed that increase the speed of performing the matrix inversion and the confusability calculation. A method for caching and reusing computations for similar words is disclosed. Acoustic perplexity and SAWER are determined and applied.
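To make the two statistics concrete, the following is a minimal sketch. The abstract does not state the formulas, so the sketch assumes a formulation commonly given for these statistics: at each corpus position the confusabilities p(a(w) | x) are combined with the language model probabilities p(x | h) into a posterior for the true word w, acoustic perplexity is the inverse geometric mean of those posteriors, and SAWER is one minus their arithmetic mean. The function names and the toy numbers are hypothetical.

```python
import math


def posterior(lm_probs, confusability, true_word):
    """Posterior of the true word given its history and its own acoustics.

    lm_probs:      {word: p(word | history)}            (assumed input)
    confusability: {word: p(acoustics of true | word)}  (assumed input)
    """
    numerator = confusability[true_word] * lm_probs[true_word]
    denominator = sum(confusability[x] * lm_probs[x] for x in lm_probs)
    return numerator / denominator


def acoustic_perplexity(posteriors):
    """Inverse geometric mean of the per-word posteriors (assumed definition)."""
    log_sum = sum(math.log(p) for p in posteriors)
    return math.exp(-log_sum / len(posteriors))


def sawer(posteriors):
    """One minus the arithmetic mean of the posteriors (assumed definition)."""
    return 1.0 - sum(posteriors) / len(posteriors)


# Toy example: two candidate words, equally likely under the language model.
p = posterior({"bear": 0.5, "bare": 0.5}, {"bear": 0.9, "bare": 0.1}, "bear")
```

Under these assumptions, a language model that separates acoustically similar words well yields posteriors near 1, so both statistics fall toward their minimum values (1 for acoustic perplexity, 0 for SAWER).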
55 Claims
1. A method comprising the steps of:
creating an evaluation model from at least one evaluation phone;
creating a synthesizer model from at least one synthesizer phone;
determining a matrix from the evaluation and synthesizer models, said matrix configured for speech recognition;
creating a new matrix by subtracting the matrix from an identity matrix;
determining an inverse of the new matrix; and
determining acoustic confusability by using the inverse of the new matrix.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
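The matrix steps of claim 1 can be illustrated with a small sketch. It assumes the matrix M holds the probability mass passed from state i (row) to state j (column), so that the inverse of (I - M) equals the series I + M + M^2 + ... and therefore sums the mass of all paths of every length. Reading the confusability from the (start, final) entry of that inverse, the row/column convention, and the names `invert` and `confusability` are assumptions made for illustration; the claim itself does not fix them.

```python
def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augment each row of `a` with the matching row of the identity matrix.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def confusability(m, start, final):
    """Total path mass from `start` to `final`, read from (I - M)^-1."""
    n = len(m)
    new_matrix = [[(1.0 if i == j else 0.0) - m[i][j] for j in range(n)]
                  for i in range(n)]
    return invert(new_matrix)[start][final]


# Toy 3-state chain: states 0 and 1 have self-loops, state 2 is final.
M = [[0.5, 0.25, 0.0],
     [0.0, 0.5, 0.25],
     [0.0, 0.0, 0.0]]
score = confusability(M, 0, 2)
```

In the toy chain, each self-loop contributes a factor 1 / (1 - 0.5) = 2 and each forward step a factor 0.25, so the summed path mass from state 0 to state 2 is 2 × 0.25 × 2 × 0.25 = 0.25, which is what the inverse returns.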

14. A method comprising the steps of:
a) creating an evaluation model from a plurality of evaluation phones, each of the phones corresponding to a first word;
b) creating a synthesizer model from a plurality of synthesizer phones, each of the phones corresponding to a second word;
c) creating a product machine from the evaluation model and synthesizer model, the product machine comprising a plurality of transitions and a plurality of states;
d) determining a matrix from the product machine; and
e) determining acoustic confusability of the first word and the second word by using the matrix, said matrix configured for speech recognition.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22
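As a rough illustration of steps (c) and (d), the sketch below pairs every evaluation state with every synthesizer state and fills a matrix with one entry per joint transition. It is a simplification under stated assumptions: each model is a plain dictionary of transition probabilities, phones are attached to source states, and a hypothetical `overlap` table stands in for the synthetic likelihood of one phone's acoustics under the other phone's model. None of these data structures or names come from the claim.

```python
from itertools import product


def product_machine(eval_trans, synth_trans, eval_phone, synth_phone, overlap):
    """Build product states and a transition-weight matrix (illustrative only)."""
    states = list(product(sorted(eval_trans), sorted(synth_trans)))
    index = {state: i for i, state in enumerate(states)}
    n = len(states)
    matrix = [[0.0] * n for _ in range(n)]
    transitions = 0
    for e, s in states:
        for e_next, p_eval in eval_trans[e].items():
            for s_next, p_synth in synth_trans[s].items():
                # Assumed weighting: both transition probabilities times the
                # acoustic overlap of the phones attached to the source states.
                score = overlap[(eval_phone[e], synth_phone[s])]
                matrix[index[(e, s)]][index[(e_next, s_next)]] = (
                    p_eval * p_synth * score
                )
                transitions += 1
    return states, matrix, transitions


# Toy one-phone words: state 0 emits the phone, state 1 is final.
eval_trans = {0: {0: 0.5, 1: 0.5}, 1: {}}
synth_trans = {0: {0: 0.5, 1: 0.5}, 1: {}}
states, matrix, transitions = product_machine(
    eval_trans, synth_trans, {0: "AA"}, {0: "AE"}, {("AA", "AE"): 0.4}
)
```

Product states in which only one of the two models has reached its final state are dead ends in this sketch; they receive mass but pass none on, so they do not affect the mass that reaches the joint final state.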

23. A method comprising the steps of:
a) determining acoustic confusability for each of a plurality of word pairs, wherein step (a) further comprises the steps of, for each of the word pairs:
  determining an edit distance between each word of the word pair; and
  determining acoustic confusability from the edit distance; and
b) determining a metric for use in speech recognition by using the acoustic confusabilities, wherein step (b) further comprises the step of determining an acoustic perplexity by using the confusabilities.
Dependent claims: 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42

43. A method for determining acoustic confusability of a word pair, the method comprising the steps of:
determining an edit distance between each word pair and an associated alignment;
assigning acoustic distances to each aligned phoneme pair; and
determining an acoustic confusability for use in speech recognition by summing said acoustic distances.
Dependent claims: 44, 45, 46, 47
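The three steps of claim 43 can be sketched as follows, assuming a standard Levenshtein edit distance over phoneme sequences with unit costs, a backtrace that recovers one optimal alignment, and a hypothetical lookup table of acoustic distances between phonemes. The table values, the gap cost, the default distance for unlisted pairs, and the function names are illustrative assumptions.

```python
def align(a, b):
    """Return the Levenshtein distance and one optimal alignment of a and b."""
    n, m = len(a), len(b)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitute = dist[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            dist[i][j] = min(substitute, dist[i - 1][j] + 1, dist[i][j - 1] + 1)
    # Backtrace from the bottom-right corner; None marks a gap.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    pairs.reverse()
    return dist[n][m], pairs


def summed_acoustic_distance(pairs, acoustic_distance, gap_cost=1.0, default=1.0):
    """Sum an acoustic distance over every aligned phoneme pair."""
    total = 0.0
    for x, y in pairs:
        if x is None or y is None:
            total += gap_cost  # insertion or deletion (assumed cost)
        elif x != y:
            total += acoustic_distance.get(
                (x, y), acoustic_distance.get((y, x), default)
            )
        # Identical phonemes contribute zero distance.
    return total


edit_distance, pairs = align(["K", "AE", "T"], ["K", "AH", "T"])
total = summed_acoustic_distance(pairs, {("AE", "AH"): 0.3})
```

In this sketch a smaller sum means the two pronunciations are acoustically closer; how the sum is mapped to a confusability value is left open here.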

48. An apparatus comprising:
a memory that stores computer-readable code; and
a processor operatively coupled to said memory, said processor configured to implement said computer-readable code, said computer-readable code configured to:
  create an evaluation model from at least one evaluation phone;
  create a synthesizer model from at least one synthesizer phone;
  determine a matrix from the evaluation and synthesizer models;
  create a new matrix by subtracting the matrix from an identity matrix;
  determine an inverse of the new matrix; and
  determine acoustic confusability by using the inverse of the new matrix.

49. An apparatus comprising:
a memory that stores computer-readable code; and
a processor operatively coupled to said memory, said processor configured to implement said computer-readable code, said computer-readable code configured to:
a) determine acoustic confusability for each of a plurality of word pairs, wherein step (a) further comprises the steps of, for each of the word pairs:
  determining an edit distance between each word of the word pair; and
  determining acoustic confusability from the edit distance; and
b) determine a metric for use in speech recognition by using the acoustic confusabilities, wherein step (b) further comprises the step of determining an acoustic perplexity by using the confusabilities.
Dependent claims: 50

51. An apparatus for determining acoustic confusability of a word pair, the system comprising:
a memory that stores computer-readable code; and
a processor operatively coupled to said memory, said processor configured to implement said computer-readable code, said computer-readable code configured to:
  determine an edit distance between each word pair and an associated alignment;
  assign acoustic distances to each aligned phoneme pair; and
  determine an acoustic confusability for use in speech recognition by summing said acoustic distances.

52. An article of manufacture comprising:
a computer-readable medium having computer-readable code means embodied thereon, the computer-readable program code means comprising:
  a step to create an evaluation model from at least one evaluation phone;
  a step to create a synthesizer model from at least one synthesizer phone;
  a step to determine a matrix from the evaluation and synthesizer models;
  a step to create a new matrix by subtracting the matrix from an identity matrix;
  a step to determine an inverse of the new matrix; and
  a step to determine acoustic confusability by using the inverse of the new matrix.

53. An article of manufacture comprising:
a computer-readable medium having computer-readable code means embodied thereon, the computer-readable program code means comprising:
a) a step to determine acoustic confusability for each of a plurality of word pairs, wherein step (a) further comprises the steps of, for each of the word pairs:
  determining an edit distance between each word of the word pair; and
  determining acoustic confusability from the edit distance; and
b) a step to determine a metric for use in speech recognition by using the acoustic confusabilities, wherein step (b) further comprises the step of determining an acoustic perplexity by using the confusabilities.
Dependent claims: 54

55. An article of manufacture for determining acoustic confusability of a word pair, the article of manufacture comprising:
a computer-readable medium having computer-readable code means embodied thereon, the computer-readable program code means comprising:
  determine an edit distance between each word pair and an associated alignment;
  assign acoustic distances to each aligned phoneme pair; and
  determine an acoustic confusability for use in speech recognition by summing said acoustic distances.
Specification