Phonetic distance measurement system and related methods
First Claim
1. A method of generating a phonetic distance matrix comprising:
determining, for each of a plurality of phonemes occurring in the reference file, a plurality of phoneme error occurrences by comparing a recognized speech file with a reference file, the recognized speech file generated by processing at least one audio file of recorded speech with a speech recognition engine, the reference file representing the actual contents of the recorded speech;
determining, for each of the plurality of phonemes occurring in the reference file, a plurality of phoneme error rates corresponding to the plurality of phoneme error occurrences;
generating a plurality of phonetic distances as a function of the plurality of phoneme error rates, the plurality of phonetic distances being inversely proportional to the plurality of phoneme error rates; and
outputting a phonetic distance matrix based on the generated plurality of phonetic distances, the phonetic distance matrix including generated phonetic distances between each of the plurality of phonemes;
wherein generating the plurality of phonetic distances and outputting the phonetic distance matrix includes normalizing the generated phonetic distances to minimize a total separation between the outputted phonetic distance matrix and an existing phonetic distance matrix not generated based on the recognized speech file.
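The counting, error-rate, and distance steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the recognized and reference files have already been aligned into (reference phoneme, recognized phoneme) pairs, takes the error rate as confusions divided by occurrences of the reference phoneme, and models "inversely proportional" as k divided by the rate. The function names, the constant k, and the use of infinity for pairs that were never confused are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def phoneme_error_rates(aligned_pairs):
    """Return {(ref, hyp): rate} for every observed misrecognition.

    aligned_pairs is an iterable of (reference_phoneme, recognized_phoneme)
    tuples; producing that alignment is outside this sketch.
    """
    ref_totals = Counter()    # how often each phoneme occurs in the reference
    error_counts = Counter()  # how often ref was recognized as a different hyp
    for ref, hyp in aligned_pairs:
        ref_totals[ref] += 1
        if hyp != ref:
            error_counts[(ref, hyp)] += 1
    return {
        (ref, hyp): count / ref_totals[ref]
        for (ref, hyp), count in error_counts.items()
    }


def phonetic_distance_matrix(rates, phonemes, k=1.0):
    """Build {(p, q): distance} with distance = k / error_rate (assumed form).

    Pairs never confused get an infinite distance; identical phonemes get 0.
    """
    matrix = {}
    for p in phonemes:
        for q in phonemes:
            if p == q:
                matrix[(p, q)] = 0.0
            else:
                rate = rates.get((p, q), 0.0)
                matrix[(p, q)] = k / rate if rate > 0 else math.inf
    return matrix


# Toy data: "b" occurs four times in the reference, "p" twice.
pairs = [("b", "b"), ("b", "p"), ("b", "p"), ("b", "d"), ("p", "p"), ("p", "b")]
rates = phoneme_error_rates(pairs)
matrix = phonetic_distance_matrix(rates, ["b", "d", "p"])
```

Under these assumptions the matrix need not be symmetric, because the rate at which one phoneme is recognized as another can differ from the rate in the opposite direction.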
Abstract
Phonetic distances are empirically measured as a function of speech recognition engine recognition error rates. The error rates are determined by comparing a recognized speech file with a reference file. The phonetic distances can be normalized to earlier measurements. The phonetic distances/error rates can also be used to improve speech recognition engine grammar selection, as an aid in language training and evaluation, and in other applications.
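Normalizing to earlier measurements could take many forms, and the text here does not say which. The sketch below shows one possibility. It assumes that "total separation" is the sum of squared differences over entries present and finite in both matrices, and that normalization applies a single scale factor to the new distances. The function name and both assumptions are illustrative, not taken from the patent.

```python
import math


def normalize_to_existing(new, existing):
    """Scale `new` distances by one factor to best match `existing`.

    Both arguments map (phoneme, phoneme) pairs to distances. The factor s
    minimizes sum((s * new - existing) ** 2) over shared finite entries,
    which has the closed form s = sum(new * existing) / sum(new ** 2).
    """
    shared = [
        key for key in new
        if key in existing
        and math.isfinite(new[key])
        and math.isfinite(existing[key])
    ]
    numerator = sum(new[key] * existing[key] for key in shared)
    denominator = sum(new[key] ** 2 for key in shared)
    scale = numerator / denominator if denominator > 0 else 1.0
    # Infinite distances (pairs never confused) are passed through unchanged.
    normalized = {
        key: value * scale if math.isfinite(value) else value
        for key, value in new.items()
    }
    return normalized, scale


measured = {("b", "p"): 2.0, ("b", "d"): 4.0, ("p", "d"): math.inf}
earlier = {("b", "p"): 1.0, ("b", "d"): 2.0}
normalized, scale = normalize_to_existing(measured, earlier)
```

A single scale factor keeps the relative ordering of the measured distances intact, which is why it is used here; a per-row or affine fit would be an equally plausible reading.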
15 Claims
1. A method of generating a phonetic distance matrix comprising:
determining, for each of a plurality of phonemes occurring in the reference file, a plurality of phoneme error occurrences by comparing a recognized speech file with a reference file, the recognized speech file generated by processing at least one audio file of recorded speech with a speech recognition engine, the reference file representing the actual contents of the recorded speech;
determining, for each of the plurality of phonemes occurring in the reference file, a plurality of phoneme error rates corresponding to the plurality of phoneme error occurrences;
generating a plurality of phonetic distances as a function of the plurality of phoneme error rates, the plurality of phonetic distances being inversely proportional to the plurality of phoneme error rates; and
outputting a phonetic distance matrix based on the generated plurality of phonetic distances, the phonetic distance matrix including generated phonetic distances between each of the plurality of phonemes;
wherein generating the plurality of phonetic distances and outputting the phonetic distance matrix includes normalizing the generated phonetic distances to minimize a total separation between the outputted phonetic distance matrix and an existing phonetic distance matrix not generated based on the recognized speech file.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A phonetic distance measurement system comprising:
a reference file;
a recognized speech file generated by processing an audio file of a speaker reading contents of the reference file;
a comparison module configured to determine, for each of a plurality of phonemes occurring in the reference file, a plurality of phoneme error occurrences by comparing the recognized speech file and the reference file;
an error rate module configured to determine, for each of the plurality of phonemes, a plurality of phoneme error rates corresponding to the plurality of phoneme error occurrences; and
a measurement module configured to generate a plurality of phonetic distances between each of the plurality of phonemes as a function of the plurality of phoneme error rates, the plurality of phonetic distances being inversely proportional to the plurality of phoneme error rates;
wherein the measurement module is further configured to normalize the phonetic distances to an existing matrix of phonetic distances not generated based on the recognized speech file.
Dependent claims: 13, 14, 15
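The claim does not say how the comparison module lines up the recognized speech file with the reference file. As one hedged illustration, the sketch below assumes both files have already been converted to phoneme sequences and aligns them with a standard Levenshtein (edit-distance) dynamic program. The function name and the use of None to mark an inserted or deleted phoneme are hypothetical.

```python
def align_phonemes(reference, recognized):
    """Align two phoneme sequences with minimum edit distance.

    Returns a list of (reference_phoneme, recognized_phoneme) pairs. A pair
    with None on the right is a deletion (phoneme missing from the recognized
    output); None on the left is an insertion.
    """
    n, m = len(reference), len(recognized)
    # cost[i][j] = edit distance between reference[:i] and recognized[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mismatch = 0 if reference[i - 1] == recognized[j - 1] else 1
            cost[i][j] = min(
                cost[i - 1][j - 1] + mismatch,  # match or substitution
                cost[i - 1][j] + 1,             # deletion
                cost[i][j - 1] + 1,             # insertion
            )

    # Walk back from the bottom-right corner to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            mismatch = 0 if reference[i - 1] == recognized[j - 1] else 1
            if cost[i][j] == cost[i - 1][j - 1] + mismatch:
                pairs.append((reference[i - 1], recognized[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((reference[i - 1], None))
            i -= 1
        else:
            pairs.append((None, recognized[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


substitution = align_phonemes(["k", "ae", "t"], ["k", "ae", "d"])
deletion = align_phonemes(["b", "ih", "t"], ["b", "t"])
```

Pairs whose two phonemes differ are the phoneme error occurrences that an error rate module would then count per reference phoneme.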
Specification