Phonetic distance measurement system and related methods

US 9,659,559 B2
Filed: 06/25/2009
Issued: 05/23/2017
Est. Priority Date: 06/25/2009
Status: Active Grant

Abstract

Phonetic distances are empirically measured as a function of the recognition error rates of a speech recognition engine. The error rates are determined by comparing a recognized speech file with a reference file. The phonetic distances can be normalized to earlier measurements. The phonetic distances and error rates can also be used to improve grammar selection for a speech recognition engine, as an aid in language training and evaluation, and in other applications.
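The comparison of a recognized speech file with a reference file can be illustrated with a minimal sketch. It assumes both files have already been reduced to lists of phoneme symbols, and it uses a standard edit-distance alignment to count substitution, insertion and deletion errors; the patent text here does not name an alignment algorithm, so that choice is an assumption. The function names `count_phoneme_errors` and `error_rates` are hypothetical, and the rate is taken as occurrences divided by the total number of phoneme occurrences in the reference.

```python
from collections import Counter


def count_phoneme_errors(reference, recognized):
    """Align two phoneme lists and count substitution, insertion and deletion errors.

    Uses a plain edit-distance (Levenshtein) alignment, which is an assumption
    of this sketch rather than a method stated in the source.
    """
    n, m = len(reference), len(recognized)
    # cost[i][j] = minimum number of edits turning reference[:i] into recognized[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mismatch = int(reference[i - 1] != recognized[j - 1])
            cost[i][j] = min(
                cost[i - 1][j - 1] + mismatch,  # match or substitution
                cost[i - 1][j] + 1,             # deletion from the reference
                cost[i][j - 1] + 1,             # insertion into the recognized output
            )

    # Walk back through the table to recover which errors occurred.
    subs, ins, dels = Counter(), Counter(), Counter()
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            mismatch = int(reference[i - 1] != recognized[j - 1])
            if cost[i][j] == cost[i - 1][j - 1] + mismatch:
                if mismatch:
                    subs[(reference[i - 1], recognized[j - 1])] += 1
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            dels[reference[i - 1]] += 1
            i -= 1
        else:
            ins[recognized[j - 1]] += 1
            j -= 1
    return subs, ins, dels


def error_rates(counts, total_reference_phonemes):
    """Divide each error count by the total phoneme occurrences in the reference."""
    return {key: n / total_reference_phonemes for key, n in counts.items()}


# Hypothetical example data: the reference has four phonemes, the recognizer
# replaced "ae" with "eh" and dropped the final "s".
reference = ["k", "ae", "t", "s"]
recognized = ["k", "eh", "t"]
subs, ins, dels = count_phoneme_errors(reference, recognized)
sub_rates = error_rates(subs, len(reference))
```

Substitutions are keyed by (reference phoneme, recognized phoneme) pairs, while insertions and deletions are keyed by a single phoneme, which mirrors how the claims distinguish the three error types.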

15 Claims

1. A method of generating a phonetic distance matrix comprising:
- determining, for each of a plurality of phonemes occurring in the reference file, a plurality of phoneme error occurrences by comparing a recognized speech file with a reference file, the recognized speech file generated by processing at least one audio file of recorded speech with a speech recognition engine, the reference file representing the actual contents of the recorded speech;
  
  determining, for each of the plurality of phonemes occurring in the reference file, a plurality of phoneme error rates corresponding to the plurality of phoneme error occurrences;
  
  generating a plurality of phonetic distances as a function of the plurality of phoneme error rates, the plurality of phonetic distances being inversely proportional to the plurality of phoneme error rates; and
  
  outputting a phonetic distance matrix based on the generated plurality of phonetic distances, the phonetic distance matrix including generated phonetic distances between each of the plurality of phonemes;
  
  wherein generating the plurality of phonetic distances and outputting the phonetic distance matrix includes normalizing the generated phonetic distances to minimize a total separation between the outputted phonetic distance matrix and an existing phonetic distance matrix not generated based on the recognized speech file.
  - 2. The method of claim 1, wherein the plurality of phoneme error occurrences includes a plurality of phoneme substitution, insertion and deletion error occurrences, the plurality of phoneme error rates includes a corresponding plurality of phoneme substitution, insertion and deletion error rates, and the generated plurality of phonetic distances further includes phonetic distances between each of the plurality of phonemes and insertion and deletion.
  - 3. The method of claim 1, wherein determining the plurality of phoneme error rates includes dividing the plurality of phoneme error occurrences by a total number of phoneme occurrences in the reference file.
  - 4. The method of claim 1, wherein normalizing the phonetic distances to minimize the total separation between the phonetic distance matrix and an existing phonetic distance matrix includes using a mapping function with three normalization coefficients.
  - 5. The method of claim 4, wherein the mapping function is:
    - $$\text{Phonetic Distance}_{i,j} = \alpha_1 + \frac{\alpha_2}{\text{Error Rate}_{i,j} - \alpha_3};$$
      
      wherein $i$ and $j$ are indices of the phonemes, and $\alpha_1$, $\alpha_2$ and $\alpha_3$ are the three normalization coefficients.
  - 6. The method of claim 5, wherein the separation between the phonetic distance matrix and the existing phonetic distance matrix is defined as:
    - $$L(\alpha_1, \alpha_2, \alpha_3) = \sum_{i,j} \left(\text{Existing Phonetic Distance}_{i,j} - \text{Phonetic Distance}_{i,j}\right)^2.$$
  - 7. The method of claim 1, further comprising generating the recognized speech file by processing, with a speech recognition engine, an audio file of a speaker reading contents of the reference file.
  - 8. The method of claim 7, further comprising generating the audio file.
  - 9. The method of claim 1, wherein determining a plurality of phoneme error occurrences includes comparing a plurality of recognized speech and reference files.
  - 10. The method of claim 9, wherein the plurality of recognized speech files correspond to audio files of a plurality of different speakers.
  - 11. The method of claim 9, wherein the plurality of recognized speech files are generated by a plurality of different speech recognition engines.
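The normalization recited in claims 4 to 6 can be illustrated with a minimal sketch. It evaluates the three-coefficient mapping function and the squared-difference separation, then searches for coefficients that minimize the separation. The search strategy is an assumption of this sketch, not something the claims specify: for each candidate α₃ the mapping is linear in α₁ and α₂, so those two are solved by ordinary least squares and the best candidate is kept. The function names, the candidate list and the example data are all hypothetical.

```python
def phonetic_distance(error_rate, a1, a2, a3):
    """Mapping function of claim 5: distance = a1 + a2 / (error_rate - a3)."""
    return a1 + a2 / (error_rate - a3)


def separation(existing, rates, a1, a2, a3):
    """Separation of claim 6: sum of squared differences over all phoneme pairs."""
    return sum(
        (existing[pair] - phonetic_distance(rates[pair], a1, a2, a3)) ** 2
        for pair in rates
    )


def fit_coefficients(existing, rates, a3_candidates):
    """Return (loss, a1, a2, a3) minimizing the separation over the candidates.

    Assumption: every candidate a3 lies below the smallest error rate, so the
    denominator (error_rate - a3) is never zero.
    """
    pairs = list(rates)
    ys = [existing[p] for p in pairs]
    mean_y = sum(ys) / len(ys)
    best = None
    for a3 in a3_candidates:
        # With a3 fixed, distance = a1 + a2 * x where x = 1 / (rate - a3).
        xs = [1.0 / (rates[p] - a3) for p in pairs]
        mean_x = sum(xs) / len(xs)
        sxx = sum((x - mean_x) ** 2 for x in xs)
        if sxx == 0:
            continue  # all x identical, slope undefined
        a2 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
        a1 = mean_y - a2 * mean_x
        loss = separation(existing, rates, a1, a2, a3)
        if best is None or loss < best[0]:
            best = (loss, a1, a2, a3)
    return best


# Hypothetical measured error rates for four phoneme pairs.
rates = {("p", "b"): 0.05, ("s", "z"): 0.10, ("m", "n"): 0.20, ("k", "t"): 0.01}
# Hypothetical existing matrix, built here from known coefficients so the fit
# can be checked: a1 = 0.5, a2 = 0.02, a3 = -0.05.
existing = {pair: phonetic_distance(r, 0.5, 0.02, -0.05) for pair, r in rates.items()}

loss, a1, a2, a3 = fit_coefficients(existing, rates, [-0.2, -0.1, -0.05, -0.01])
```

With a positive α₂ the fitted distance falls as the error rate rises, which matches the requirement in claim 1 that distances vary inversely with error rates. A production fit would likely use a continuous optimizer for α₃ instead of a fixed candidate list.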

12. A phonetic distance measurement system comprising:
- a reference file;
  
  a recognized speech file generated by processing an audio file of a speaker reading contents of the reference file;
  
  a comparison module configured to determine, for each of a plurality of phonemes occurring in the reference file, a plurality of phoneme error occurrences by comparing the recognized speech file and the reference file;
  
  an error rate module configured to determine, for each of the plurality of phonemes, a plurality of phoneme error rates corresponding to the plurality of phoneme error occurrences; and
  
  a measurement module configured to generate a plurality of phonetic distances between each of the plurality of phonemes as a function of the plurality of phoneme error rates, the plurality of phonetic distances being inversely proportional to the plurality of phoneme error rates;
  
  wherein the measurement module is further configured to normalize the phonetic distances to an existing matrix of phonetic distances not generated based on the recognized speech file.
  - 13. The system of claim 12, further comprising a dictionary, wherein the comparison module is further configured to access the dictionary to identify phonemes in the reference file and recognized speech file prior to determining the plurality of phoneme error occurrences.
  - 14. The system of claim 12, wherein the plurality of phoneme error occurrences the comparison module is configured to determine include phoneme substitution error occurrences, phoneme insertion error occurrences and phoneme deletion error occurrences.
  - 15. The system of claim 14, wherein the comparison module is further configured to identify the plurality of phoneme substitution error occurrences by corresponding pairs of phonemes and the phoneme insertion and deletion error occurrences by individual corresponding phonemes.
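The module arrangement in claims 12 to 15 can be sketched as three small functions. This is an illustration under stated assumptions, not the patented implementation: the pronunciation dictionary and its entries are hypothetical, the comparison uses Python's standard `difflib.SequenceMatcher` as a stand-in for whatever alignment the real system uses, and the measurement step takes already-fitted coefficients as plain arguments with hypothetical defaults.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical pronunciation dictionary mapping words to phoneme symbols.
DICTIONARY = {
    "the": ["dh", "ah"],
    "cat": ["k", "ae", "t"],
    "cap": ["k", "ae", "p"],
    "sat": ["s", "ae", "t"],
}


def to_phonemes(text, dictionary):
    """Look up each word in the dictionary (raises KeyError for unknown words)."""
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(dictionary[word])
    return phonemes


def comparison_module(reference_text, recognized_text, dictionary):
    """Identify phonemes via the dictionary, then tally the three error types.

    Substitutions are keyed by phoneme pairs; insertions and deletions are
    keyed by individual phonemes.
    """
    ref = to_phonemes(reference_text, dictionary)
    hyp = to_phonemes(recognized_text, dictionary)
    subs, ins, dels = Counter(), Counter(), Counter()
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            # Pair phonemes positionally; any surplus counts as deletion or insertion.
            paired = min(i2 - i1, j2 - j1)
            for k in range(paired):
                subs[(ref[i1 + k], hyp[j1 + k])] += 1
            dels.update(ref[i1 + paired:i2])
            ins.update(hyp[j1 + paired:j2])
        elif tag == "delete":
            dels.update(ref[i1:i2])
        elif tag == "insert":
            ins.update(hyp[j1:j2])
    return len(ref), subs, ins, dels


def error_rate_module(counts, total_reference_phonemes):
    """Convert error occurrences into error rates."""
    return {key: n / total_reference_phonemes for key, n in counts.items()}


def measurement_module(rates, a1=0.0, a2=1.0, a3=0.0):
    """Map error rates to distances; the default coefficients are placeholders."""
    return {key: a1 + a2 / (rate - a3) for key, rate in rates.items()}


total, subs, ins, dels = comparison_module("the cat sat", "the cap sat", DICTIONARY)
sub_rates = error_rate_module(subs, total)
distances = measurement_module(sub_rates)
```

Keeping the three stages as separate functions follows the claim structure, so each stage can be swapped independently, for example replacing the placeholder coefficients with values normalized against an existing distance matrix.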

Current Assignee: Adacel Systems, Inc. (Adacel Technologies Ltd.)
Original Assignee: Adacel Systems, Inc. (Adacel Technologies Ltd.)
Inventors: Shu, Chang-Qing
Primary Examiner(s): BAKER, MATTHEW H
Application Number: US12/491,769
Publication Number: US 20100332230A1
Time in Patent Office: 2,889 Days
Field of Search: 704/231-257
CPC Class Codes

G06F 40/56   Natural language generation

G10L 15/01   Assessment or evaluation of...

G10L 15/063   Training

G10L 15/142   Hidden Markov Models [HMMs]

G10L 15/144   Training of HMMs

G10L 15/187   Phonemic context, e.g. pron...

G10L 15/22   Procedures used during a sp...

G10L 2015/025   Phonemes, fenemes or fenone...

G10L 2015/221   Announcement of recognition...
