Duration ratio modeling for improved speech recognition

  • US 9,542,939 B1
  • Filed: 08/31/2012
  • Issued: 01/10/2017
  • Est. Priority Date: 08/31/2012
  • Status: Active Grant
First Claim

1. A method of recognizing speech in audio data, the method comprising:

    receiving audio data representing speech, wherein the receiving is performed by an automated speech recognition (ASR) device configured to convert the audio data to text data, the ASR device comprising an ASR module;

    transforming, by the ASR module, the audio data into one or more feature vectors representing the speech;

    identifying, by the ASR module and using a portion of the one or more feature vectors, a sequence of phonemes represented in a portion of the audio data;

    determining, by the ASR module, a first duration of the sequence of phonemes;

    determining, by the ASR module, a second duration of a single phoneme within the sequence of phonemes;

    determining, by the ASR module, a duration score of the single phoneme, wherein the duration score is determined using the second duration in relation to the first duration;

    determining, by the ASR module, a recognition score based at least in part on the duration score;

    determining, by the ASR module, a speech recognition result based at least in part upon the recognition score, wherein the speech recognition result is the text data corresponding to the speech; and

    causing a command to be executed using the text data.
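
The duration steps of the claim can be illustrated with a minimal sketch. The claim only states that the duration score uses the single phoneme's duration "in relation to" the duration of the phoneme sequence; it does not say how. The sketch below assumes a plain ratio of the two durations, scored with a Gaussian log-likelihood against an expected ratio, and a weighted sum with an acoustic score. The function names, the Gaussian model, `expected_ratio`, `std_dev`, and `weight` are all hypothetical and are not taken from the patent.

```python
import math


def duration_ratio(phoneme_duration, sequence_duration):
    """Ratio of one phoneme's duration to the duration of its whole sequence."""
    if sequence_duration <= 0:
        raise ValueError("sequence_duration must be positive")
    return phoneme_duration / sequence_duration


def duration_score(ratio, expected_ratio, std_dev):
    """Log-likelihood of the observed ratio under a Gaussian.

    Assumption: the patent claim does not specify a scoring model; a Gaussian
    centered on a hypothetical expected ratio is used purely for illustration.
    """
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    z = (ratio - expected_ratio) / std_dev
    return -0.5 * z * z - math.log(std_dev * math.sqrt(2.0 * math.pi))


def recognition_score(acoustic_score, dur_score, weight=1.0):
    """Combine an acoustic log-score with the duration score.

    Assumption: a weighted sum of log-scores; the claim only says the
    recognition score is based "at least in part" on the duration score.
    """
    return acoustic_score + weight * dur_score


# Usage: a phoneme lasting 0.10 s inside a 0.50 s phoneme sequence.
ratio = duration_ratio(0.10, 0.50)
score = duration_score(ratio, expected_ratio=0.25, std_dev=0.05)
total = recognition_score(acoustic_score=-12.0, dur_score=score, weight=0.5)
print(ratio, score, total)
```

Because the ratio is relative to the surrounding sequence, a uniformly faster or slower speaker leaves it unchanged, whereas an absolute duration in seconds would shift with speaking rate.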
