Direction-based speech endpointing

  • US 10,134,425 B1
  • Filed: 06/29/2015
  • Issued: 11/20/2018
  • Est. Priority Date: 06/29/2015
  • Status: Active Grant
First Claim

1. A computer-implemented method for determining an utterance endpoint during automatic speech recognition (ASR) processing, the method comprising:

  • receiving audio comprising speech;

  • determining audio data based on the audio;

  • determining a source direction corresponding to the audio data;

  • determining a duration associated with the audio data, wherein the duration indicates how long the audio has been continuously received from the source direction;

  • performing ASR processing on the audio data to determine:

    a plurality of hypotheses, wherein each hypothesis of the plurality of hypotheses includes at least one word or a representation of at least one word potentially corresponding to the audio data, and

    for each of the plurality of hypotheses, a respective probability that the respective hypothesis corresponds to an utterance represented in the audio data;

  • determining, for each of the plurality of hypotheses, a representation of a respective number of audio frames corresponding to non-speech immediately preceding a first point;

  • calculating, for each of the plurality of hypotheses, a respective weighted pause duration by multiplying the respective probability of a respective hypothesis by the respective number of audio frames of the respective hypothesis;

  • calculating a cumulative expected pause duration by summing the respective weighted pause durations for each of the plurality of hypotheses;

  • calculating an adjusted cumulative score using the cumulative expected pause duration; and

  • designating the first point as corresponding to a likely endpoint as a result of the adjusted cumulative score exceeding a first threshold.
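
The claim describes computing an expected pause length across ASR hypotheses and using it to adjust an endpointing score. The sketch below is a hypothetical, illustrative rendering of that arithmetic in Python, not the patented implementation: the Hypothesis fields, the additive combination with a base_score, and the pause_weight scaling are assumptions, since the claim only states that the adjusted cumulative score is calculated "using" the cumulative expected pause duration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    text: str                 # partial transcription (at least one word, or a representation of one)
    probability: float        # probability this hypothesis corresponds to the utterance
    trailing_nonspeech: int   # audio frames of non-speech immediately preceding the first point


def cumulative_expected_pause(hypotheses: List[Hypothesis]) -> float:
    # Weighted pause duration per hypothesis: probability * non-speech frame count.
    # The cumulative expected pause duration is the sum over all hypotheses.
    return sum(h.probability * h.trailing_nonspeech for h in hypotheses)


def is_likely_endpoint(hypotheses: List[Hypothesis],
                       base_score: float,
                       pause_weight: float,
                       threshold: float) -> bool:
    # Assumed combination rule: add the scaled expected pause duration to a base score
    # and declare a likely endpoint when the adjusted score exceeds the threshold.
    adjusted = base_score + pause_weight * cumulative_expected_pause(hypotheses)
    return adjusted > threshold


if __name__ == "__main__":
    hyps = [
        Hypothesis("play music", 0.6, 40),         # silence follows a complete-sounding phrase
        Hypothesis("play music please", 0.3, 12),  # hypothesis still appears mid-utterance
        Hypothesis("hey music", 0.1, 40),
    ]
    # Expected pause = 0.6*40 + 0.3*12 + 0.1*40 = 31.6 frames
    print(cumulative_expected_pause(hyps))                     # 31.6
    print(is_likely_endpoint(hyps, base_score=0.0,
                             pause_weight=1.0, threshold=30))  # True
```

Weighting each hypothesis's trailing non-speech frames by its probability means the score rises quickly when the most likely hypotheses treat the current silence as a trailing pause, while silence that the recognizer attributes to a probably still-incomplete utterance contributes comparatively little.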
