
Text-to-speech processing with emphasized output audio

  • US 10,319,365 B1
  • Filed: 06/27/2016
  • Issued: 06/11/2019
  • Est. Priority Date: 06/27/2016
  • Status: Active Grant
First Claim

1. A computer implemented method comprising:

    receiving, from a first speech-controlled device, first input audio data corresponding to a command to receive audio data;

    performing automatic speech recognition on the audio data to generate first text;

    determining a duration corresponding to how long at least one word is pronounced in the first input audio data;

    determining, based on the duration, a first portion of the audio data corresponding to a first word of the first text has a volume greater than a second portion of the audio data corresponding to other words in the first text;

    associating a first speech synthesis markup language (SSML) tag with the first word, the SSML tag indicating the first word is to be emphasized;

    performing text-to-speech (TTS) processing on the first text, using the first SSML tag, to create output audio data, the output audio data including emphasized speech corresponding to the first word; and

    sending, to a second speech-controlled device, the output speech audio data.
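
The steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that speech recognition has already produced per-word start and end times, that the audio is a plain list of float samples, and that "volume" means RMS amplitude over each word's duration. The names `WordTiming`, `word_rms`, `pick_emphasized`, and `to_ssml`, and the threshold `ratio=1.5`, are hypothetical choices made for illustration only. The `<emphasis>` element is taken from the W3C SSML specification.

```python
import math
from dataclasses import dataclass
from typing import List, Optional
from xml.sax.saxutils import escape


@dataclass
class WordTiming:
    """One recognized word and the time span over which it was pronounced."""
    text: str
    start_s: float
    end_s: float


def word_rms(samples: List[float], rate: int, word: WordTiming) -> float:
    """RMS amplitude of the samples that fall inside the word's duration."""
    span = samples[int(word.start_s * rate):int(word.end_s * rate)]
    if not span:
        return 0.0
    return math.sqrt(sum(x * x for x in span) / len(span))


def pick_emphasized(samples: List[float], rate: int,
                    words: List[WordTiming],
                    ratio: float = 1.5) -> Optional[int]:
    """Index of the loudest word if it exceeds the mean volume of the
    other words by `ratio` (an assumed threshold), otherwise None."""
    if len(words) < 2:
        return None
    volumes = [word_rms(samples, rate, w) for w in words]
    best = max(range(len(words)), key=lambda i: volumes[i])
    others = [v for i, v in enumerate(volumes) if i != best]
    baseline = sum(others) / len(others)
    return best if volumes[best] > ratio * baseline else None


def to_ssml(words: List[WordTiming], emphasized: Optional[int]) -> str:
    """Wrap the emphasized word, if any, in an SSML <emphasis> tag."""
    parts = []
    for i, w in enumerate(words):
        token = escape(w.text)
        if i == emphasized:
            token = '<emphasis level="strong">' + token + '</emphasis>'
        parts.append(token)
    return "<speak>" + " ".join(parts) + "</speak>"
```

The resulting SSML string would then be passed to a TTS engine, and the synthesized audio sent to the second device. Those two steps depend on the particular engine and transport, so they are left out of the sketch.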
