Adjustable TTS devices

US 9,704,476 B1
Filed: 06/27/2013
Issued: 07/11/2017
Est. Priority Date: 06/27/2013
Status: Active Grant

Abstract

In a distributed text-to-speech (TTS) system, a remote TTS device, such as a TTS server, may experience increased loads of TTS requests, which may result in delayed processing of TTS requests. To avoid such delays, upon indication or prediction of an increased load, a TTS server may adjust unit selection TTS processing by altering unit selection techniques to speed processing, at the expense of potential result quality. Such techniques may include use of a reduced size unit database, a narrow Viterbi beam search, and/or a reduced size candidate unit graph.
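
As an illustration of the trade-off the abstract describes, the following minimal Python sketch runs a beam-pruned, Viterbi-style search over a toy candidate unit graph and counts join-cost evaluations. The function `beam_search_units`, the unit names, and the cost tables are hypothetical and are not taken from the patent; they only show how a narrower beam does less work and can settle on a costlier unit sequence.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def beam_search_units(
    candidates: Sequence[Sequence[str]],
    target_cost: Callable[[int, str], float],
    join_cost: Callable[[str, str], float],
    beam_width: int,
) -> Tuple[List[str], float, int]:
    """Return (best path, its total cost, number of join evaluations)."""
    # Each hypothesis is (accumulated cost, path so far).
    beam = [(target_cost(0, unit), [unit]) for unit in candidates[0]]
    beam = sorted(beam, key=lambda h: h[0])[:beam_width]
    evaluations = 0
    for pos in range(1, len(candidates)):
        expanded = []
        for cost, path in beam:
            for unit in candidates[pos]:
                evaluations += 1
                step = join_cost(path[-1], unit) + target_cost(pos, unit)
                expanded.append((cost + step, path + [unit]))
        # Pruning: keep only the cheapest `beam_width` hypotheses.
        beam = sorted(expanded, key=lambda h: h[0])[:beam_width]
    best_cost, best_path = beam[0]
    return best_path, best_cost, evaluations


# Toy data (hypothetical): two candidate units per target position.
CANDIDATES = [["a1", "a2"], ["b1", "b2"], ["c1", "c2"]]
TARGET: Dict[str, int] = {"a1": 1, "a2": 2, "b1": 1, "b2": 1, "c1": 1, "c2": 1}
JOIN: Dict[Tuple[str, str], int] = {("a1", "b1"): 5, ("a1", "b2"): 5, ("a2", "b1"): 0}


def target_cost(pos: int, unit: str) -> float:
    return TARGET[unit]


def join_cost(prev: str, unit: str) -> float:
    # Joins not listed in the table get a default cost of 1.
    return JOIN.get((prev, unit), 1)


narrow = beam_search_units(CANDIDATES, target_cost, join_cost, beam_width=1)
wide = beam_search_units(CANDIDATES, target_cost, join_cost, beam_width=4)
print(narrow)
print(wide)
```

With these toy costs, the beam of width 1 evaluates 4 joins and returns a path of cost 9, while the beam of width 4 evaluates 12 joins and finds a path of cost 5.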

20 Claims

1. A computing device, comprising:
- at least one processor;
  
  memory including instructions that, when executed, configure the at least one processor:
  
  to determine a load of a server processing TTS requests;
  
  to receive text data for TTS processing;
  
  to estimate a time of completion for the TTS processing of the text data based at least in part on the determined load;
  
  to determine that the time of completion is greater than a threshold time;
  
  to adjust at least one TTS processing parameter from a first value to a second value based at least in part on the time of completion, wherein the at least one TTS parameter includes a unit database size, a Viterbi beam width, a candidate unit graph size, or an audio sampling rate;
  
  to synthesize speech based on the text data using the second value; and
  
  to transmit audio data comprising the synthesized speech for playback to a user.
- Dependent Claims (2, 3, 4)
  - 2. The computing device of claim 1, wherein the at least one processor is further configured to determine the second value based at least in part on the load.
  - 3. The computing device of claim 1, wherein the at least one processor is further configured to adjust the at least one TTS processing parameter by selecting the unit database size from a plurality of pre-determined unit database sizes.
  - 4. The computing device of claim 1, wherein the at least one processor is further configured:
    - to receive second text data for TTS processing;
      
      to synthesize a first portion of the second text data using the first value; and
      
      to synthesize a second portion of the second text data using the second value.
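
The flow recited in claims 1 to 3 (estimate a completion time from the server load, compare it with a threshold, and move a processing parameter from a first value to a second value) can be sketched roughly as follows. This is only an illustration under stated assumptions: the linear throughput model, the `TtsParams` fields, the `PRESET_DB_SIZES` tiers, and every numeric constant are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TtsParams:
    unit_db_size: int    # number of units available for selection
    beam_width: int      # Viterbi beam width
    sample_rate_hz: int  # output audio sampling rate


# Hypothetical pre-determined unit database sizes, largest first.
PRESET_DB_SIZES = (200_000, 100_000, 50_000)


def estimate_completion_s(text: str, load: float,
                          base_chars_per_s: float = 400.0) -> float:
    """Assumed model: throughput shrinks linearly as load (0..1) rises."""
    effective_rate = base_chars_per_s * max(1.0 - load, 0.05)
    return len(text) / effective_rate


def adjust_params(params: TtsParams, text: str, load: float,
                  threshold_s: float) -> TtsParams:
    """Return params unchanged, or with a smaller unit database if too slow."""
    eta = estimate_completion_s(text, load)
    if eta <= threshold_s:
        return params  # keep the first value
    # The further the estimate overshoots the threshold, the smaller the tier.
    overshoot = eta / threshold_s
    tier = 1 if overshoot < 2.0 else 2
    return replace(params, unit_db_size=PRESET_DB_SIZES[tier])


default = TtsParams(unit_db_size=PRESET_DB_SIZES[0], beam_width=50,
                    sample_rate_hz=22_050)
text = "x" * 400
light = adjust_params(default, text, load=0.5, threshold_s=2.0)
busy = adjust_params(default, text, load=0.625, threshold_s=2.0)
heavy = adjust_params(default, text, load=0.75, threshold_s=2.0)
print(light.unit_db_size, busy.unit_db_size, heavy.unit_db_size)
```

Selecting from a fixed tuple of sizes mirrors claim 3's "plurality of pre-determined unit database sizes"; a real system could equally adjust the beam width or the sampling rate in the same place.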

5. A method comprising:
- receiving, by a server, a text-to-speech (TTS) processing request from a local device;
  
  determining, by the server, a number of pending TTS processing requests of a TTS processing device of the server;
  
  estimating a time of completion for the TTS processing request based on the number of pending TTS processing requests;
  
  determining the time of completion is greater than a threshold time;
  
  setting, by the server, a TTS processing parameter to a first value based at least in part on the time of completion being greater than the threshold time, the TTS processing parameter adjusting TTS quality output of the TTS processing device;
  
  processing, by the TTS processing device, the TTS processing request using the first value; and
  
  transmitting, by the server, results of the processing to the local device.
- Dependent Claims (6, 7, 8, 9, 10, 11, 12)
  - 6. The method of claim 5, wherein the first value comprises one or more of a unit database size, a Viterbi beam width, a candidate unit graph size, or an audio sampling rate.
  - 7. The method of claim 6, further comprising selecting the unit database size from a plurality of pre-determined unit database sizes.
  - 8. The method of claim 5, further comprising:
    - comparing the number of pending TTS requests to a threshold; and
      
      setting the TTS processing parameter to the first value based at least in part on the comparing.
  - 9. The method of claim 5, further comprising:
    - receiving a second TTS processing request;
      
      synthesizing a first portion of the second TTS processing request using a second value for the TTS processing parameter; and
      
      synthesizing a second portion of the second TTS processing request using the first value.
  - 10. The method of claim 5, further comprising:
    - receiving a second TTS processing request;
      
      synthesizing a first portion of the second TTS processing request using a second value for the TTS processing parameter;
      
      restarting synthesis of the second TTS processing request; and
      
      synthesizing the second TTS processing request using the first value.
  - 11. The method of claim 5, further comprising predicting a future number of TTS processing requests of the TTS processing device, and wherein setting the TTS processing parameter to the first value is further based at least in part on the future number of TTS processing requests.
  - 12. The method of claim 5, further comprising instructing a second local device to perform TTS processing on a second TTS processing request based at least in part on the number of pending TTS processing requests.
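
Claims 5, 8, 11, and 12 turn on the number of pending requests, a predicted future number of requests, and the option of handing work to a local device. A minimal sketch of such a decision is shown below. The queueing estimate, the two-sample linear predictor, the `offload_pending` cutoff, and all constants are assumptions made for illustration, and the sketch simplifies claim 12 by applying the offload decision to the request being planned.

```python
from collections import deque
from typing import Deque


def estimate_completion_s(pending: float, avg_service_s: float,
                          workers: int) -> float:
    """Assumed estimate: requests ahead plus this one, spread over workers."""
    return (pending + 1) * avg_service_s / workers


def predict_pending(history: Deque[int]) -> float:
    """Assumed predictor: linear extrapolation from the last two samples."""
    if not history:
        return 0.0
    if len(history) < 2:
        return float(history[-1])
    return max(history[-1] + (history[-1] - history[-2]), 0.0)


def plan_request(pending: int, history: Deque[int], threshold_s: float,
                 avg_service_s: float = 0.5, workers: int = 4,
                 offload_pending: int = 40) -> str:
    """Return "offload", "reduced", or "full" for an incoming request."""
    if pending >= offload_pending:
        return "offload"  # ask a local device to do the TTS processing
    expected = max(pending, predict_pending(history))
    eta = estimate_completion_s(expected, avg_service_s, workers)
    return "reduced" if eta > threshold_s else "full"


print(plan_request(3, deque([2, 3]), threshold_s=2.0))
print(plan_request(10, deque([4, 10]), threshold_s=2.0))
print(plan_request(50, deque([45, 50]), threshold_s=2.0))
```

In the second call the current queue alone would finish within the threshold, but the extrapolated queue length pushes the estimate over it, which is the situation claim 11 addresses.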

13. A computing system, comprising:
- at least one processor;
  
  memory including instructions that, when executed, configure the at least one processor to:
  
  receive, by a server, a text-to-speech (TTS) processing request from a local device;
  
  determine, by the server, a number of pending TTS processing requests of a TTS processing device of the server;
  
  estimate a time of completion for the TTS processing request based on the number of pending TTS processing requests;
  
  determine the time of completion is greater than a threshold time;
  
  set, by the server, a TTS processing parameter to a first value based at least in part on the time of completion being greater than the threshold time, the TTS processing parameter adjusting TTS quality output of the TTS processing device;
  
  process, by the TTS processing device, the TTS processing request using the first value; and
  
  transmit, by the server, results of the processing to the local device.
- Dependent Claims (14, 15, 16, 17, 18, 19, 20)
  - 14. The computing system of claim 13, wherein the first value comprises one or more of a unit database size, a Viterbi beam width, a candidate unit graph size, or an audio sampling rate.
  - 15. The computing system of claim 14, wherein the instructions further configure the at least one processor to select the unit database size from a plurality of pre-determined unit database sizes.
  - 16. The computing system of claim 13, wherein the instructions further configure the at least one processor to:
    - compare the number of pending TTS requests to a threshold; and
      
      set the TTS processing parameter to the first value based at least in part on the comparing.
  - 17. The computing system of claim 13, wherein the instructions further configure the at least one processor to:
    - receive a second TTS processing request;
      
      synthesize a first portion of the second TTS processing request using a second value for the TTS processing parameter; and
      
      synthesize a second portion of the second TTS processing request using the first value.
  - 18. The computing system of claim 13, wherein the instructions further configure the at least one processor to:
    - receive a second TTS processing request;
      
      synthesize a first portion of the second TTS processing request using a second value for the TTS processing parameter;
      
      restart synthesis of the second TTS processing request; and
      
      synthesize the second TTS processing request using the first value.
  - 19. The computing system of claim 13, wherein the instructions further configure the at least one processor to:
    - predict a future number of TTS processing requests of the TTS processing device, wherein the instructions configuring the at least one processor to set the TTS processing parameter to the first value further include instructions to set the TTS processing parameter to the first value based at least in part on the future number of TTS processing requests.
  - 20. The computing system of claim 13, wherein the instructions further configure the at least one processor to instruct a second local device to perform TTS processing on a second TTS processing request based at least in part on the number of pending TTS processing requests.
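
Claims 17 and 18 (like claims 9 and 10) describe two ways to react when the parameter value changes while a request is already being synthesized: finish the remaining portion with the new value, or restart and synthesize the whole request with it. The sketch below contrasts the two. The function name, the `"full"` and `"reduced"` labels, and the `is_overloaded` callback are hypothetical stand-ins, and no audio is actually produced.

```python
from typing import Callable, List, Tuple


def synthesize_in_portions(
    portions: List[str],
    is_overloaded: Callable[[int], bool],
    restart_on_switch: bool = False,
) -> List[Tuple[str, str]]:
    """Return (portion, quality) pairs; quality is "full" or "reduced"."""
    results: List[Tuple[str, str]] = []
    quality = "full"
    for index, portion in enumerate(portions):
        # Check the (assumed) load signal before each portion.
        if quality == "full" and is_overloaded(index):
            quality = "reduced"
            if restart_on_switch:
                # Discard earlier output and redo the whole request.
                return [(p, "reduced") for p in portions]
        results.append((portion, quality))
    return results


portions = ["Hello.", "How are you?", "Goodbye."]
overloaded_after_first = lambda index: index >= 1

mixed = synthesize_in_portions(portions, overloaded_after_first)
restarted = synthesize_in_portions(portions, overloaded_after_first,
                                   restart_on_switch=True)
print(mixed)
print(restarted)
```

Switching mid-request keeps the already synthesized audio but mixes two quality levels in one utterance; restarting keeps the output uniform at the cost of repeating work.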

Current Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Original Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Inventors: Swietlinski, Krzysztof Franciszek; Kaszczuk, Michal Tadeusz
Primary Examiner(s): Jackson, Jakieda
Application Number: US13/929,104
Time in Patent Office: 1,475 Days
Field of Search: 704260
US Class Current
CPC Class Codes: G10L 13/04 Details of speech synthesis...