Adjustable TTS devices
First Claim
1. A computing device, comprising:
at least one processor;
memory including instructions that, when executed, configure the at least one processor:
to determine a load of a server processing TTS requests;
to receive text data for TTS processing;
to estimate a time of completion for the TTS processing of the text data based at least in part on the determined load;
to determine that the time of completion is greater than a threshold time;
to adjust at least one TTS processing parameter from a first value to a second value based at least in part on the time of completion, wherein the at least one TTS parameter includes a unit database size, a Viterbi beam width, a candidate unit graph size, or an audio sampling rate;
to synthesize speech based on the text data using the second value; and
to transmit audio data comprising the synthesized speech for playback to a user.
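The sequence of steps in this claim (determine load, estimate completion time, compare against a threshold, adjust a parameter from a first value to a second value) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `TTSParams` fields and defaults, the load-based estimate formula, and the proportional scaling rule do not come from the patent, which does not prescribe a particular formula.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TTSParams:
    """Hypothetical parameter set; names and defaults are illustrative only."""
    unit_db_size: int = 100_000   # units available for selection
    beam_width: int = 50          # Viterbi beam width
    graph_size: int = 200         # nodes kept in the candidate unit graph
    sample_rate_hz: int = 22_050  # output audio sampling rate


def estimate_completion_s(load: float, n_chars: int,
                          s_per_char: float = 0.002) -> float:
    """Assumed model: idle cost per character, inflated as load approaches 1."""
    return n_chars * s_per_char / max(1e-6, 1.0 - load)


def adjust_params(params: TTSParams, estimate_s: float,
                  threshold_s: float) -> TTSParams:
    """Shrink search parameters when the estimate exceeds the threshold."""
    if estimate_s <= threshold_s:
        return params  # the first values are kept unchanged
    scale = threshold_s / estimate_s  # lies in (0, 1) for a positive threshold
    return replace(
        params,
        beam_width=max(1, round(params.beam_width * scale)),
        unit_db_size=max(1_000, round(params.unit_db_size * scale)),
    )


estimate = estimate_completion_s(load=0.5, n_chars=1000)
adjusted = adjust_params(TTSParams(), estimate, threshold_s=2.0)
print(estimate, adjusted.beam_width, adjusted.unit_db_size)
```

In this sketch only the beam width and unit database size are scaled; the candidate graph size or sampling rate could be adjusted the same way.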
Abstract
In a distributed text-to-speech (TTS) system, a remote TTS device, such as a TTS server, may experience increased loads of TTS requests, which may result in delayed processing of TTS requests. To avoid such delays, upon indication or prediction of an increased load, a TTS server may adjust unit selection TTS processing by altering unit selection techniques to speed processing, at the expense of potential result quality. Such techniques may include use of a reduced size unit database, a narrow Viterbi beam search, and/or a reduced size candidate unit graph.
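To make the speed-versus-quality trade-off concrete, the following toy Python sketch runs a beam-pruned, Viterbi-style search over candidate units. It is an illustration under stated assumptions, not the patented implementation: units are plain integers, and both the target cost and the join cost are simply absolute differences, chosen here only so the numbers are easy to follow.

```python
from typing import List, Sequence, Tuple


def beam_search(candidates: Sequence[Sequence[int]],
                targets: Sequence[int],
                beam_width: int) -> Tuple[int, List[int]]:
    """Return (total cost, unit path) for the cheapest sequence that survives pruning."""
    # Each hypothesis is (cumulative cost, units chosen so far).
    beam = [(abs(u - targets[0]), [u]) for u in candidates[0]]
    beam = sorted(beam, key=lambda h: h[0])[:beam_width]
    for t in range(1, len(candidates)):
        extended = []
        for u in candidates[t]:
            # Cheapest surviving predecessor for unit u (join cost = |prev - u|).
            cost, path = min(
                ((c + abs(p[-1] - u), p) for c, p in beam),
                key=lambda h: h[0],
            )
            # Add the target cost of u at position t.
            extended.append((cost + abs(u - targets[t]), path + [u]))
        # Pruning step: keep only the beam_width cheapest hypotheses.
        beam = sorted(extended, key=lambda h: h[0])[:beam_width]
    return beam[0]


candidates = [[0, 4], [6], [7, 9]]
targets = [1, 6, 8]
narrow = beam_search(candidates, targets, beam_width=1)
wide = beam_search(candidates, targets, beam_width=2)
print(narrow, wide)
```

With a beam width of 1 the search discards unit 4 at the first position and ends on a costlier path than the width-2 search finds, which mirrors the abstract's point that a narrower beam does less work at the expense of potential result quality.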
20 Claims
1. A computing device, comprising:
at least one processor;
memory including instructions that, when executed, configure the at least one processor:
to determine a load of a server processing TTS requests;
to receive text data for TTS processing;
to estimate a time of completion for the TTS processing of the text data based at least in part on the determined load;
to determine that the time of completion is greater than a threshold time;
to adjust at least one TTS processing parameter from a first value to a second value based at least in part on the time of completion, wherein the at least one TTS parameter includes a unit database size, a Viterbi beam width, a candidate unit graph size, or an audio sampling rate;
to synthesize speech based on the text data using the second value; and
to transmit audio data comprising the synthesized speech for playback to a user.
Dependent claims: 2, 3, 4
5. A method comprising:
receiving, by a server, a text-to-speech (TTS) processing request from a local device;
determining, by the server, a number of pending TTS processing requests of a TTS processing device of the server;
estimating a time of completion for the TTS processing request based on the number of pending TTS processing requests;
determining the time of completion is greater than a threshold time;
setting, by the server, a TTS processing parameter to a first value based at least in part on the time of completion being greater than the threshold time, the TTS processing parameter adjusting TTS quality output of the TTS processing device;
processing, by the TTS processing device, the TTS processing request using the first value; and
transmitting, by the server, results of the processing to the local device.
Dependent claims: 6, 7, 8, 9, 10, 11, 12
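The method of claim 5 estimates completion time from the number of pending requests. A minimal Python sketch of that idea follows. The `TTSServer` class, its queue model (each pending request costs a fixed average service time), and the two beam-width values are all hypothetical choices made for illustration; the claim itself does not specify how the estimate is computed or which parameter is set.

```python
from collections import deque


class TTSServer:
    """Hypothetical server that picks a quality parameter from queue depth."""

    def __init__(self, avg_service_s: float = 0.5, threshold_s: float = 2.0,
                 full_beam: int = 50, reduced_beam: int = 10) -> None:
        self.pending = deque()          # requests waiting for the TTS device
        self.avg_service_s = avg_service_s
        self.threshold_s = threshold_s
        self.full_beam = full_beam      # normal-quality setting
        self.reduced_beam = reduced_beam  # faster, lower-quality setting

    def estimate_completion_s(self) -> float:
        # Assumed model: the new request waits behind every pending one.
        return (len(self.pending) + 1) * self.avg_service_s

    def choose_beam_width(self) -> int:
        # Set the parameter only when the estimate exceeds the threshold.
        if self.estimate_completion_s() > self.threshold_s:
            return self.reduced_beam
        return self.full_beam

    def handle(self, text: str) -> dict:
        # Stand-in for synthesis: report which setting would be used.
        return {"text": text, "beam_width": self.choose_beam_width()}


server = TTSServer()
idle_result = server.handle("hello")
server.pending.extend(["req"] * 4)      # simulate four queued requests
busy_result = server.handle("hello")
print(idle_result, busy_result)
```

A fixed average service time is the simplest possible queue model; a real estimate would likely also account for request length and the number of workers.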
13. A computing system, comprising:
at least one processor;
memory including instructions that, when executed, configure the at least one processor to:
receive, by a server, a text-to-speech (TTS) processing request from a local device;
determine, by the server, a number of pending TTS processing requests of a TTS processing device of the server;
estimate a time of completion for the TTS processing request based on the number of pending TTS processing requests;
determine the time of completion is greater than a threshold time;
set, by the server, a TTS processing parameter to a first value based at least in part on the time of completion being greater than the threshold time, the TTS processing parameter adjusting TTS quality output of the TTS processing device;
process, by the TTS processing device, the TTS processing request using the first value; and
transmit, by the server, results of the processing to the local device.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
Specification