DISTRIBUTED SPEECH UNIT INVENTORY FOR TTS SYSTEMS
First Claim
1. A computing device for performing text-to-speech (TTS) processing, comprising:
at least one processor;
a memory device including instructions operable to be executed by the at least one processor to perform a set of actions, configuring the at least one processor;
to access a local database of speech units to be used in unit selection speech synthesis, wherein the local database is comprised from a larger database of speech units;
to receive text data for TTS processing;
to determine desired speech units to synthesize the received text data;
to identify first desired speech units in the local database;
to identify second desired speech units in the larger database located at a remote device;
to concatenate audio segments corresponding to the first desired speech units in the local database and audio segments corresponding to the second desired speech units; and
to output audio data comprising speech corresponding to the received text data.
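The device-side flow recited in claim 1 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the name `synthesize`, the string unit identifiers, the `fetch_remote` callback, and the use of raw `bytes` for audio segments are all hypothetical. It assumes the desired speech units have already been determined, and it joins segments by plain byte concatenation, with none of the cost-based selection or smoothing at segment joins that a practical unit selection system would apply.

```python
from typing import Callable, Dict, List

# Assumption: an audio segment is represented as raw bytes for illustration only.
AudioSegment = bytes


def synthesize(
    desired_units: List[str],
    local_db: Dict[str, AudioSegment],
    fetch_remote: Callable[[List[str]], Dict[str, AudioSegment]],
) -> AudioSegment:
    """Concatenate local and remote segments in the order of desired_units."""
    # "Second desired speech units": those not available in the local database.
    # dict.fromkeys removes repeats while keeping first-seen order.
    missing = list(dict.fromkeys(u for u in desired_units if u not in local_db))

    # One round trip to the (hypothetical) remote device for all missing units.
    remote_segments = fetch_remote(missing) if missing else {}

    output = bytearray()
    for unit in desired_units:
        if unit in local_db:
            output.extend(local_db[unit])         # first desired units (local)
        else:
            output.extend(remote_segments[unit])  # second desired units (remote)
    return bytes(output)
```

Batching all missing units into one request is a design choice of this sketch, made to limit network round trips; the claim itself does not specify how remote units are requested.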
Abstract
In a text-to-speech (TTS) system, a database including sample speech units for unit selection may be configured for use by a local device. The local unit database may be created from a more comprehensive unit database. The local unit database may include units which provide sufficient TTS results for frequently input text. Speech synthesis may then be performed by concatenating locally available units with units from a remote device including the comprehensive unit database. Aspects of the speech synthesis may be performed by the remote device and/or the local device.
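One way to picture how a local unit database might be derived from the comprehensive one is sketched below. The abstract says only that the local database may include units giving sufficient results for frequently input text; ranking units by how often they occur in a sample of such text is an assumption made for this example, as are the names `build_local_db`, `frequent_unit_sequences`, and `max_units`.

```python
from collections import Counter
from typing import Dict, Iterable, List


def build_local_db(
    full_db: Dict[str, bytes],
    frequent_unit_sequences: Iterable[List[str]],
    max_units: int,
) -> Dict[str, bytes]:
    """Pick the max_units most frequently used units from the full database."""
    counts: Counter = Counter()
    for sequence in frequent_unit_sequences:
        # Count only units that actually exist in the comprehensive database.
        counts.update(u for u in sequence if u in full_db)

    # Assumption: usage frequency is the selection criterion.
    selected = [unit for unit, _ in counts.most_common(max_units)]
    return {unit: full_db[unit] for unit in selected}
```

Here `max_units` stands in for whatever storage budget the local device has; a real system would more likely budget by bytes of audio than by unit count.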
25 Claims
1. A computing device for performing text-to-speech (TTS) processing, comprising:
at least one processor;
a memory device including instructions operable to be executed by the at least one processor to perform a set of actions, configuring the at least one processor;
to access a local database of speech units to be used in unit selection speech synthesis, wherein the local database is comprised from a larger database of speech units;
to receive text data for TTS processing;
to determine desired speech units to synthesize the received text data;
to identify first desired speech units in the local database;
to identify second desired speech units in the larger database located at a remote device;
to concatenate audio segments corresponding to the first desired speech units in the local database and audio segments corresponding to the second desired speech units; and
to output audio data comprising speech corresponding to the received text data.
Dependent claims: 2, 3, 4, 5
6. A method comprising:
receiving text data for text-to-speech processing;
receiving first audio segments corresponding to first desired speech units from a remote database;
receiving second audio segments corresponding to second desired speech units from a local database; and
creating audio corresponding to the received text data using the first audio segments and second audio segments.
Dependent claims: 7, 8, 9, 10, 11, 12, 13
14. A computing device, comprising:
at least one processor;
a memory device including instructions operable to be executed by the at least one processor to perform a set of actions, configuring the at least one processor;
to receive text data for text-to-speech processing;
to identify first desired speech units in a remote database for use in synthesizing the received text data;
to identify second desired speech units in a local database for use in synthesizing the received text data;
to send first audio segments corresponding to first desired speech units to a local device comprising the local database; and
to send instructions to a local device to concatenate the first audio segments with second audio segments corresponding to second desired speech units stored at the local device.
Dependent claims: 15, 16, 17, 18, 19
20. A non-transitory computer-readable storage medium storing processor-executable instructions for controlling a computing device, comprising:
program code to receive text data for text-to-speech processing;
program code to identify first desired speech units in a remote database for use in synthesizing the received text data;
program code to identify second desired speech units in a local database for use in synthesizing the received text data;
program code to send first audio segments corresponding to first desired speech units to a local device comprising the local database; and
program code to send instructions to a local device to concatenate the first audio segments with second audio segments corresponding to second desired speech units stored at the local device.
Dependent claims: 21, 22, 23, 24, 25
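The remote-side behavior recited in claims 14 and 20 can be sketched in the same spirit. Everything named here is hypothetical: `SynthesisResponse`, `plan_synthesis`, and the idea that the remote device holds a set of unit identifiers (`local_unit_ids`) describing what the local device stores. The claims do not say how the remote device learns the contents of the local database, and the "instructions to concatenate" are modeled simply as an ordered list of unit identifiers.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class SynthesisResponse:
    # Audio for units the local device lacks ("first audio segments").
    remote_segments: Dict[str, bytes]
    # Concatenation instructions, modeled as the unit order to follow.
    concat_order: List[str]


def plan_synthesis(
    desired_units: List[str],
    remote_db: Dict[str, bytes],
    local_unit_ids: Set[str],
) -> SynthesisResponse:
    """Split desired units into remote-supplied and locally stored ones."""
    # Send audio only for units the local device does not already hold.
    to_send = {
        unit: remote_db[unit]
        for unit in desired_units
        if unit not in local_unit_ids
    }
    return SynthesisResponse(
        remote_segments=to_send,
        concat_order=list(desired_units),
    )
```

The local device would then walk `concat_order`, taking each segment from `remote_segments` when present and from its own database otherwise.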
Specification