SINGLE INTERFACE FOR LOCAL AND REMOTE SPEECH SYNTHESIS
Abstract
Features are disclosed for providing a consistent interface for local and distributed text-to-speech (TTS) systems. Some portions of the TTS system, such as voices and TTS engine components, may be installed on a client device, and some may be present on a remote system accessible via a network link. Determinations can be made regarding which TTS system components to implement on the client device and which to implement on the remote server. The consistent interface facilitates connecting to or otherwise employing the TTS system through use of the same methods and techniques regardless of which TTS system configuration is implemented.
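The idea of a single interface over local and remote components can be illustrated with a minimal Python sketch. Every name here (SpeechBackend, LocalBackend, RemoteBackend, TTSClient, and the send callable standing in for a network round trip) is hypothetical and chosen for illustration; none is taken from the specification.

```python
from abc import ABC, abstractmethod
from typing import Callable


class SpeechBackend(ABC):
    """Hypothetical common contract for any TTS configuration."""

    @abstractmethod
    def synthesize(self, text: str) -> bytes:
        """Return audio data for the given text."""


class LocalBackend(SpeechBackend):
    """Stand-in for an engine and voice installed on the client device."""

    def synthesize(self, text: str) -> bytes:
        # Placeholder output; a real engine would return audio samples.
        return b"local:" + text.encode("utf-8")


class RemoteBackend(SpeechBackend):
    """Stand-in for an engine reached over a network link."""

    def __init__(self, send: Callable[[str], bytes]) -> None:
        # `send` is an assumed callable that performs the remote request.
        self._send = send

    def synthesize(self, text: str) -> bytes:
        return self._send(text)


class TTSClient:
    """Callers use the same method regardless of where synthesis happens."""

    def __init__(self, backend: SpeechBackend) -> None:
        self._backend = backend

    def speak(self, text: str) -> bytes:
        return self._backend.synthesize(text)
```

An application would call TTSClient.speak in the same way whether it was constructed with a LocalBackend or a RemoteBackend, which is the property the abstract describes.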
50 Claims
1-30. (canceled)
31. A system comprising:
a computer-readable memory storing executable instructions; and
one or more computer processors in communication with the computer-readable memory, wherein the one or more computer processors are programmed by the executable instructions to at least:
receive, from a remote storage location, voice recordings of subword units;
generate a text-to-speech presentation by concatenating two or more of the voice recordings, wherein individual voice recordings of the two or more voice recordings correspond to subword units for individual words in a text to be presented audibly;
determine system performance in generating the text-to-speech presentation;
determine, based at least partly on the system performance, that accessing the voice recordings at a local storage location will likely improve system performance in generating a subsequent text-to-speech presentation;
store at least the portion of the voice recordings in the local storage location;
access at least the portion of the voice recordings at the local storage location; and
generate the subsequent text-to-speech presentation using the portion of voice recordings accessed at the local storage location.
(Dependent claims: 32, 33, 34)
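The sequence recited in claim 31 (fetch subword recordings remotely, concatenate them, measure performance, then keep recordings locally when that is likely to help) might be sketched as follows. This is an illustrative sketch only: the class name, the fetch_remote callable, the injectable clock, and the latency threshold are all assumptions, and elapsed time is used as one possible performance measure among many.

```python
import time
from typing import Callable, Dict, List


class ConcatenativeSynthesizer:
    """Hypothetical sketch of performance-driven local storage of recordings."""

    def __init__(
        self,
        fetch_remote: Callable[[str], bytes],
        threshold_s: float = 0.05,  # assumed latency budget, not from the claims
        clock: Callable[[], float] = time.perf_counter,
    ) -> None:
        self._fetch_remote = fetch_remote
        self._threshold_s = threshold_s
        self._clock = clock
        self._local: Dict[str, bytes] = {}  # stands in for local storage

    def synthesize(self, units: List[str]) -> bytes:
        start = self._clock()
        fetched: Dict[str, bytes] = {}
        parts: List[bytes] = []
        for unit in units:
            if unit in self._local:
                parts.append(self._local[unit])
                continue
            if unit not in fetched:
                # Recording is only available at the remote location.
                fetched[unit] = self._fetch_remote(unit)
            parts.append(fetched[unit])
        elapsed = self._clock() - start

        # If remote access made this presentation slow, store the fetched
        # recordings locally so a subsequent presentation can use them.
        if fetched and elapsed > self._threshold_s:
            self._local.update(fetched)

        # Concatenate the subword recordings into one presentation.
        return b"".join(parts)
```

In this sketch the second presentation of the same text reads from local storage only if the first one exceeded the assumed threshold; a faster link would leave the recordings at the remote location.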
35. A computer-implemented method comprising:
as implemented by one or more computing devices configured to execute specific instructions,
accessing first voice data at a current storage location;
generating a plurality of text-to-speech presentations using the first voice data accessed at the current storage location;
generating usage data regarding generation of the plurality of text-to-speech presentations;
determining a preferred storage location for the first voice data based at least partly on the usage data, wherein the preferred storage location corresponds to one of a local storage location or a remote storage location, and wherein the preferred storage location is different than the current storage location;
accessing first voice data at the preferred storage location; and
generating a subsequent text-to-speech presentation using the first voice data accessed at the preferred storage location.
(Dependent claims: 36, 37, 38, 39, 40, 41, 42)
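One way to picture the usage-data step of claim 35 is a small decision function that maps collected statistics to a storage location. The sketch below is a guess at one plausible policy, not the claimed method: the UsageData fields, the min_uses and max_mean_latency_s thresholds, and the rule that rarely used voices go back to remote storage are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class UsageData:
    """Hypothetical usage statistics gathered over an observation window."""

    presentations: int = 0
    total_latency_s: float = 0.0

    def record(self, latency_s: float) -> None:
        # Called once per generated text-to-speech presentation.
        self.presentations += 1
        self.total_latency_s += latency_s

    @property
    def mean_latency_s(self) -> float:
        if self.presentations == 0:
            return 0.0
        return self.total_latency_s / self.presentations


def preferred_location(
    current: str,
    usage: UsageData,
    min_uses: int = 10,               # assumed threshold
    max_mean_latency_s: float = 0.2,  # assumed threshold
) -> str:
    """Return "local" or "remote" for the voice data."""
    heavily_used = usage.presentations >= min_uses
    if current == "remote" and heavily_used and usage.mean_latency_s > max_mean_latency_s:
        # Frequently used and slow over the network: prefer a local copy.
        return "local"
    if current == "local" and not heavily_used:
        # Rarely used: free local storage and rely on the remote copy.
        return "remote"
    # No move is warranted under this policy.
    return current
```

The claim covers the case where the returned location differs from the current one; the sketch also returns the current location when the usage data does not justify a move.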
43. A non-transitory computer storage medium which stores an executable code module that directs a client computing device to perform a process comprising:
accessing first voice data at a current storage location;
generating a plurality of text-to-speech presentations using the first voice data accessed at the current storage location;
generating usage data regarding generation of the plurality of text-to-speech presentations;
determining a preferred storage location for the first voice data based at least partly on the usage data, wherein the preferred storage location corresponds to one of a local storage location or a remote storage location, and wherein the preferred storage location is different than the current storage location;
accessing first voice data at the preferred storage location; and
generating a subsequent text-to-speech presentation using the first voice data accessed at the preferred storage location.
(Dependent claims: 44, 45, 46, 47, 48, 49, 50)
Specification