Voice persona service for embedding text-to-speech features into software programs
Abstract
Described is a voice persona service by which users convert text into speech waveforms, based on user-provided parameters and voice data from a service data store. The service may be remotely accessed, such as via the Internet. The user may provide text tagged with parameters, with the text sent to a text-to-speech engine along with base or custom voice data, and the resulting waveform morphed based on the tags. The user may also provide speech. Once created, a voice persona corresponding to the speech waveform may be persisted, exchanged, made public, shared and so forth. In one example, the voice persona service receives user input and parameters, and retrieves a base or custom voice that may be edited by the user via a morphing algorithm. The service outputs a waveform, such as a .wav file for embedding in a software program, and persists the voice persona corresponding to that waveform.
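As a rough illustration of the persona model the abstract describes (a base voice plus user-editable morphing parameters, persisted in a service data store), the following minimal sketch may help. Everything in it is an assumption made for illustration: the class names `VoicePersona` and `PersonaStore`, the `derive` method, and the parameter names `pitch_scale` and `rate` do not come from the patent, and an in-memory dictionary stands in for the service's data store.

```python
from dataclasses import dataclass


@dataclass
class VoicePersona:
    """Hypothetical persona record: a base voice plus morphing parameters."""
    persona_id: str
    base_voice: str
    morphing: dict  # parameter names here are illustrative only


class PersonaStore:
    """In-memory stand-in for the service's data store of voice personas."""

    def __init__(self):
        self._personas = {}

    def save(self, persona):
        self._personas[persona.persona_id] = persona

    def get(self, persona_id):
        return self._personas[persona_id]

    def derive(self, source_id, new_id, new_params):
        """Apply user-supplied parameters on top of an existing persona
        and persist the result as a new persona; the source is unchanged."""
        source = self.get(source_id)
        merged = {**source.morphing, **new_params}
        derived = VoicePersona(new_id, source.base_voice, merged)
        self.save(derived)
        return derived


store = PersonaStore()
store.save(VoicePersona("narrator", "base_voice_a", {"pitch_scale": 1.0, "rate": 1.0}))
robot = store.derive("narrator", "robot", {"pitch_scale": 0.8})
```

In this sketch, editing a persona never overwrites the one it started from, so a shared or public persona can be customized without affecting other users of it.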
18 Claims
- 1. In a computing environment, a system comprising, a service that includes a user interface accessible to clients via a network, a text-to-speech engine, and a data store of user-defined voice personas, a user-defined voice persona specifying one of a plurality of base voices and a plurality of voice morphing parameters associated with the base voice, the service configured to receive definitions of the voice personas from users and store the user-defined voice personas in the store of voice personas, where the users use the user interface to input new voice morphing parameters to modify the morphing parameters of the voice personas, the service configured to obtain via the network a user-provided text-to-speech input script comprised of portions of text comprised of respective voice persona identifiers, each voice persona identifier identifying one of the user-defined voice personas including a voice persona having the voice morphing parameters modified by the new voice morphing parameters inputted through the user interface, and the service converting the text-to-speech input script to a speech waveform via a text-to-speech engine based on the identified user-defined voice personas in the data store of voice personas, where portions of text in the text-to-speech script are converted to speech portions of the speech waveform using the user-defined voice personas identified by the voice persona identifiers, respectively.
- 7. A computer-readable storage medium having computer-executable instructions, which when executed perform steps, comprising:
  storing a plurality of voice personas in a data store, each voice persona comprising a base voice and voice morphing parameters, the voice personas accessible to clients from a voice persona service via a network;
  receiving at the voice persona service, via the network, user input identifying one of the stored voice personas and the user input comprising voice morphing parameters;
  retrieving the base voice and the voice morphing parameters of the voice persona identified by the user input;
  modifying the retrieved voice morphing parameters of the voice persona based on the received voice morphing parameters inputted by the user;
  saving the modified voice persona in the data store as a new voice persona; and
  receiving text from a user via the network at the voice persona service, retrieving the new voice persona and outputting a waveform corresponding to the voice persona by performing text-to-speech conversion and speech morphing using the modified morphing parameters.
  Dependent claims: 8, 9, 10, 11, 12.
- 13. A computer-implemented method for a network service allowing users to create and use voice personas in a text-to-speech system, the method comprising:
  maintaining a database of voice persona records, each voice persona record specifying an identifier of a voice persona, a base voice of the voice persona, and a plurality of voice morphing parameters of the voice persona;
  receiving from clients, via a network, specifications for voice persona records, the specifications comprising voice morphing parameters inputted by users, and in response modifying or creating voice persona records in the database that have the voice morphing parameters by modifying the voice persona records with the voice morphing parameters inputted by the users;
  receiving from clients, via the network, text-to-speech scripts, a text-to-speech script comprising portions of text and identifiers identifying voice personas that have the voice morphing parameters received from the clients, and in response;
  using the identifiers to retrieve corresponding voice persona records identified by the identifiers, for each retrieved voice persona record, given such a retrieved voice persona record, performing text-to-speech conversion on a corresponding portion of text in the text-to-speech script using the base voice specified by the given voice and morphing the base voice according to the voice morphing parameters specified by the given voice persona record, the conversions of the portions together producing an audio speech data unit comprised of portions of audio speech data of the text portions in voice according to the respective voice persona records.
  Dependent claims: 14, 15, 16, 17, 18.
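The script-driven conversion recited in the claims (portions of text, each tagged with a voice persona identifier, converted one by one and joined into a single audio unit) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `persona_id: text` line format, the `gain` parameter, and the functions `parse_script`, `fake_tts`, `morph`, and `render_script` are all hypothetical, and `fake_tts` is a placeholder that emits one dummy sample per character rather than real speech.

```python
from dataclasses import dataclass


@dataclass
class PersonaRecord:
    """Hypothetical record: identifier, base voice, morphing parameters."""
    persona_id: str
    base_voice: str
    morphing: dict


def parse_script(script):
    """Split a script into (persona_id, text) portions.
    Assumed format: one 'persona_id: text' entry per line."""
    portions = []
    for line in script.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        persona_id, _, text = line.partition(":")
        portions.append((persona_id.strip(), text.strip()))
    return portions


def fake_tts(text, base_voice):
    """Placeholder for a real engine: one dummy sample per character."""
    return [1.0] * len(text)


def morph(samples, morphing):
    """Toy morphing step: scale samples by an assumed 'gain' parameter."""
    gain = morphing.get("gain", 1.0)
    return [s * gain for s in samples]


def render_script(script, records):
    """Convert each portion with its own persona and concatenate the audio."""
    audio = []
    for persona_id, text in parse_script(script):
        record = records[persona_id]  # look up the record by its identifier
        samples = fake_tts(text, record.base_voice)
        audio.extend(morph(samples, record.morphing))
    return audio


records = {
    "narrator": PersonaRecord("narrator", "base_voice_a", {"gain": 1.0}),
    "robot": PersonaRecord("robot", "base_voice_b", {"gain": 0.5}),
}
script = "narrator: Hi\nrobot: Beep"
audio = render_script(script, records)
```

A real service would replace `fake_tts` and `morph` with an actual text-to-speech engine and morphing algorithm and would encode the result as a waveform file; the sketch only shows how identifiers in the script select a persona for each portion.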
Specification