Method and apparatus for performing text-to-speech conversion in a client/server environment
Abstract
A method and apparatus for performing text-to-speech conversion in a client/server environment partitions an otherwise conventional text-to-speech conversion algorithm into two portions: a first “text analysis” portion, which generates from an original input text an intermediate representation thereof; and a second “speech synthesis” portion, which synthesizes speech waveforms from the intermediate representation generated by the first portion (i.e., the text analysis portion). The text analysis portion of the algorithm is executed exclusively on a server while the speech synthesis portion is executed exclusively on a client which may be associated therewith. The client may comprise a hand-held device such as, for example, a cell phone, and the intermediate representation of the input text advantageously comprises at least a sequence of phonemes representative of the input text. Certain audio segment information which is to be used by the speech synthesis portion of the text-to-speech process may be advantageously transmitted by the server to the client, and a cache of such audio segments may then be advantageously maintained at the client (e.g., in the cell phone) for use by the speech synthesis process in order to obtain improved quality of the synthesized speech.
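The partition described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and chosen only for illustration: the `Server` and `Client` classes, the toy `LEXICON`, the placeholder `SERVER_SEGMENTS` table, and the choice to have the client request a segment on a cache miss (the abstract says only that the server may transmit segments and that the client may cache them). It is not the patented implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Toy pronunciation lexicon (assumption: a real text analyzer is far richer).
LEXICON: Dict[str, List[str]] = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}

# Placeholder audio segments held by the server: phoneme -> fake samples.
SERVER_SEGMENTS: Dict[str, List[int]] = {
    p: [i, -i]
    for i, p in enumerate(["HH", "AH", "L", "OW", "W", "ER", "D"], start=1)
}


class Server:
    """Runs only the text analysis portion; it never synthesizes audio."""

    def analyze(self, text: str) -> List[str]:
        # Intermediate representation: a flat sequence of phonemes.
        # Words missing from the toy lexicon are silently skipped.
        phonemes: List[str] = []
        for word in text.lower().split():
            phonemes.extend(LEXICON.get(word.strip(".,!?"), []))
        return phonemes

    def fetch_segment(self, phoneme: str) -> List[int]:
        # Stands in for the server transmitting an audio segment.
        return SERVER_SEGMENTS[phoneme]


@dataclass
class Client:
    """Runs only the speech synthesis portion, with a local segment cache."""

    server: Server
    cache: Dict[str, List[int]] = field(default_factory=dict)
    fetches: int = 0  # number of segments pulled from the server

    def synthesize(self, phonemes: List[str]) -> List[int]:
        samples: List[int] = []
        for p in phonemes:
            if p not in self.cache:  # cache miss: ask the server once
                self.cache[p] = self.server.fetch_segment(p)
                self.fetches += 1
            samples.extend(self.cache[p])  # naive concatenation
        return samples


server = Server()
client = Client(server)
ir = server.analyze("Hello world")  # executed on the server
audio = client.synthesize(ir)       # executed on the client
```

In this sketch only the phoneme sequence crosses the client/server boundary on each request; audio segments cross it once per phoneme and are reused from the client cache afterwards.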
74 Claims
1. A method for performing text-to-speech conversion comprising the steps of:
analyzing input text and producing therefrom an intermediate representation thereof; and
synthesizing speech output based upon said intermediate representation of said input text, wherein said analyzing and producing step is performed on a server within a client/server environment, and wherein said synthesizing step is performed on a client device which is associated with but distinct from said server.
14. A method for performing a first portion of a text-to-speech conversion process, the method executed on a server within a client/server environment and comprising the steps of:
analyzing input text and producing therefrom an intermediate representation thereof; and
providing said intermediate representation of said input text for use by a second portion of said text-to-speech conversion process which is to be executed on a client device associated with but distinct from said server, said method not comprising any synthesis of speech output.
26. A method for performing a second portion of a text-to-speech conversion process, the method executed on a client device within a client/server environment and comprising the step of synthesizing speech output based upon an intermediate representation of input text, said intermediate representation of said input text having been produced by a first portion of said text-to-speech conversion process executed on a server which is associated with but distinct from said client device.
38. A system for performing text-to-speech conversion comprising:
a text analysis module which analyzes input text and produces therefrom an intermediate representation thereof; and
a speech synthesis module which synthesizes speech output based upon said intermediate representation of said input text, wherein said text analysis module resides on a server within a client/server environment, and wherein said speech synthesis module resides on a client device which is associated with but distinct from said server.
51. A server within a client/server environment which performs a first portion of a text-to-speech conversion process, the server comprising:
a text analysis module which analyzes input text and produces therefrom an intermediate representation thereof; and
means for providing said intermediate representation of said input text for use by a second portion of said text-to-speech conversion process which is to be executed on a client device associated with but distinct from said server, said server not performing any synthesis of speech output.
63. A client device within a client/server environment which performs a second portion of a text-to-speech conversion process, the client device comprising a speech synthesis module which synthesizes speech output based upon an intermediate representation of input text, said intermediate representation of said input text having been produced by a first portion of said text-to-speech conversion process executed on a server which is associated with but distinct from said client device.
Specification