Distributed speech synthesis system, terminal device, and computer program thereof
Abstract
In text-to-speech synthesis, this invention enables a terminal device with relatively small computing power to perform speech synthesis based on optimal unit selection. The text-to-speech synthesis procedure of the present invention involves content generation and output: a secondary content that includes the results of the optimal unit selection process is output. By virtue of the secondary content, the high-load process of selecting optimal units and the light-load process of synthesizing speech waveforms can be performed separately. The optimal unit selection process is performed at a server, and information identifying the units to be retrieved from a corpus is sent to the terminal as data for speech synthesis.
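The division of labor described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Unit` fields, the toy corpus, the pitch-only join cost, and the dictionary layout of the secondary content are all assumptions made for illustration. The server runs a small dynamic-programming search to pick one unit per phonetic label, and the terminal only looks up the chosen unit IDs in its own copy of the corpus and concatenates their samples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    unit_id: int      # identifier shared by server and terminal (assumed)
    label: str        # phonetic label this unit realizes
    pitch: float      # toy feature used for the join cost (assumed)
    samples: tuple    # toy waveform samples


# Hypothetical corpus; both sides are assumed to hold the same unit IDs.
CORPUS = [
    Unit(0, "h", 100.0, (1, 2)),
    Unit(1, "h", 140.0, (3, 4)),
    Unit(2, "i", 105.0, (5, 6)),
    Unit(3, "i", 180.0, (7, 8)),
]


def select_units(labels, corpus):
    """Server side (heavy): choose one unit per label so that the summed
    pitch jump between adjacent units is minimal (Viterbi-style search)."""
    if not labels:
        return []
    candidates = [[u for u in corpus if u.label == lab] for lab in labels]
    if any(not c for c in candidates):
        raise ValueError("no candidate unit for some label")
    # best[i][k] = (cheapest path cost ending at candidates[i][k], backpointer)
    best = [[(0.0, None) for _ in candidates[0]]]
    for i in range(1, len(candidates)):
        row = []
        for u in candidates[i]:
            costs = [best[i - 1][j][0] + abs(u.pitch - p.pitch)
                     for j, p in enumerate(candidates[i - 1])]
            j_min = min(range(len(costs)), key=costs.__getitem__)
            row.append((costs[j_min], j_min))
        best.append(row)
    # Backtrack from the cheapest final candidate.
    k = min(range(len(best[-1])), key=lambda idx: best[-1][idx][0])
    path = []
    for i in range(len(candidates) - 1, -1, -1):
        path.append(candidates[i][k].unit_id)
        k = best[i][k][1]
    return path[::-1]


def make_secondary_content(text, labels, corpus):
    """Server side: attach the selection result to the original text."""
    return {"text": text, "unit_ids": select_units(labels, corpus)}


def synthesize(secondary_content, local_corpus):
    """Terminal side (light): look up each unit ID and concatenate samples."""
    index = {u.unit_id: u for u in local_corpus}
    waveform = []
    for unit_id in secondary_content["unit_ids"]:
        waveform.extend(index[unit_id].samples)
    return waveform


if __name__ == "__main__":
    content = make_secondary_content("hi", ["h", "i"], CORPUS)
    print(content["unit_ids"], synthesize(content, CORPUS))
```

In this sketch the search cost stays on the server, while the terminal performs only dictionary lookups and list concatenation, which mirrors the heavy/light split the abstract describes.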
20 Claims
1. A terminal device which can connect to a processing server via a network, said terminal device comprising:
a unit of receiving from said processing server a secondary content furnished with information for access to a speech database and retrieval of optimal units selected by analyzing text data included in a primary content distributed via said network; and
a unit of synthesizing speech corresponding to said text data, based on said secondary content and the speech database.
6. A distributed speech synthesis system which includes a processing server and a terminal device connected to said processing server via a network, wherein said system implements speech synthesis and outputs speech from text data included in a primary content received over said network,
wherein said processing server comprises:
a unit of generating a secondary content, which comprises analyzing the text data included in the primary content received over said network, selecting optimal units, and furnishing information for access to a speech database and retrieval of the optimal units; and
a unit of sending the secondary content to said terminal device.
9. A computer program for speech synthesis and output from requested content data at a terminal device connected to a processing server via a network, said computer program causing a computer to implement:
a function of requesting said processing server for a primary content to be vocalized;
a function of receiving a secondary content including information of a string of optimal units selected by analyzing text data from said primary content from said processing server; and
a function of synthesizing speech from the secondary content data by accessing a speech database.
17. A computer program for distributed speech synthesis, which synthesizes and outputs speech from text data included in a primary content received over said network, in a distributed speech synthesis system including a processing server and a terminal device connected to said processing server via a network,
wherein respective speech databases exist on said processing server and said terminal device, applying a common identification scheme in which a particular waveform can be identified uniquely, said computer program causing a computer to implement:
a function of generating a secondary content, which comprises analyzing the text data included in the primary content received over said network, selecting optimal units, and furnishing information for access to a speech database and retrieval of the optimal units; and
a function of synthesizing speech corresponding to said text data, based on said secondary content and the appropriate speech database.
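Claim 17 relies on a common identification scheme under which the same identifier names the same waveform in both the server's and the terminal's speech database. The claim does not say how such identifiers are formed; the Python sketch below assumes, purely for illustration, a composite key made of a corpus name and an index, serialized as `corpus:index`. The names `WaveformId`, `encode`, `decode`, `resolve`, and the sample databases are hypothetical.

```python
from typing import NamedTuple


class WaveformId(NamedTuple):
    corpus: str   # hypothetical corpus name known to both sides
    index: int    # position of the waveform within that corpus


def encode(wid):
    """Serialize an identifier for transport inside the secondary content."""
    return f"{wid.corpus}:{wid.index}"


def decode(text):
    """Parse an identifier; the index follows the last colon."""
    corpus, _, index = text.rpartition(":")
    return WaveformId(corpus, int(index))


# Toy databases keyed by the shared identifier (contents are made up).
SERVER_DB = {
    WaveformId("female01", 7): (0.1, 0.2),
    WaveformId("female01", 8): (0.3,),
}
TERMINAL_DB = dict(SERVER_DB)  # the terminal holds its own copy


def resolve(encoded_ids, database):
    """Look up each encoded identifier in the given database."""
    return [database[decode(e)] for e in encoded_ids]


if __name__ == "__main__":
    ids = [encode(WaveformId("female01", 7)), encode(WaveformId("female01", 8))]
    print(ids, resolve(ids, TERMINAL_DB))
```

Because both databases use the same keys, a list of encoded identifiers chosen on the server resolves to the same waveforms on the terminal, so only the identifiers need to travel over the network.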
Specification