SPEECH SYNTHESIZER
Abstract
A speech synthesizer can execute speech content editing at high speed and generate speech content easily. The speech synthesizer includes a small speech element DB (101), a small speech element selection unit (102), a small speech element concatenation unit (103), a prosody modification unit (104), a large speech element DB (105), a correspondence DB (106) that associates the small speech element DB (101) with the large speech element DB (105), a speech element candidate obtainment unit (107), a large speech element selection unit (108), and a large speech element concatenation unit (109). By editing synthetic speech using the small speech element DB (101) and performing quality enhancement on an editing result using the large speech element DB (105), speech content can be generated easily on a mobile terminal.
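The two-stage selection described in the abstract can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the `Element` and `Target` types, the absolute-difference `cost` function, and all sample identifiers and prosody values are assumptions made for the example, and the prosody modification unit (104) is represented simply by whatever target values the caller passes in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    elem_id: str
    phoneme: str
    pitch_hz: float
    duration_ms: float


@dataclass(frozen=True)
class Target:
    phoneme: str
    pitch_hz: float
    duration_ms: float


def cost(element, target):
    # Hypothetical target cost: plain absolute prosody distance.
    return (abs(element.pitch_hz - target.pitch_hz)
            + abs(element.duration_ms - target.duration_ms))


def select_small(small_db, targets):
    # Stage 1: pick the closest small-DB element for each target phoneme.
    chosen = []
    for t in targets:
        candidates = [e for e in small_db if e.phoneme == t.phoneme]
        chosen.append(min(candidates, key=lambda e: cost(e, t)))
    return chosen


def select_large(chosen_small, targets, correspondence, large_db):
    # Stage 2: restrict the search to the large-DB elements that the
    # correspondence table associates with each chosen small element.
    result = []
    for s, t in zip(chosen_small, targets):
        candidates = [large_db[i] for i in correspondence[s.elem_id]]
        result.append(min(candidates, key=lambda e: cost(e, t)))
    return result


def concatenate(elements):
    # Stand-in for waveform concatenation: return the element ids in order.
    return [e.elem_id for e in elements]


# Made-up sample data.
small_db = [
    Element("s_a1", "a", 120.0, 80.0),
    Element("s_a2", "a", 200.0, 80.0),
    Element("s_k1", "k", 0.0, 40.0),
]
large_db = {
    "L_a1": Element("L_a1", "a", 110.0, 80.0),
    "L_a2": Element("L_a2", "a", 125.0, 85.0),
    "L_a3": Element("L_a3", "a", 190.0, 80.0),
    "L_a4": Element("L_a4", "a", 210.0, 90.0),
    "L_k1": Element("L_k1", "k", 0.0, 35.0),
    "L_k2": Element("L_k2", "k", 0.0, 45.0),
}
correspondence = {
    "s_a1": ["L_a1", "L_a2"],
    "s_a2": ["L_a3", "L_a4"],
    "s_k1": ["L_k1", "L_k2"],
}
targets = [Target("k", 0.0, 44.0), Target("a", 128.0, 84.0)]

preview = select_small(small_db, targets)  # fast editing pass
final = concatenate(select_large(preview, targets, correspondence, large_db))
print(final)
```

In this sketch the second pass never scans the whole large database; it only compares the few candidates linked to each small element, which is one plausible reading of how the correspondence DB (106) keeps the quality-enhancement step cheap.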
13 Claims
1. A speech synthesis system that generates synthetic speech which conforms to phonetic symbols and prosody information, said speech synthesis system comprising a generation terminal, a server, and a reception terminal that are connected to each other via a computer network,
said generation terminal including:
  a small database holding pieces of synthetic speech generation data used for generating synthetic speech; and
  a synthetic speech generation data selection unit configured to select, from said small database, pieces of synthetic speech generation data from which synthetic speech that best conforms to the phonetic symbols and the prosody information is to be generated,
said server including
  a large database holding speech elements which are greater in number than the pieces of synthetic speech generation data held in said small database and from which synthetic speech that can represent more detailed prosody information than the pieces of synthetic speech generation data held in said small database is to be generated, and
said reception terminal including:
  a conforming speech element selection unit configured to select, from said large database, speech elements which correspond to the pieces of synthetic speech generation data selected by said synthetic speech generation data selection unit and from which synthetic speech that best conforms to the phonetic symbols and the prosody information is to be generated; and
  a speech element concatenation unit configured to generate synthetic speech by concatenating the speech elements selected by said conforming speech element selection unit.
2. A generation terminal that generates simple synthetic speech which conforms to phonetic symbols and prosody information, said generation terminal comprising:
  a small database holding speech elements used for generating synthetic speech;
  a synthetic speech generation data selection unit configured to select, from said small database, pieces of synthetic speech generation data from which synthetic speech that conforms to the phonetic symbols and the prosody information is to be generated; and
  a transmission unit configured to transmit the pieces of synthetic speech generation data,
  wherein said transmission unit is configured to transmit, to a server that includes a large database holding speech elements which are greater in number than the speech elements held in said small database, the pieces of synthetic speech generation data to be associated with speech elements in the large database.
Dependent claims: 3
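As a rough illustration of the transmission unit in claim 2, the sketch below serializes the selected small-database entries into a message for the server. The claim does not specify a wire format; the JSON layout, the field names `units`, `phoneme`, and `small_id`, and the idea of sending identifiers rather than waveform data are all assumptions made for this example.

```python
import json


def build_request(phonemes, selected_ids):
    # Pair each phoneme with the id of the small-DB entry chosen for it.
    if len(phonemes) != len(selected_ids):
        raise ValueError("one selected id is required per phoneme")
    units = [{"phoneme": p, "small_id": s}
             for p, s in zip(phonemes, selected_ids)]
    return json.dumps({"units": units})


def parse_request(payload):
    # Server side: recover the ordered list of small-DB ids.
    data = json.loads(payload)
    return [u["small_id"] for u in data["units"]]


# Hypothetical ids for illustration only.
payload = build_request(["k", "a"], ["s_k1", "s_a1"])
print(parse_request(payload))
```

Under this assumption the message stays small regardless of audio length, since the server only needs enough information to look up the associated large-database elements.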
4. A server that generates synthetic speech which conforms to phonetic symbols and prosody information, said server comprising:
  a reception unit configured to receive pieces of synthetic speech generation data generated by a generation terminal;
  a large database holding speech elements which are greater in number than pieces of synthetic speech generation data held in a small database; and
  a correspondence database holding correspondence information that shows a relation between each piece of synthetic speech generation data held in the small database and one or more speech elements corresponding to the piece of synthetic speech generation data.
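The one-to-many correspondence information in claim 4 could be represented as a simple mapping from each small-database entry to a list of large-database elements. The sketch below populates such a table by assigning every large element to the nearest small entry with the same phoneme; that nearest-representative rule, the pitch-only distance, and all identifiers and values are assumptions for illustration, since the claim does not say how the table is built.

```python
from collections import defaultdict


def build_correspondence(small, large):
    # small, large: dict mapping id -> (phoneme, pitch_hz).
    table = defaultdict(list)
    for large_id, (l_phoneme, l_pitch) in large.items():
        # Distance to every small entry that shares the phoneme.
        reps = [(abs(s_pitch - l_pitch), small_id)
                for small_id, (s_phoneme, s_pitch) in small.items()
                if s_phoneme == l_phoneme]
        if not reps:
            continue  # no small entry can stand in for this element
        _, nearest = min(reps)
        table[nearest].append(large_id)
    return dict(table)


# Made-up sample data.
small = {"s_a1": ("a", 120.0), "s_a2": ("a", 200.0)}
large = {
    "L_a1": ("a", 110.0),
    "L_a2": ("a", 125.0),
    "L_a3": ("a", 190.0),
    "L_a4": ("a", 210.0),
}

table = build_correspondence(small, large)
print(table)
```

A real table would also need a check that every small entry ends up with at least one associated element, to match the "one or more" wording of the claim.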
5. A speech synthesizer that generates synthetic speech which conforms to phonetic symbols and prosody information, said speech synthesizer comprising:
  a small database holding pieces of synthetic speech generation data used for generating synthetic speech;
  a large database holding speech elements which are greater in number than the pieces of synthetic speech generation data held in said small database;
  a synthetic speech generation data selection unit configured to select, from said small database, pieces of synthetic speech generation data from which synthetic speech that conforms to the phonetic symbols and the prosody information is to be generated;
  a conforming speech element selection unit configured to select, from said large database, speech elements which correspond to the pieces of synthetic speech generation data selected by said synthetic speech generation data selection unit; and
  a speech element concatenation unit configured to generate synthetic speech by concatenating the speech elements selected by said conforming speech element selection unit.
Dependent claims: 6, 7, 8, 9, 10, 11
12. A speech synthesis method for generating synthetic speech which conforms to phonetic symbols and prosody information, said speech synthesis method comprising:
  selecting, from a small database holding pieces of synthetic speech generation data used for generating synthetic speech, pieces of synthetic speech generation data from which synthetic speech that best conforms to the phonetic symbols and the prosody information is to be generated;
  selecting, from a large database holding speech elements which are greater in number than the pieces of synthetic speech generation data held in the small database and from which synthetic speech that can represent more detailed prosody information than the pieces of synthetic speech generation data held in the small database is to be generated, speech elements which correspond to the pieces of synthetic speech generation data selected in said selecting pieces of synthetic speech generation data and from which synthetic speech that best conforms to the phonetic symbols and the prosody information is to be generated; and
  generating synthetic speech by concatenating the speech elements selected in said selecting speech elements.
13. A program for generating synthetic speech which conforms to phonetic symbols and prosody information, said program causing a computer to execute:
  selecting, from a small database holding pieces of synthetic speech generation data used for generating synthetic speech, pieces of synthetic speech generation data from which synthetic speech that best conforms to the phonetic symbols and the prosody information is to be generated;
  selecting, from a large database holding speech elements which are greater in number than the pieces of synthetic speech generation data held in the small database and from which synthetic speech that can represent more detailed prosody information than the pieces of synthetic speech generation data held in the small database is to be generated, speech elements which correspond to the pieces of synthetic speech generation data selected in said selecting pieces of synthetic speech generation data and from which synthetic speech that best conforms to the phonetic symbols and the prosody information is to be generated; and
  generating synthetic speech by concatenating the speech elements selected in said selecting speech elements.
Specification