Synthesis-based pre-selection of suitable units for concatenative speech
First Claim
1. A method of synthesizing speech from text using a triphone unit selection database, the method comprising:
receiving input text;
selecting a plurality of N phoneme units from the triphone unit selection database as candidate phonemes for synthesized speech based on the input text;
applying a cost process to select a set of phonemes from the candidate phonemes; and
synthesizing speech using the selected set of phonemes.
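The steps of claim 1 can be illustrated with a minimal sketch. The claim says only that "a cost process" selects among the candidates; the dynamic-programming search below, the `Unit` fields, and the pitch-based `target_cost` and `join_cost` functions are all assumptions made for illustration, not the patent's actual procedure. The database is assumed to be a plain dictionary keyed by (left, centre, right) phoneme triples, with `#` as a hypothetical silence marker at utterance boundaries. Waveform concatenation is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    unit_id: int
    phoneme: str
    pitch: float  # hypothetical feature; real systems use richer features


def triphones(phonemes):
    """Return one (left, centre, right) context per phoneme, padded with '#'."""
    padded = ["#"] + list(phonemes) + ["#"]
    return [tuple(padded[i:i + 3]) for i in range(len(phonemes))]


def target_cost(unit, target_pitch):
    # Assumed cost: distance of the unit from a desired pitch.
    return abs(unit.pitch - target_pitch)


def join_cost(prev, unit):
    # Assumed cost: pitch mismatch at the concatenation point.
    return abs(prev.pitch - unit.pitch)


def select_units(phonemes, database, n, target_pitch):
    """Pre-select up to n candidates per triphone, then pick the cheapest path."""
    # Step 1: candidate pre-selection from the triphone-indexed database.
    # A missing triphone raises KeyError; fallback handling is omitted here.
    lattice = [database[tri][:n] for tri in triphones(phonemes)]

    # Step 2: cost process, here a Viterbi-style search over the lattice.
    best = [(target_cost(u, target_pitch), [u]) for u in lattice[0]]
    for column in lattice[1:]:
        new_best = []
        for u in column:
            cost, path = min(
                ((c + join_cost(p[-1], u), p) for c, p in best),
                key=lambda item: item[0],
            )
            new_best.append((cost + target_cost(u, target_pitch), path + [u]))
        best = new_best

    # Step 3 (synthesis) would concatenate the waveforms of the returned units.
    return min(best, key=lambda item: item[0])[1]


database = {
    ("#", "k", "ae"): [Unit(0, "k", 100.0), Unit(1, "k", 120.0)],
    ("k", "ae", "t"): [Unit(2, "ae", 125.0), Unit(3, "ae", 101.0)],
    ("ae", "t", "#"): [Unit(4, "t", 99.0), Unit(5, "t", 140.0)],
}
chosen = select_units(["k", "ae", "t"], database, n=2, target_pitch=100.0)
print([u.unit_id for u in chosen])
```

Limiting each lattice column to N candidates is what keeps the search cheap: the number of join-cost evaluations grows with N squared per phoneme rather than with the size of the full unit inventory.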
Abstract
A method for generating concatenative speech uses a speech synthesis input to populate a triphone-indexed database that is later used for searching and retrieval to create a phoneme string acceptable for a text-to-speech operation. Prior to initiating the “real time” synthesis process, a database is created of all possible triphone contexts by inputting a continuous stream of speech. The speech data is then analyzed to identify all possible triphone sequences in the stream, and the various units chosen for each context. During a later text-to-speech operation, the triphone contexts in the text are identified and the triphone-indexed phonemes in the database are searched to retrieve the best-matched candidates.
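The two phases described in the abstract, offline population of a triphone-indexed database and later lookup at synthesis time, might be sketched as follows. The input format (a list of `(phoneme, unit_id)` pairs recording which unit was chosen for each phoneme in the stream), the `#` boundary marker, and the function names `build_triphone_index` and `lookup_candidates` are assumptions for illustration; the abstract does not specify them.

```python
from collections import defaultdict


def build_triphone_index(stream):
    """Index the units chosen in a labelled stream by their triphone context.

    stream: list of (phoneme, unit_id) pairs in the order they occur.
    Returns a dict mapping (left, centre, right) to a list of unit ids.
    """
    phones = ["#"] + [phoneme for phoneme, _ in stream] + ["#"]
    index = defaultdict(list)
    for i, (_, unit_id) in enumerate(stream):
        context = (phones[i], phones[i + 1], phones[i + 2])
        if unit_id not in index[context]:  # keep each unit once per context
            index[context].append(unit_id)
    return dict(index)


def lookup_candidates(index, phonemes):
    """Return the stored candidate unit ids for each triphone in the input."""
    phones = ["#"] + list(phonemes) + ["#"]
    return [
        index.get((phones[i], phones[i + 1], phones[i + 2]), [])
        for i in range(len(phonemes))
    ]


# Offline phase: analyse a stream in which "k ae t" occurs twice.
stream = [("k", 1), ("ae", 2), ("t", 3), ("k", 4), ("ae", 5), ("t", 6)]
index = build_triphone_index(stream)

# Later text-to-speech phase: retrieve candidates for the word "cat".
print(lookup_candidates(index, ["k", "ae", "t"]))
```

An empty list for a context means the stream never contained that triphone; how such gaps are handled is outside what the abstract states, so the sketch simply returns no candidates.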
Specification