Method and system for preselection of suitable units for concatenative speech
Abstract
A system and method for improving the response time of text-to-speech synthesis use “triphone contexts” (i.e., triplets comprising a central phoneme and its immediate context) as the basic unit, instead of performing phoneme-by-phoneme synthesis. The method generates a triphone preselection cost database for use in speech synthesis by 1) selecting a triphone sequence u1-u2-u3; 2) calculating a preselection cost for each 5-phoneme sequence ua-u1-u2-u3-ub, where u2 is allowed to match any identically labeled phoneme in a database and the units ua and ub vary over the entire phoneme universe; and 3) storing a group of the selected triphone sequences exhibiting the lowest costs in a triphone preselection cost database.
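The enumeration of 5-phoneme sequences in step 2 can be illustrated with a short sketch. This is an illustration only, not the patented implementation: the function name `five_phoneme_contexts` and the toy two-phoneme universe are assumptions made for the example.

```python
from itertools import product


def five_phoneme_contexts(triphone, phoneme_universe):
    """Yield every sequence ua-u1-u2-u3-ub for a fixed triphone u1-u2-u3.

    The outer units ua and ub each range over the whole phoneme universe,
    so a universe of size P produces P * P contexts per triphone.
    """
    u1, u2, u3 = triphone
    for ua, ub in product(phoneme_universe, repeat=2):
        yield (ua, u1, u2, u3, ub)


# Toy universe of two phonemes (hypothetical labels).
contexts = list(five_phoneme_contexts(("b", "a", "b"), ["a", "b"]))
print(len(contexts))  # prints 4
```

With a realistic phoneme inventory the number of contexts per triphone grows quadratically, which is presumably why the costs are computed ahead of time and stored rather than evaluated during synthesis.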
15 Claims
1. A computing device that generates a database for use in speech synthesis, the computing device generating the database according to a method comprising:
selecting a triphone sequence;
calculating a preselection cost for each 5-phoneme sequence where a unit of the 5-phoneme sequence is allowed to match any identically labeled phoneme in a database and at least two units of the 5-phoneme sequence vary over the entire phoneme universe; and
storing a group of the selected triphone sequences exhibiting the lowest costs in a triphone preselection cost database by:
determining a plurality of N least cost database units for the particular 5-phoneme context;
performing the union of the N least cost units for all combinations of the at least two units;
storing the union created in the step of performing the union in the triphone preselection cost database; and
repeating steps of selecting, calculating and storing a group of the selected triphone sequences for each possible triphone sequence.
Dependent claims: 2, 3, 4, 5
6. A method for generating a triphone preselection cost database for use in speech synthesis, the method comprising, for each of a plurality of triphone sequences:
calculating a preselection cost for each 5-phoneme sequence, wherein a triphone sequence of the plurality of triphone sequences is included in each 5-phoneme sequence;
storing a group of triphone sequences exhibiting the lowest costs in a triphone preselection cost database by:
a) determining a plurality of N least cost database units for the particular 5-phoneme context;
b) performing the union of the N least cost units for all combinations of two selected units from the 5-phoneme sequence; and
c) storing the union created in step b) in the triphone preselection cost database.
Dependent claims: 7, 8, 9, 10
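Steps a) through c) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the claimed implementation: the cost function `toy_cost`, the unit layout (dictionaries with `id`, `label` and `context` fields), and the reading of the "two selected units" as the outermost units of the 5-phoneme sequence are all hypothetical choices made for illustration.

```python
import heapq
from itertools import product


def toy_cost(unit, context):
    """Hypothetical cost: count positions where the unit's recorded
    5-phoneme context differs from the target context."""
    return sum(1 for have, want in zip(unit["context"], context) if have != want)


def build_preselection_db(units, phonemes, n_best, cost=toy_cost):
    """Map each triphone to the union of the N least-cost unit ids
    taken over every combination of the two outer units."""
    db = {}
    for u1, u2, u3 in product(phonemes, repeat=3):
        # Candidates are all database units labeled like the central phoneme.
        candidates = [u for u in units if u["label"] == u2]
        if not candidates:
            continue
        selected = set()
        for ua, ub in product(phonemes, repeat=2):
            context = (ua, u1, u2, u3, ub)
            # a) N least-cost units for this particular 5-phoneme context
            best = heapq.nsmallest(n_best, candidates,
                                   key=lambda u: cost(u, context))
            # b) union over all combinations of the outer units
            selected.update(u["id"] for u in best)
        # c) store the union under the triphone
        db[(u1, u2, u3)] = selected
    return db


# Toy database of four recorded units (made-up data).
units = [
    {"id": 0, "label": "a", "context": ("a", "a", "a", "a", "a")},
    {"id": 1, "label": "a", "context": ("b", "b", "a", "b", "b")},
    {"id": 2, "label": "a", "context": ("a", "b", "a", "b", "a")},
    {"id": 3, "label": "b", "context": ("a", "a", "b", "a", "a")},
]
db = build_preselection_db(units, ["a", "b"], n_best=1)
print(sorted(db[("b", "a", "b")]))  # prints [1, 2]
```

In this toy run, unit 0 is never among the best candidates for the triphone b-a-b in any outer context, so it is left out of the stored set. A synthesizer consulting such a table would only need to consider the stored units for that triphone.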
11. A computer-readable medium storing instructions for controlling a computing device to generate a triphone preselection cost database for use in speech synthesis, the instructions comprising, for each of a plurality of triphone sequences:
calculating a preselection cost for each 5-phoneme sequence, wherein a triphone sequence of the plurality of triphone sequences is included in each 5-phoneme sequence;
storing a group of triphone sequences exhibiting the lowest costs in a triphone preselection cost database by:
a) determining a plurality of N least cost database units for the particular 5-phoneme context;
b) performing the union of the N least cost units for all combinations of two selected units from the 5-phoneme sequence; and
c) storing the union created in step b) in the triphone preselection cost database.
Dependent claims: 12, 13, 14, 15
Specification