SMALL FOOTPRINT TEXT-TO-SPEECH ENGINE
First Claim
1. A computer readable medium storing computer-executable instructions that, when executed, cause one or more processors to perform acts comprising:
generating a set of feature parameters for an input text, the set of feature parameters including static feature parameters and delta feature parameters;
deriving a saw-tooth stochastic trajectory that represents the speech characteristics of the input text based on the static feature parameters and the delta feature parameters;
producing a smoothed trajectory from the saw-tooth stochastic trajectory; and
generating synthesized speech based on the smoothed trajectory.
Abstract
Embodiments of a small footprint text-to-speech engine are disclosed. In operation, the small footprint text-to-speech engine generates a set of feature parameters for an input text. The set of feature parameters includes static feature parameters and delta feature parameters. The small footprint text-to-speech engine then derives a saw-tooth stochastic trajectory that represents the speech characteristics of the input text based on the static feature parameters and the delta feature parameters. Finally, the small footprint text-to-speech engine produces a smoothed trajectory from the saw-tooth stochastic trajectory, and generates synthesized speech based on the smoothed trajectory.
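The three steps named in the abstract (feature parameters, saw-tooth trajectory, smoothed trajectory) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `State` tuple layout, the rule that each state contributes a linear ramp centered on its static mean with slope equal to its delta mean, and the moving-average smoother are all hypothetical stand-ins for whatever the specification actually describes.

```python
from typing import List, Sequence, Tuple

# Hypothetical per-state description (an assumption, not the patent's data model):
# (static_mean, delta_mean, duration_in_frames)
State = Tuple[float, float, int]


def sawtooth_trajectory(states: Sequence[State]) -> List[float]:
    """Build a piecewise-linear trajectory with jumps at state boundaries.

    Assumed rule: inside each state the value ramps linearly around the
    static mean, with the delta mean as the per-frame slope.
    """
    frames: List[float] = []
    for static_mean, delta_mean, duration in states:
        center = (duration - 1) / 2.0  # ramp is centered on the static mean
        for t in range(duration):
            frames.append(static_mean + delta_mean * (t - center))
    return frames


def smooth(trajectory: Sequence[float], window: int = 3) -> List[float]:
    """Moving-average smoother (a stand-in for the unspecified smoothing step)."""
    half = window // 2
    smoothed: List[float] = []
    for i in range(len(trajectory)):
        lo = max(0, i - half)
        hi = min(len(trajectory), i + half + 1)
        smoothed.append(sum(trajectory[lo:hi]) / (hi - lo))
    return smoothed


# Two made-up states: a rising 3-frame state followed by a flat 2-frame state.
example_states: List[State] = [(1.0, 0.5, 3), (2.0, 0.0, 2)]
saw = sawtooth_trajectory(example_states)
smoothed = smooth(saw, window=3)
```

In a real engine the smoothed trajectory would then drive a vocoder or similar synthesis stage; that stage is omitted here because the text above gives no detail about it.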
20 Claims
1. A computer readable medium storing computer-executable instructions that, when executed, cause one or more processors to perform acts comprising:
generating a set of feature parameters for an input text, the set of feature parameters including static feature parameters and delta feature parameters;
deriving a saw-tooth stochastic trajectory that represents the speech characteristics of the input text based on the static feature parameters and the delta feature parameters;
producing a smoothed trajectory from the saw-tooth stochastic trajectory; and
generating synthesized speech based on the smoothed trajectory.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A computer implemented method, comprising:
under control of one or more computing systems configured with executable instructions,
generating a set of feature parameters for an input text using trained stream-dependent Hidden Markov Models (HMMs), the set of feature parameters including static feature parameters and delta feature parameters;
deriving a saw-tooth stochastic trajectory that represents the speech characteristics of the input text based on the static feature parameters and the delta feature parameters.
Dependent claims: 10, 11, 12, 13, 14, 15
16. A system, comprising:
one or more processors;
a memory that includes a plurality of computer-executable components, the plurality of computer-executable components comprising:
a parameter generator to generate a set of feature parameters for an input text, the set of feature parameters including static feature parameters and delta feature parameters, and to derive a saw-tooth stochastic trajectory based on the static feature parameters and the delta feature parameters; and
an audio smoother to produce a smoothed trajectory from the saw-tooth stochastic trajectory.
Dependent claims: 17, 18, 19, 20
Specification