Speech samples library for text-to-speech and methods and apparatus for generating and using same
Abstract
A method for converting text into speech with a speech sample library is provided. The method comprises converting an input text to a sequence of triphones; determining musical parameters of each phoneme in the sequence of triphones; detecting, in the speech sample library, speech segments having at least the determined musical parameters; and concatenating the detected speech segments.
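The triphone conversion step named in the abstract can be illustrated with a minimal Python sketch. It assumes the text has already been turned into a list of phoneme symbols; the `SILENCE` boundary marker, the function name `to_triphones`, and the example symbols are hypothetical and are not taken from the patent.

```python
from typing import List, Tuple

# Assumed marker for an utterance boundary; not specified in the source.
SILENCE = "_"


def to_triphones(phonemes: List[str]) -> List[Tuple[str, str, str]]:
    """Return one (left, central, right) triple per input phoneme."""
    padded = [SILENCE] + phonemes + [SILENCE]
    return [
        (padded[i - 1], padded[i], padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


# Each phoneme is paired with its left and right neighbours.
triphones = to_triphones(["k", "ae", "t"])
```

Here the left and right elements of each triple play the role of the phonemic context, and the middle element is the central phoneme.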
14 Claims
1. A method for converting text into speech with a speech sample library, comprising:
providing an input text;
converting the input text to a sequence of triphones;
retrieving phonemic contexts of the sequence of triphones;
determining musical parameters characterizing each phoneme in the sequence of triphones;
predicting a set of numerical targets for the determined musical parameters, wherein the set of numerical targets is provided for each of the musical parameters;
detecting, in the speech sample library, pre-stored speech segments having at least the determined musical parameters of each phoneme in the sequence of triphones based on the phonemic contexts and the predicted set of numerical targets for the determined musical parameters which lie within a range of musical parameters of the pre-stored speech segments, wherein the detection of the pre-stored speech segments further includes searching the speech sample library for at least one of a central phoneme, phonemic context, and a musical index indicating at least one range of at least one of the musical parameters within which at least one of the numerical targets lies; and
concatenating the detected speech segments.
Dependent claims: 2, 3, 4, 5, 6, 7.
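The detection and concatenation steps of claim 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the `Segment` record, the parameter names `pitch` and `duration`, and the rule of taking the first matching segment are all hypothetical. The claim requires searching on at least one of central phoneme, phonemic context, and musical index; this sketch happens to use all three.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Range = Tuple[float, float]  # inclusive (low, high) bounds


@dataclass
class Segment:
    central: str               # central phoneme of the stored triphone
    context: Tuple[str, str]   # (left, right) phonemic context
    ranges: Dict[str, Range]   # assumed layout: parameter name -> range
    samples: List[float]       # stand-in for the stored audio


def within_ranges(seg: Segment, targets: Dict[str, float]) -> bool:
    """True if every numerical target lies within the segment's range."""
    return all(
        name in seg.ranges
        and seg.ranges[name][0] <= value <= seg.ranges[name][1]
        for name, value in targets.items()
    )


def find_segment(library: List[Segment], triphone: Tuple[str, str, str],
                 targets: Dict[str, float]) -> Optional[Segment]:
    """Return the first stored segment matching phoneme, context and targets."""
    left, central, right = triphone
    for seg in library:
        if (seg.central == central and seg.context == (left, right)
                and within_ranges(seg, targets)):
            return seg
    return None


def synthesize(library: List[Segment],
               triphones: List[Tuple[str, str, str]],
               targets_per_phoneme: List[Dict[str, float]]) -> List[float]:
    """Concatenate the samples of the segment detected for each triphone."""
    output: List[float] = []
    for triphone, targets in zip(triphones, targets_per_phoneme):
        seg = find_segment(library, triphone, targets)
        if seg is None:
            raise LookupError(f"no segment for {triphone} with {targets}")
        output.extend(seg.samples)
    return output


library = [
    Segment("ae", ("k", "t"), {"pitch": (100.0, 140.0), "duration": (0.05, 0.12)}, [0.1, 0.2]),
    Segment("ae", ("k", "t"), {"pitch": (140.0, 200.0), "duration": (0.05, 0.12)}, [0.3, 0.4]),
]
audio = synthesize(library, [("k", "ae", "t")], [{"pitch": 150.0, "duration": 0.08}])
```

With a predicted pitch target of 150.0, only the second stored segment has a pitch range containing the target, so its samples are the ones selected.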
8. An apparatus for converting text into speech with a speech sample library, comprising:
an input unit for providing an input text;
a parser for converting the text into a sequence of speech segments;
a prosody predictor for predicting musical parameters of each phoneme in the sequence of triphones and a set of numerical targets for each of the predicted musical parameters of each phoneme in the sequence of triphones based on phonemic contexts and the set of numerical targets for the determined musical parameters which lie within a range of musical parameters of the pre-stored speech segments, wherein the set of numerical targets is provided for each of the musical parameters; and
a search module for detecting, in the speech sample library, pre-stored speech segments having at least the determined musical parameter, wherein the search module is further configured to search in the speech sample library for at least one of a central phoneme, phonemic context, and a musical index indicating at least one range of at least one of the musical parameters within which at least one of the numerical targets lies.
Dependent claims: 9, 10, 11, 12, 13, 14.
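One way to picture the musical index used by the search module is as a bucket number for the range a parameter value falls in, combined with the central phoneme to form a lookup key. The sketch below is only an assumption about how such an index could work: the class name `SegmentIndex`, the pitch bucket edges, and the use of pitch as the indexed parameter are hypothetical and do not come from the claims.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed bucket boundaries in Hz; the source does not give any values.
PITCH_EDGES = [100.0, 140.0, 200.0]


def musical_index(value: float, edges: List[float]) -> int:
    """Return the number of the range containing value (0 = below first edge)."""
    return bisect_right(edges, value)


class SegmentIndex:
    """Maps (central phoneme, musical index) to stored segment identifiers."""

    def __init__(self, edges: List[float]) -> None:
        self.edges = edges
        self.table: Dict[Tuple[str, int], List[str]] = defaultdict(list)

    def add(self, central: str, pitch: float, segment_id: str) -> None:
        """File a stored segment under the range its pitch falls in."""
        self.table[(central, musical_index(pitch, self.edges))].append(segment_id)

    def lookup(self, central: str, target_pitch: float) -> List[str]:
        """Return segments whose range contains the numerical target."""
        key = (central, musical_index(target_pitch, self.edges))
        return list(self.table.get(key, []))


index = SegmentIndex(PITCH_EDGES)
index.add("ae", 120.0, "seg_a")
index.add("ae", 150.0, "seg_b")
candidates = index.lookup("ae", 160.0)
```

Keying on the range number rather than the raw value means a numerical target only has to land in the same range as a stored segment, not equal its measured value.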
Specification