Phrase splicing and variable substitution using a trainable speech synthesizer
Abstract
In accordance with the present invention, a method for providing generation of speech includes the steps of: providing input to be acoustically produced; comparing the input to training data or application specific splice files to identify one of words and word sequences corresponding to the input for constructing a phone sequence; using a search algorithm to identify a segment sequence to construct output speech according to the phone sequence; and concatenating segments and modifying characteristics of the segments to be substantially equal to requested characteristics. Application specific data is advantageously used to make pertinent information available to synthesize both the phone sequence and the output speech. Also described is a system for performing operations in accordance with the disclosure.
27 Claims
1. A method for providing generation of speech comprising the steps of:
providing splice phrases including recorded human speech to be employed in synthesizing speech;
constructing a splice file dictionary including every word and every word sequence for the splice phrases and including a phone sequence associated with every word and every word sequence for the splice phrases;
providing input to be acoustically produced;
comparing the input to training data in the splice file dictionary to identify one of words and word sequences corresponding to the input for constructing a phone sequence;
comparing the input to a pronunciation dictionary when the input is not found in the training data of the splice file dictionary;
identifying a segment sequence using a first search algorithm to construct output speech according to the phone sequence; and
concatenating segments of the segment sequence and modifying characteristics of the segments to be substantially equal to requested characteristics.
Dependent claims: 2, 3, 4, 5, 6, 7
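The dictionary lookup steps of claim 1 can be illustrated with a minimal sketch. It is not taken from the patent: the toy dictionaries, the ARPAbet-style phone symbols, the function name `build_phone_sequence`, and the longest-match-first policy are all assumptions made for illustration. The claim only requires that the input be compared to the splice file dictionary first and to a pronunciation dictionary when the input is not found there.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical splice file dictionary: word sequences from recorded splice
# phrases mapped to their phone sequences (toy ARPAbet-style symbols).
SPLICE_DICT: Dict[Tuple[str, ...], List[str]] = {
    ("your", "balance", "is"): ["Y", "AO", "R", "B", "AE", "L", "AH", "N", "S", "IH", "Z"],
    ("dollars",): ["D", "AA", "L", "ER", "Z"],
}

# Hypothetical general pronunciation dictionary used as the fallback.
PRON_DICT: Dict[str, List[str]] = {
    "five": ["F", "AY", "V"],
    "dollars": ["D", "AO", "L", "AX", "Z"],
}


def build_phone_sequence(
    words: Sequence[str],
    splice_dict: Dict[Tuple[str, ...], List[str]],
    pron_dict: Dict[str, List[str]],
) -> List[str]:
    """Build a phone sequence, preferring splice dictionary entries.

    Assumption: the longest matching word sequence is tried first.
    """
    phones: List[str] = []
    max_len = max((len(key) for key in splice_dict), default=1)
    i = 0
    while i < len(words):
        matched = False
        # Try the longest candidate word sequence starting at position i.
        for n in range(min(max_len, len(words) - i), 0, -1):
            key = tuple(words[i:i + n])
            if key in splice_dict:
                phones.extend(splice_dict[key])
                i += n
                matched = True
                break
        if not matched:
            # Not found in the splice file dictionary: fall back to the
            # pronunciation dictionary for this single word.
            word = words[i]
            if word not in pron_dict:
                raise KeyError(f"no pronunciation available for {word!r}")
            phones.extend(pron_dict[word])
            i += 1
    return phones
```

With these toy entries, the input "your balance is five dollars" takes "your balance is" and "dollars" from the splice dictionary and only "five" from the pronunciation dictionary, so the splice-phrase pronunciation of "dollars" wins over the generic one.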
8. A method for providing generation of speech comprising the steps of:
providing splice phrases including recorded human speech to be employed in synthesizing speech;
constructing a splice file dictionary including every word and every word sequence for the splice phrases and including a phone sequence associated with every word and every word sequence for the splice phrases;
providing input to be acoustically produced;
comparing the input to application specific splice files in the splice file dictionary to identify one of words and word sequences corresponding to the input for constructing a phone sequence;
augmenting a generic segment inventory by adding segments corresponding to the identified words and word sequences;
identifying a segment sequence, using a first search algorithm and the augmented generic segment inventory to construct output speech according to the phone sequence; and
concatenating the segments of the segment sequence and modifying characteristics of the segments of the segment sequence to be substantially equal to requested characteristics.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
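The inventory augmentation and segment search steps of claim 8 might look like the following sketch. The claim does not specify the "first search algorithm", so a Viterbi-style dynamic programming search is swapped in here purely as an assumption. The `Segment` type, the function names, and the join cost (zero for segments that were adjacent in the same recording, one otherwise) are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Segment:
    phone: str     # phone label this segment realizes
    source: str    # "generic" or the name of a splice phrase recording
    position: int  # index of the segment within its source recording


def augment_inventory(
    generic: Dict[str, List[Segment]], splice_segments: Iterable[Segment]
) -> Dict[str, List[Segment]]:
    """Return a copy of the generic inventory with splice segments added."""
    augmented = {phone: list(segs) for phone, segs in generic.items()}
    for seg in splice_segments:
        augmented.setdefault(seg.phone, []).append(seg)
    return augmented


def join_cost(prev: Segment, cur: Segment) -> float:
    """Assumed cost: free if the two segments were contiguous in a recording."""
    if prev.source == cur.source and cur.position == prev.position + 1:
        return 0.0
    return 1.0


def search_segments(
    phones: List[str], inventory: Dict[str, List[Segment]]
) -> List[Segment]:
    """Pick one segment per phone, minimizing the total join cost."""
    if not phones:
        return []
    layers: List[List[Segment]] = []
    for phone in phones:
        candidates = inventory.get(phone)
        if not candidates:
            raise KeyError(f"no segment for phone {phone!r}")
        layers.append(candidates)

    costs = [0.0] * len(layers[0])
    back: List[List[int]] = []  # back[t - 1][k]: best predecessor of layers[t][k]
    for t in range(1, len(layers)):
        prev_layer = layers[t - 1]
        new_costs: List[float] = []
        pointers: List[int] = []
        for cur in layers[t]:
            totals = [costs[j] + join_cost(prev_layer[j], cur)
                      for j in range(len(prev_layer))]
            best_j = min(range(len(totals)), key=totals.__getitem__)
            new_costs.append(totals[best_j])
            pointers.append(best_j)
        costs = new_costs
        back.append(pointers)

    # Trace back from the cheapest final candidate.
    j = min(range(len(costs)), key=costs.__getitem__)
    path = [layers[-1][j]]
    for t in range(len(back) - 1, -1, -1):
        j = back[t][j]
        path.append(layers[t][j])
    path.reverse()
    return path
```

Under this assumed cost, a run of phones that was recorded contiguously in a splice phrase is preferred over generic segments, which is one plausible way the augmented inventory could favor application specific recordings.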
21. A system for generating synthetic speech comprising:
a splice file dictionary including splice phrases of recorded human speech to be employed in synthesizing speech, the splice file dictionary including every word and every word sequence for the splice phrases and including a phone sequence associated with every word and every word sequence for the splice phrases;
means for providing input to be acoustically produced;
means for comparing the input to application specific splice files in the splice file dictionary to identify one of words and word sequences corresponding to the input for constructing a phone sequence;
means for augmenting a generic segment inventory by adding segments corresponding to sentences including the identified words and word sequences;
a synthesizer for utilizing a first search algorithm and the augmented generic inventory to identify a segment sequence to construct output speech according to the phone sequence; and
means for concatenating segments of the segment sequence and modifying characteristics of the segments of the segment sequence to be substantially equal to requested characteristics.
Dependent claims: 22, 23, 24, 25, 26, 27
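The final element shared by the independent claims, concatenating segments and modifying their characteristics toward requested values, can be sketched as follows. The claims do not say which characteristics are modified or how, so duration and RMS energy are assumed here, and all function names are hypothetical. Linear-interpolation resampling is a crude stand-in for duration modification (it also shifts pitch); a practical synthesizer would more likely use a pitch-synchronous method.

```python
import math
from typing import List, Sequence, Tuple


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square energy of a non-empty sample sequence."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def match_energy(samples: Sequence[float], target_rms: float) -> List[float]:
    """Scale samples so their RMS energy equals the requested value."""
    current = rms(samples)
    if current == 0.0:
        return list(samples)  # silence cannot be rescaled
    gain = target_rms / current
    return [s * gain for s in samples]


def match_duration(samples: Sequence[float], target_len: int) -> List[float]:
    """Stretch or compress to target_len samples by linear interpolation."""
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    if len(samples) == 1 or target_len == 1:
        return [samples[0]] * target_len
    step = (len(samples) - 1) / (target_len - 1)
    out: List[float] = []
    for i in range(target_len):
        x = i * step
        lo = int(x)
        hi = min(lo + 1, len(samples) - 1)
        frac = x - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def concatenate(
    segments: Sequence[Sequence[float]],
    requests: Sequence[Tuple[int, float]],
) -> List[float]:
    """Join segments after adjusting each to (requested length, requested RMS)."""
    out: List[float] = []
    for samples, (target_len, target_rms) in zip(segments, requests):
        adjusted = match_energy(match_duration(samples, target_len), target_rms)
        out.extend(adjusted)
    return out
```

Each segment is adjusted independently before joining; smoothing across the joins (for example with a short crossfade) is left out to keep the sketch small.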
Specification