METHOD AND APPARATUS FOR GENERATING SYNTHETIC SPEECH WITH CONTRASTIVE STRESS
Abstract
Techniques for generating synthetic speech with contrastive stress. In one aspect, a speech-enabled application generates a text input including a text transcription of a desired speech output, and inputs the text input to a speech synthesis system. The synthesis system generates an audio speech output corresponding to at least a portion of the text input, with at least one portion carrying contrastive stress, and provides the audio speech output for the speech-enabled application. In another aspect, a speech-enabled application inputs a plurality of text strings, each corresponding to a portion of a desired speech output, to a software module for rendering contrastive stress. The software module identifies a plurality of audio recordings that render at least one portion of at least one of the text strings as speech carrying contrastive stress. The speech-enabled application generates an audio speech output corresponding to the desired speech output using the audio recordings.
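The abstract does not say how the portion that should carry contrastive stress is chosen, or how the stress is conveyed to a synthesizer. As a rough illustration only, the sketch below assumes one hypothetical approach: compare a new prompt against a reference prompt word by word, and wrap the words that differ in the SSML `<emphasis>` element. The function name `mark_contrast`, the word-level diff, the use of SSML, and the flight-time prompts are all assumptions made for this example, not details taken from the patent.

```python
from difflib import SequenceMatcher
from xml.sax.saxutils import escape


def mark_contrast(reference: str, target: str) -> str:
    """Return SSML for `target`, emphasizing words that differ from `reference`.

    Hypothetical helper: word-level diffing is an assumption made for
    illustration, not the method described in the abstract.
    """
    ref_words = reference.split()
    tgt_words = target.split()
    matcher = SequenceMatcher(a=ref_words, b=tgt_words, autojunk=False)

    parts = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if j1 == j2:
            continue  # a deletion contributes no words to the target
        chunk = escape(" ".join(tgt_words[j1:j2]))
        if tag == "equal":
            parts.append(chunk)
        else:
            # 'replace' or 'insert': words new to the target carry the contrast
            parts.append(f'<emphasis level="strong">{chunk}</emphasis>')
    return "<speak>" + " ".join(parts) + "</speak>"


if __name__ == "__main__":
    print(mark_contrast(
        "Your flight departs at 9:30 AM",
        "Your flight departs at 9:30 PM",
    ))
```

In this sketch the resulting markup would be handed to whatever synthesis engine the application uses; how an engine renders `<emphasis>` acoustically varies and is outside the scope of the example.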
57 Claims
1. A method for providing speech output for a speech-enabled application, the method comprising:
receiving from the speech-enabled application a text input comprising a text transcription of a desired speech output;
generating, using at least one computer system, an audio speech output corresponding to at least a portion of the text input, the audio speech output comprising at least one portion carrying contrastive stress to contrast with at least one other portion of the audio speech output; and
providing the audio speech output for the speech-enabled application.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. Apparatus for providing speech output for a speech-enabled application, the apparatus comprising:
a memory storing a plurality of processor-executable instructions; and
at least one processor, operatively coupled to the memory, that executes the instructions to:
receive from the speech-enabled application a text input comprising a text transcription of a desired speech output;
generate an audio speech output corresponding to at least a portion of the text input, the audio speech output comprising at least one portion carrying contrastive stress to contrast with at least one other portion of the audio speech output; and
provide the audio speech output for the speech-enabled application.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
23. At least one non-transitory computer-readable storage medium encoded with a plurality of computer-executable instructions that, when executed, perform a method for providing speech output for a speech-enabled application, the method comprising:
receiving from the speech-enabled application a text input comprising a text transcription of a desired speech output;
generating an audio speech output corresponding to at least a portion of the text input, the audio speech output comprising at least one portion carrying contrastive stress to contrast with at least one other portion of the audio speech output; and
providing the audio speech output for the speech-enabled application.
Dependent claims: 24, 25, 26, 27, 28, 29, 30, 31, 32, 33
34. A method for providing speech output via a speech-enabled application, the method comprising:
generating, using at least one computer system executing the speech-enabled application, a text input comprising a text transcription of a desired speech output;
inputting the text input to at least one speech synthesis engine;
receiving from the at least one speech synthesis engine an audio speech output corresponding to at least a portion of the text input, the audio speech output comprising at least one portion carrying contrastive stress to contrast with at least one other portion of the audio speech output; and
providing the audio speech output to at least one user of the speech-enabled application.
Dependent claims: 35, 36, 37, 38, 39, 40, 41
42. Apparatus for providing speech output via a speech-enabled application, the apparatus comprising:
a memory storing a plurality of processor-executable instructions; and
at least one processor, operatively coupled to the memory, that executes the instructions to:
generate a text input comprising a text transcription of a desired speech output;
input the text input to at least one speech synthesis engine;
receive from the at least one speech synthesis engine an audio speech output corresponding to at least a portion of the text input, the audio speech output comprising at least one portion carrying contrastive stress to contrast with at least one other portion of the audio speech output; and
provide the audio speech output to at least one user of the speech-enabled application.
Dependent claims: 43, 44, 45, 46, 47, 48, 49
50. At least one non-transitory computer-readable storage medium encoded with a plurality of computer-executable instructions that, when executed, perform a method for providing speech output via a speech-enabled application, the method comprising:
generating a text input comprising a text transcription of a desired speech output;
inputting the text input to at least one speech synthesis engine;
receiving from the at least one speech synthesis engine an audio speech output corresponding to at least a portion of the text input, the audio speech output comprising at least one portion carrying contrastive stress to contrast with at least one other portion of the audio speech output; and
providing the audio speech output to at least one user of the speech-enabled application.
Dependent claims: 51, 52, 53, 54, 55, 56, 57
Specification