SYSTEMS AND METHODS FOR SPEECH PREPROCESSING IN TEXT TO SPEECH SYNTHESIS
First Claim
1. A method for synthesizing speech in a target language based on a text string, the method comprising:
determining a source language in which the text string has originated;
obtaining a source set of phonemes in the source language of the text string;
obtaining a target set of phonemes in the target language based on the source set of phonemes; and
providing synthesized speech based on the target set of phonemes.
Abstract
Algorithms for synthesizing speech used to identify media assets are provided. Speech may be selectively synthesized from text strings associated with media assets. A text string may be normalized and its native language determined for obtaining a target phoneme for providing human-sounding speech in a language (e.g., dialect or accent) that is familiar to a user. The algorithms may be implemented on a system including several dedicated render engines. The system may be part of a back end coupled to a front end including storage for media assets and associated synthesized speech, and a request processor for receiving and processing requests that result in providing the synthesized speech. The front end may communicate media assets and associated synthesized speech content over a network to host devices coupled to portable electronic devices on which the media assets and synthesized speech are played back.
16 Claims
1. A method for synthesizing speech in a target language based on a text string, the method comprising:
determining a source language in which the text string has originated;
obtaining a source set of phonemes in the source language of the text string;
obtaining a target set of phonemes in the target language based on the source set of phonemes; and
providing synthesized speech based on the target set of phonemes.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
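As a reading aid only, the four steps recited in claim 1 can be sketched in Python. This is a minimal illustration under stated assumptions, not the implementation described in the specification: the function names, the hint-word language detector, the toy lexicon, and the phoneme mapping table are all hypothetical, and the final step returns the target phonemes rather than rendering audio.

```python
# Illustrative sketch only. Every name and table below is hypothetical,
# and the phoneme symbols are toy values, not a real phoneme inventory.

TARGET_LANGUAGE = "en"

# Toy hint words used to guess the language a text string originated in.
LANGUAGE_HINTS = {
    "es": {"la", "el", "vida", "loca"},
    "en": {"the", "life", "crazy"},
}

# Toy pronunciation lexicons: word -> phonemes in that word's own language.
SOURCE_LEXICON = {
    "es": {"la": ["l", "a"], "vida": ["b", "i", "ð", "a"], "loca": ["l", "o", "k", "a"]},
    "en": {"the": ["ð", "ə"], "life": ["l", "aɪ", "f"]},
}

# Toy mapping from source phonemes to target-language phonemes.
# Phonemes without an entry are passed through unchanged.
PHONEME_MAP = {
    ("es", "en"): {"ð": "d", "a": "ɑ", "o": "oʊ"},
}


def determine_source_language(text, default=TARGET_LANGUAGE):
    """Step 1: pick the language whose hint words match the text most often."""
    words = set(text.lower().split())
    best_language, best_score = default, 0
    for language, hints in LANGUAGE_HINTS.items():
        score = len(words & hints)
        if score > best_score:
            best_language, best_score = language, score
    return best_language


def obtain_source_phonemes(text, source_language):
    """Step 2: look up each word in the source-language lexicon."""
    lexicon = SOURCE_LEXICON.get(source_language, {})
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(lexicon.get(word, []))  # unknown words are skipped here
    return phonemes


def obtain_target_phonemes(source_phonemes, source_language, target_language):
    """Step 3: map each source phoneme to a target-language phoneme."""
    mapping = PHONEME_MAP.get((source_language, target_language), {})
    return [mapping.get(phoneme, phoneme) for phoneme in source_phonemes]


def synthesize(text, target_language=TARGET_LANGUAGE):
    """Step 4 stand-in: return the target phonemes a synthesizer would render."""
    source_language = determine_source_language(text)
    source = obtain_source_phonemes(text, source_language)
    return obtain_target_phonemes(source, source_language, target_language)


print(synthesize("La Vida Loca"))
```

The sketch keeps the source-language lookup and the target-language mapping as separate steps, mirroring the order in which the claim recites them; a real system would replace the final return value with a call into a speech renderer.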
10. An apparatus for synthesizing speech in a target language based on a text string, the apparatus comprising:
a pre-processor for determining a native language in which the text string has originated, obtaining a plurality of target phonemes based on a plurality of native phonemes, the native phonemes being phonemes in the native language of the text string, and the target phonemes being phonemes in the target language associated with the native phonemes; and
a synthesizer coupled to the pre-processor for synthesizing the target phonemes to speech.
Dependent claims: 11
12. A method for synthesizing speech based on a text string, the method comprising:
receiving a first text string associated with a media asset;
identifying an omitted word based on the first text string;
determining a confidence of the identified omitted word;
creating a second text string using the identified omitted word if the determined confidence exceeds a threshold; and
providing synthesized speech based on the second text string.
Dependent claims: 13, 14, 15, 16
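The confidence-gated restoration of an omitted word recited in claim 12 can likewise be sketched in Python. This is an assumption-laden illustration, not the claimed method: the lookup table of adjacent word pairs, its confidence values, the default threshold of 0.8, and all function names are hypothetical, and the speech-rendering step is left out.

```python
# Illustrative sketch only. The table, confidences, threshold, and names
# are hypothetical and chosen purely to show the control flow.

# Adjacent word pair -> (word assumed to be omitted between them, confidence).
OMISSION_TABLE = {
    ("of", "beatles"): ("the", 0.9),
    ("live", "wembley"): ("at", 0.6),
}


def identify_omitted_word(words):
    """Return (insert position, omitted word, confidence), or None if no match."""
    for i in range(len(words) - 1):
        key = (words[i].lower(), words[i + 1].lower())
        if key in OMISSION_TABLE:
            word, confidence = OMISSION_TABLE[key]
            return i + 1, word, confidence
    return None


def create_second_text_string(first_text, threshold=0.8):
    """Insert the omitted word only if its confidence exceeds the threshold."""
    words = first_text.split()
    found = identify_omitted_word(words)
    if found is None:
        return first_text
    position, word, confidence = found
    if confidence > threshold:
        words.insert(position, word)
        return " ".join(words)
    return first_text  # confidence too low: keep the first text string


print(create_second_text_string("Best of Beatles"))
print(create_second_text_string("Live Wembley"))
```

With the toy table above, the first call restores the omitted word because its confidence is above the threshold, while the second leaves the string unchanged; lowering the threshold would let the lower-confidence insertion through.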
Specification