SYSTEM AND METHOD FOR SPEECH-TO-SPEECH TRANSLATION
First Claim
1. A speech-to-speech translation system comprising:
a processor;
an audio input device in electrical communication with the processor, the input device configured to receive audio input including an input speech sample of a user in a first language;
an audio output device in electrical communication with the processor, the audio output device configured to output audio including a translation of the input speech sample translated to a second language, wherein the output audio comprises basic sound units in the voice of the user; and
a computer-readable storage medium comprising;
a voice recognition module configured to receive the input speech sample and convert the input speech sample to text in the first language;
a translation module configured to translate the text in the first language to text in a second language; and
a speech synthesis module configured to receive the text in the second language and determine corresponding basic sound units in the voice of the user contained within a user phonetic dictionary to thereby generate speech in the second language in the unique voice of the user.
Abstract
Disclosed herein are systems and methods for receiving an input speech sample in a first language and outputting a translated speech sample in a second language in the unique voice of a user. According to several embodiments, a translation system includes a translation mode performing the above functions and a training mode for developing a voice recognition database and a user phonetic dictionary. A speech recognition module uses a voice recognition database to recognize and transcribe the input speech samples in a first language. The text in the first language is translated to text in a second language, and a speech synthesizer develops an output speech in the unique voice of the user utilizing a user phonetic dictionary. The user phonetic dictionary may contain basic sound units, including phones, diphones, triphones, and/or words.
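As an illustration of the synthesis step described in the abstract, the following is a minimal sketch, not the patented implementation. It assumes the user phonetic dictionary is a mapping from sequences of phone labels to recorded clips in the user's voice, and that the synthesizer prefers the longest recorded unit available (triphone, then diphone, then phone). The name `select_units`, the greedy longest-match strategy, the phone labels, and the clip file names are all hypothetical; the abstract does not specify how units are selected.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a unit is keyed by the tuple of phone labels it
# covers, and the value identifies a recording of that unit in the user's voice.
UnitKey = Tuple[str, ...]


def select_units(
    phones: List[str],
    dictionary: Dict[UnitKey, str],
    max_len: int = 3,
) -> List[str]:
    """Cover a phone sequence with dictionary units, longest match first.

    This greedy strategy is an assumption made for illustration only.
    """
    clips: List[str] = []
    i = 0
    while i < len(phones):
        # Try the longest candidate unit first, then progressively shorter ones.
        for n in range(min(max_len, len(phones) - i), 0, -1):
            key = tuple(phones[i:i + n])
            if key in dictionary:
                clips.append(dictionary[key])
                i += n
                break
        else:
            # No recorded unit starts at this phone; a training mode would
            # presumably need to collect it before synthesis can proceed.
            raise KeyError(f"no unit recorded for phone {phones[i]!r}")
    return clips


# Toy dictionary with one diphone and two single phones (made-up labels).
user_dictionary = {("o",): "o.wav", ("l", "a"): "la.wav", ("h",): "h.wav"}
selected = select_units(["h", "o", "l", "a"], user_dictionary)
```

With this toy dictionary, `selected` holds three clips: the single phones "h" and "o", followed by the diphone "l a", because the longer unit is preferred wherever one has been recorded.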
195 Citations
23 Claims
1. A speech-to-speech translation system comprising:
a processor;
an audio input device in electrical communication with the processor, the input device configured to receive audio input including an input speech sample of a user in a first language;
an audio output device in electrical communication with the processor, the audio output device configured to output audio including a translation of the input speech sample translated to a second language, wherein the output audio comprises basic sound units in the voice of the user; and
a computer-readable storage medium comprising;
a voice recognition module configured to receive the input speech sample and convert the input speech sample to text in the first language;
a translation module configured to translate the text in the first language to text in a second language; and
a speech synthesis module configured to receive the text in the second language and determine corresponding basic sound units in the voice of the user contained within a user phonetic dictionary to thereby generate speech in the second language in the unique voice of the user.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. A computer-implemented method for translating speech from a first language to a second language, the method comprising:
receiving an input speech sample on a computer system via an input device, the input speech sample spoken by a user in a first language;
the computer system recognizing the input speech sample in the first language;
the computer system converting the input speech sample in the first language to text in the first language;
the computer system translating the text in the first language to text in a second language;
the computer system synthesizing the text in the second language into speech in the second language by determining corresponding basic sound units within a user phonetic dictionary; and
the computer system generating an output of the speech in the second language in the unique voice.
Dependent claims: 16, 17, 18, 19, 20, 21
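The steps recited in claim 15 can be read as a three-stage pipeline: recognition, translation, synthesis. The sketch below only shows how the stages chain together. The class `SpeechTranslator`, its field names, and the toy stage functions are hypothetical stand-ins, not the components disclosed in the patent; a real system would plug in an actual recognizer, translator, and a synthesizer backed by the user phonetic dictionary.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SpeechTranslator:
    """Chains the three stages of the method; every stage is injected."""

    recognize: Callable[[bytes], str]       # speech in first language -> text
    translate: Callable[[str], str]         # first-language text -> second-language text
    synthesize: Callable[[str], List[str]]  # text -> sound units in the user's voice

    def run(self, speech_sample: bytes) -> List[str]:
        source_text = self.recognize(speech_sample)
        target_text = self.translate(source_text)
        return self.synthesize(target_text)


# Toy stages for illustration only: the "audio" is just UTF-8 text, the
# "translation" is a one-entry lookup table, and each letter stands in for a
# basic sound unit recorded in the user's voice.
toy = SpeechTranslator(
    recognize=lambda audio: audio.decode("utf-8"),
    translate=lambda text: {"hello": "hola"}.get(text, text),
    synthesize=lambda text: [f"{ch}.wav" for ch in text],
)
units = toy.run(b"hello")
```

Keeping each stage behind a plain callable is a design choice of this sketch: it lets any one stage be swapped or tested in isolation without touching the others.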
22. A system for translating speech from a first language to a second language, comprising:
means to receive an input speech in a first language in a unique voice;
means to convert the input speech in the first language to text in the first language;
means to translate the text in the first language to text in a second language;
means to synthesize the text in the second language into speech in the second language by determining corresponding basic sound units within a user phonetic dictionary;
means to output the speech in the second language in the unique voice.
Dependent claims: 23
Specification