Method and apparatus for speech translation with unrecognized segments
Abstract
A method for handling unrecognized words and phrases in spoken language translation systems. The method applies to open-ended word or phrase categories, such as proper names. It demarcates one or more unrecognized segments in the input speech signal and splices these segments into the appropriate position in the synthesized speech signal for the translation.
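The splicing step described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the label format `<SEG1>`, the function names `splice_segments` and `fake_tts`, and the use of plain integer lists as a stand-in for digitized speech are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Samples = List[int]  # stand-in for digitized speech samples (assumption)


def splice_segments(
    translated: List[str],
    positions: Dict[str, Tuple[int, int]],
    source_audio: Samples,
    synthesize: Callable[[str], Samples],
) -> Samples:
    """Build output audio, copying original audio wherever a label appears."""
    output: Samples = []
    for token in translated:
        if token in positions:
            # Label for an unrecognized segment: reuse the speaker's own audio.
            start, end = positions[token]
            output.extend(source_audio[start:end])
        else:
            # Ordinary translated word: hand it to the synthesizer.
            output.extend(synthesize(token))
    return output


def fake_tts(word: str) -> Samples:
    """Hypothetical synthesizer stub: one placeholder sample per character."""
    return [-1] * len(word)


source_audio = list(range(100))    # toy input signal
positions = {"<SEG1>": (40, 45)}   # sample range of the unrecognized segment
out = splice_segments(["hola", "<SEG1>"], positions, source_audio, fake_tts)
```

Here `out` holds four placeholder samples for the synthesized word, followed by samples 40 through 44 copied unchanged from the input signal.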
6 Claims
1. A speech translation system, comprising:
a speech recognizer, which produces based on digitized speech (i) a recognized text word sequence including one or more labels identifying segments of the digitized speech, and (ii) segment position information for said one or more labels identifying said segments of said digitized speech;
a language translation component coupled to the speech recognizer, the language translation component produces a translated text word sequence that includes said one or more labels identifying said segments of said digitized speech based on the recognized text word sequence;
a speech segment storage component coupled to the speech recognizer, the speech segment storage component stores said digitized speech and said segment position information and produces said segments of said digitized speech based on said segment position information; and
a speech synthesizer coupled to the language translation component and the speech segment storage component, the speech synthesizer produces, based on the translated text word sequence, target language output speech including said segments of said digitized speech in place of said one or more labels.
Dependent claims: 2
3. A method of translating speech, comprising the following steps:
(A) making from digitized speech a recognized text word sequence including one or more labels identifying segments of said digitized speech and segment position information;
(B) translating the recognized text word sequence into a translated text word sequence that includes said one or more labels identifying said segments of said digitized speech; and
(C) synthesizing target language output speech from the translated text word sequence wherein said segments of said digitized speech are substituted in place of said one or more labels based on said segment position information.
Dependent claims: 4, 5
6. A speech translation system, comprising:
means for making from digitized speech a recognized text word sequence including one or more labels identifying segments of said digitized speech and segment position information for said one or more labels;
means for translating the recognized text word sequence into a translated text word sequence that includes said one or more labels identifying said segments of said digitized speech; and
means for synthesizing target language output speech from the translated text word sequence wherein said segments of said digitized speech are substituted in place of said one or more labels based on said segment position information.
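The claims above recite a storage component that keeps the digitized speech together with segment position information, and a translation step that carries the labels through unchanged. A rough Python sketch of those two pieces follows. The class `SegmentStore`, the function `translate_keeping_labels`, the `<SEG1>` label format, and the word-for-word lexicon are hypothetical and chosen only for illustration; the claims do not prescribe any of them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SegmentStore:
    """Holds the digitized speech and the sample range for each label."""
    speech: List[int]
    positions: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def add(self, label: str, start: int, end: int) -> None:
        # Reject ranges that fall outside the stored signal.
        if not 0 <= start < end <= len(self.speech):
            raise ValueError(f"bad segment bounds for {label}: {start}, {end}")
        self.positions[label] = (start, end)

    def segment(self, label: str) -> List[int]:
        # Produce the stored speech segment for a label.
        start, end = self.positions[label]
        return self.speech[start:end]


def translate_keeping_labels(
    words: List[str], lexicon: Dict[str, str], store: SegmentStore
) -> List[str]:
    """Toy word-for-word translation; segment labels pass through as-is."""
    return [w if w in store.positions else lexicon.get(w, w) for w in words]


store = SegmentStore(speech=list(range(20)))   # toy digitized speech
store.add("<SEG1>", 12, 16)                    # position of an unrecognized span
recognized = ["hello", "<SEG1>"]               # recognizer output with a label
translated = translate_keeping_labels(recognized, {"hello": "hola"}, store)
```

In this sketch the label survives translation, so a later synthesis stage can call `store.segment("<SEG1>")` to fetch the original audio for that position.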
Specification