Method and apparatus for time-synchronized translation and synthesis of natural-language speech
Abstract
A multi-lingual time-synchronized translation system and method provide automatic time-synchronized spoken translations of spoken phrases. The multi-lingual time-synchronized translation system includes a phrase-spotting mechanism, optionally a language understanding mechanism, a translation mechanism, a speech output mechanism, and an event-measuring mechanism. The phrase-spotting mechanism identifies a spoken phrase from a restricted domain of phrases. The language understanding mechanism, if present, maps the identified phrase onto a small set of formal phrases. The translation mechanism maps the formal phrase onto a well-formed phrase in one or more target languages. The speech output mechanism produces high-quality output speech using the output of the event-measuring mechanism for time synchronization. The event-measuring mechanism measures the duration of various key events in the source phrase. Event duration could be, for example, the overall duration of the input phrase, the duration of the phrase with interword silences omitted, or some other relevant durational feature. The present invention recognizes that quality improvements can be achieved by restricting the task domain under consideration.
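The two example event durations named in the abstract can be illustrated with a minimal sketch. It assumes the phrase spotter reports word-level start and end times; the `WordSegment` type and both function names are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class WordSegment:
    """One spotted word with its start and end time in seconds (assumed input)."""
    word: str
    start: float
    end: float


def overall_duration(segments: Sequence[WordSegment]) -> float:
    """Duration from the start of the first word to the end of the last word."""
    if not segments:
        return 0.0
    return segments[-1].end - segments[0].start


def speech_only_duration(segments: Sequence[WordSegment]) -> float:
    """Duration of the phrase with interword silences omitted."""
    return sum(seg.end - seg.start for seg in segments)


# Hypothetical timings for a four-word source phrase.
phrase = [
    WordSegment("where", 0.10, 0.40),
    WordSegment("is", 0.55, 0.70),
    WordSegment("the", 0.70, 0.80),
    WordSegment("station", 1.00, 1.60),
]
```

Either value could serve as the target duration passed to the speech output stage; which one is appropriate is a design choice that the abstract leaves open.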
57 Claims
1. A system for translating a source language into at least one target language, comprising:
a phrase-spotting system for identifying a spoken phrase from a restricted domain of phrases;
a set of prerecorded translations of said restricted domain of phrases; and
a playback mechanism for reproducing said spoken phrase in said at least one target language, wherein a duration of said prerecorded translation is adjusted to approximately match a duration of said spoken phrase.
Dependent claims: 2, 3, 4, 5, 6, 7
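As an illustration of the duration adjustment recited in claim 1, the sketch below looks up a prerecorded translation and computes a playback rate that brings its duration close to that of the spoken phrase. The lookup table, file path, and function names are hypothetical, and clamping the rate to a fixed range is an assumption made here as one way to read "approximately match"; the claim does not prescribe it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recording:
    """A prerecorded translation and its natural duration in seconds."""
    path: str
    duration: float


# Hypothetical restricted domain: (source phrase, target language) -> recording.
PRERECORDED = {
    ("where is the station", "fr"): Recording("fr/where_is_the_station.wav", 1.8),
}


def playback_rate(recorded: float, source: float,
                  min_rate: float = 0.8, max_rate: float = 1.25) -> float:
    """Rate at which to play the recording so it lasts about `source` seconds.

    A rate above 1 shortens the recording. The clamp limits (an assumption)
    keep the speech from being stretched or compressed too far.
    """
    if source <= 0:
        raise ValueError("source duration must be positive")
    return max(min_rate, min(max_rate, recorded / source))


def plan_playback(phrase: str, lang: str, source_duration: float):
    """Return (path, rate, resulting duration) for a spotted phrase."""
    rec = PRERECORDED[(phrase, lang)]
    rate = playback_rate(rec.duration, source_duration)
    return rec.path, rate, rec.duration / rate
```

With a 1.5 s source phrase, the 1.8 s recording would be played at a rate of about 1.2 and last about 1.5 s. With a 1.0 s source phrase, the rate is clamped to 1.25, so the match is only approximate.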
8. A system for translating a source language into at least one target language, comprising:
a phrase-spotting system for identifying a spoken phrase from a restricted domain of phrases, said restricted domain of phrases having a static component and a dynamic component;
a set of prerecorded translations of said static components and said dynamic components of said restricted domain of phrases; and
a playback mechanism for reproducing said spoken phrase in said at least one target language using said prerecorded translations of said static components and said dynamic components, wherein the duration of said prerecorded translation is adjusted to approximately match the duration of said spoken phrase.
Dependent claims: 9, 10, 11, 12, 13, 14, 15, 16
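One possible reading of the static and dynamic components in claim 8 is a fixed carrier phrase with slots that are filled at run time. The sketch below assembles prerecorded clips for both kinds of component and derives a single playback rate for the concatenated result. The template, clip names, durations, and the choice of one uniform rate are all hypothetical illustrations, not details from the patent.

```python
from typing import Dict, List, NamedTuple, Tuple


class Clip(NamedTuple):
    name: str
    duration: float  # seconds


# Hypothetical prerecorded clips for static components (fixed wording).
STATIC: Dict[str, Clip] = {
    "flight_prefix": Clip("es/el_vuelo.wav", 0.6),
    "departs_at": Clip("es/sale_a_las.wav", 0.9),
}

# Hypothetical prerecorded clips for dynamic components (slot values).
DYNAMIC: Dict[str, Dict[str, Clip]] = {
    "number": {"12": Clip("es/doce.wav", 0.5)},
    "hour": {"3": Clip("es/tres.wav", 0.4)},
}

# Target-language ordering of static and dynamic components for one phrase.
TEMPLATE: List[Tuple[str, str]] = [
    ("static", "flight_prefix"),
    ("dynamic", "number"),
    ("static", "departs_at"),
    ("dynamic", "hour"),
]


def assemble(template: List[Tuple[str, str]], slots: Dict[str, str]) -> List[Clip]:
    """Pick the clip for each component, filling dynamic slots from `slots`."""
    clips = []
    for kind, key in template:
        if kind == "static":
            clips.append(STATIC[key])
        else:
            clips.append(DYNAMIC[key][slots[key]])
    return clips


def uniform_rate(clips: List[Clip], source_duration: float) -> float:
    """Single playback rate that makes the concatenated clips last `source_duration`."""
    if source_duration <= 0:
        raise ValueError("source duration must be positive")
    return sum(clip.duration for clip in clips) / source_duration
```

Applying one rate to the whole sequence is the simplest option. Adjusting only the static components, or only the pauses between clips, would be alternative designs that the claim language does not rule out.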
17. A system for translating a source language into at least one target language, comprising:
a natural-language understanding system that infers a phrase in an underlying formal language from a spoken phrase;
a text production mechanism in which a formal language phrase is converted to natural text in said at least one target language;
a set of prerecorded translations in said at least one target language; and
a playback mechanism driven by said natural text for reproducing said spoken phrase in said at least one target language using said prerecorded translations, wherein the duration of said prerecorded translation is adjusted to approximately match the duration of said spoken phrase.
Dependent claims: 18, 19, 20, 21, 22, 23
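The chain in claim 17 (spoken phrase, then formal-language phrase, then natural text, then prerecorded playback) can be sketched as three table lookups. The keyword-set matcher below is a deliberately simple stand-in for a natural-language understanding system, and the formal phrase notation, tables, and file path are hypothetical.

```python
import re
from typing import Optional, Tuple

# Hypothetical formal phrases, each triggered by any of several keyword sets.
FORMAL_PATTERNS = {
    "ASK_LOCATION(station)": [{"where", "station"}, {"find", "station"}],
}

# Text production: (formal phrase, target language) -> natural text.
TEXT = {
    ("ASK_LOCATION(station)", "de"): "Wo ist der Bahnhof?",
}

# Prerecorded translations keyed by natural text: (path, duration in seconds).
RECORDINGS = {
    "Wo ist der Bahnhof?": ("de/wo_ist_der_bahnhof.wav", 1.3),
}


def infer_formal(utterance: str) -> Optional[str]:
    """Map a recognized utterance onto a formal phrase, or None if out of domain."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    for formal, keyword_sets in FORMAL_PATTERNS.items():
        if any(keywords <= words for keywords in keyword_sets):
            return formal
    return None


def translate(utterance: str, lang: str,
              source_duration: float) -> Optional[Tuple[str, str, float]]:
    """Return (natural text, recording path, playback rate), or None."""
    formal = infer_formal(utterance)
    if formal is None:
        return None
    text = TEXT[(formal, lang)]
    path, recorded = RECORDINGS[text]
    return text, path, recorded / source_duration
```

Different wordings such as "Excuse me, where is the station?" and "How do I find the station?" map onto the same formal phrase and therefore onto the same recording, which reflects the mapping onto a small set of formal phrases described in the abstract.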
24. A system for translating a source language into at least one target language, comprising:
a phrase-spotting system for identifying a spoken phrase from a restricted domain of phrases;
a set of prerecorded translations of said restricted domain of phrases; and
a playback mechanism for reproducing said spoken phrase in said at least one target language, wherein the duration of said spoken phrase or said prerecorded translation is adjusted to synchronize said spoken phrase and said prerecorded translation.
Dependent claims: 25, 26
27. A method for translating a source language into at least one target language, comprising:
identifying a spoken phrase from a restricted domain of phrases;
obtaining a prerecorded translation of said spoken phrase; and
reproducing said spoken phrase in said at least one target language, wherein a duration of said prerecorded translation is adjusted to approximately match a duration of said spoken phrase.
Dependent claims: 28, 29, 30, 31, 32
33. A method for translating a source language into at least one target language, comprising:
identifying a spoken phrase from a restricted domain of phrases, said restricted domain of phrases having a static component and a dynamic component;
obtaining a prerecorded translation of said static components and said dynamic components of said spoken phrase; and
reproducing said spoken phrase in said at least one target language using said prerecorded translations of said static components and said dynamic components, wherein the duration of said prerecorded translation is adjusted to approximately match the duration of said spoken phrase.
Dependent claims: 34, 35, 36, 37, 38, 39, 40
41. A method for translating a source language into at least one target language, comprising:
inferring a phrase in an underlying formal language from a spoken phrase using a natural-language understanding system;
converting a formal language phrase to natural text in said at least one target language;
obtaining a prerecorded translation in said at least one target language; and
reproducing said spoken phrase in said at least one target language using said prerecorded translations and driven by said natural text, wherein the duration of said prerecorded translation is adjusted to approximately match the duration of said spoken phrase.
Dependent claims: 42, 43, 44, 45, 46, 47, 48
49. A method for translating a source language into at least one target language, comprising:
identifying a spoken phrase from a restricted domain of phrases;
obtaining a prerecorded translation of said spoken phrase; and
reproducing said spoken phrase in said at least one target language, wherein the duration of said spoken phrase or said prerecorded translation is adjusted to synchronize said spoken phrase and said prerecorded translation.
Dependent claims: 50, 51
52. A system for translating a source language into at least one of a plurality of target languages, comprising:
a memory that stores computer-readable code; and
a processor operatively coupled to said memory, said processor configured to implement said computer-readable code, said computer-readable code configured to:
identify a spoken phrase from a restricted domain of phrases;
obtain a prerecorded translation of said spoken phrase; and
reproduce said spoken phrase in said at least one target language, wherein a duration of said prerecorded translation is adjusted to approximately match a duration of said spoken phrase.
53. An article of manufacture, comprising:
a computer readable medium having computer readable code means embodied thereon, said computer readable program code means comprising:
a step to identify a spoken phrase from a restricted domain of phrases;
a step to obtain a prerecorded translation of said spoken phrase; and
a step to reproduce said spoken phrase in said at least one target language, wherein a duration of said prerecorded translation is adjusted to approximately match a duration of said spoken phrase.
54. A system for translating a source language into at least one of a plurality of target languages, comprising:
a memory that stores computer-readable code; and
a processor operatively coupled to said memory, said processor configured to implement said computer-readable code, said computer-readable code configured to:
identify a spoken phrase from a restricted domain of phrases, said restricted domain of phrases having a static component and a dynamic component;
obtain a prerecorded translation of said static components and said dynamic components of said spoken phrase; and
reproduce said spoken phrase in said at least one target language using said prerecorded translations of said static components and said dynamic components, wherein the duration of said prerecorded translation is adjusted to approximately match the duration of said spoken phrase.
55. An article of manufacture, comprising:
a computer readable medium having computer readable code means embodied thereon, said computer readable program code means comprising:
a step to identify a spoken phrase from a restricted domain of phrases, said restricted domain of phrases having a static component and a dynamic component;
a step to obtain a prerecorded translation of said static components and said dynamic components of said spoken phrase; and
a step to reproduce said spoken phrase in said at least one target language using said prerecorded translations of said static components and said dynamic components, wherein the duration of said prerecorded translation is adjusted to approximately match the duration of said spoken phrase.
56. A system for translating a source language into at least one of a plurality of target languages, comprising:
a memory that stores computer-readable code; and
a processor operatively coupled to said memory, said processor configured to implement said computer-readable code, said computer-readable code configured to:
infer a phrase in an underlying formal language from a spoken phrase using a natural-language understanding system;
convert a formal language phrase to natural text in said at least one target language;
obtain a prerecorded translation in said at least one target language; and
reproduce said spoken phrase in said at least one target language using said prerecorded translations and driven by said natural text, wherein the duration of said prerecorded translation is adjusted to approximately match the duration of said spoken phrase.
57. An article of manufacture, comprising:
a computer readable medium having computer readable code means embodied thereon, said computer readable program code means comprising:
a step to infer a phrase in an underlying formal language from a spoken phrase using a natural-language understanding system;
a step to convert a formal language phrase to natural text in said at least one target language;
a step to obtain a prerecorded translation in said at least one target language; and
a step to reproduce said spoken phrase in said at least one target language using said prerecorded translations and driven by said natural text, wherein the duration of said prerecorded translation is adjusted to approximately match the duration of said spoken phrase.
Specification