In-Call Translation

US 20150347399A1
Filed: 02/11/2015
Published: 12/03/2015
Est. Priority Date: 05/27/2014
Status: Abandoned Application



Abstract

Call audio of a call between a source user speaking a source language and a target user speaking a target language is received from a remote source user device of a source user via a communication network of a communication system, the call audio comprising speech of the source user in the source language. An automatic speech recognition procedure is performed on the call audio. A translation of the source user's speech is generated in the target language using the results of the speech recognition procedure. A translated synthetic speech audio version of the source user's speech is mixed with the source user's call audio and/or with translated audio of the target user's speech in the source language. The mixed audio signal is transmitted to a remote target user device of the target user via the communication network for outputting to at least the target user during the call.
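The flow summarized in the abstract can be pictured as a chain of five stages. The sketch below is illustrative only and is not taken from the application: the `relay_call_audio` and `mix` functions, the injected `recognize`, `translate`, `synthesize`, and `send` callables, and the representation of audio as a list of float samples are all assumptions made for the example.

```python
from typing import Callable, List

Audio = List[float]  # assumption: mono samples in the range [-1.0, 1.0]


def mix(first: Audio, second: Audio) -> Audio:
    """Add two signals sample by sample, padding the shorter with silence."""
    length = max(len(first), len(second))
    a = first + [0.0] * (length - len(first))
    b = second + [0.0] * (length - len(second))
    # Clip so the sum stays inside the valid sample range.
    return [max(-1.0, min(1.0, x + y)) for x, y in zip(a, b)]


def relay_call_audio(
    call_audio: Audio,
    recognize: Callable[[Audio], str],
    translate: Callable[[str], str],
    synthesize: Callable[[str], Audio],
    send: Callable[[Audio], None],
) -> Audio:
    """Run one pass of the receive, recognize, translate, mix, transmit flow."""
    source_text = recognize(call_audio)    # automatic speech recognition
    target_text = translate(source_text)   # translation into the target language
    synthetic = synthesize(target_text)    # translated synthetic speech
    mixed = mix(synthetic, call_audio)     # mix with the source user's call audio
    send(mixed)                            # transmit toward the target user device
    return mixed
```

Passing the stages in as callables keeps the sketch independent of any particular recognition, translation, or speech synthesis engine; simple stand-in functions are enough to exercise the flow.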

69 Citations

20 Claims

1. A language translation relay system for use in a communication system, the communication system for effecting a voice or video call between at least a source user speaking a source language and a target user speaking a target language, the relay system comprising:
- an input configured to receive call audio of the call from a remote source user device of the source user via a communication network of the communication system, the call audio comprising speech of the source user in the source language;
  
  a speech recognition component configured to perform an automatic speech recognition procedure on the call audio;
  
  a translation component configured to generate a translation of the source user's speech in the target language using the results of the speech recognition procedure, the translation comprising a translated synthetic speech audio version of the source user's speech in the target language for playing out at the target user device, the synthetic speech generated based on the results of the speech recognition procedure;
  
  a mixing component configured to mix the synthetic speech with the source user's call audio and/or with translated audio of the target user's speech in the source language, thereby generating a mixed audio signal; and
  
  an output configured to transmit the mixed audio signal to at least a remote target user device of the target user via the communication network for outputting to the target user during the call.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
  - 2. A language translation relay system according to claim 1 in which users of the communication system are uniquely identified by associated user identifiers, the relay system configured to implement a translator agent, the translator agent also being uniquely identified by an associated user identifier, thereby facilitating communication with the agent substantially as if it were another user of the communication system;
    - wherein the translator agent is configured, responsive to a translation request requesting that the translator agent participate in the call, to effect the speech recognition procedure and the generation of the translation whilst participating in the call.
  - 3. A language translation relay system according to claim 1 wherein the translation further comprises a translated text version of the source user's speech in the target language for displaying at the target user device and/or for converting to synthetic speech at the target user device, the target language text generated based on the results of the speech recognition procedure, wherein the output is further configured to transmit the translated text version to the target user device.
  - 4. A language relay translation system according to claim 1 embodied by one or more servers of the communication network.
  - 5. A language translation relay system according to claim 1 comprising a further input configured to receive further call audio of the call from the target user device via the network, the further call audio comprising speech of the target user in the target language;
    - wherein the call audio and the further call audio are received as separate audio signals and the relay system is configured to generate, separately from the translation of the source user's speech, a further translation of the target user's speech in the source language to be transmitted to the source user.
  - 6. A language translation system according to claim 5 wherein the call has at least a third user speaking a third language as an additional participant, the translator relay system configured to generate, separately from the translations of the source and target users' speech, a third translation of the third user's speech in the source language to be transmitted to at least the source user and/or a fourth translation of the third user's speech in the target language to be transmitted to at least the target user.
  - 7. A language translation relay system according to claim 1 comprising another output configured to transmit information pertaining to the results of the speech recognition procedure to the source user device of the source user and/or to the target user device of the target user.
  - 8. A language relay translation system according to claim 7 comprising another input connected to receive feedback data via the network from the source user device of the source user, the feedback data conveying source user feedback pertaining to the results of the speech recognition procedure, wherein the speech recognition component is configured based on the received feedback data.
  - 9. A translation relay system according to claim 7 wherein the speech recognition procedure is configured, for at least one interval of speech activity by the source user, to generate partial speech recognition results whilst that speech activity is ongoing before generating final speech recognition results when that speech activity is completed;
    - and wherein the translation component is configured to generate the translation using the final results but the other output is configured to transmit information pertaining to the partial results before the translation is generated for outputting to the source user, thereby inviting the source user to influence the subsequent translation when inaccuracies are present in the partial results.
  - 10. A language translation relay system according to claim 1 wherein the translation is turn-based, being generated per interval of source speech activity.
  - 11. A language translation relay system according to claim 1 wherein the translation is substantially contemporaneous with the source speech, being generated for at least one interval of source speech activity per multiple segments of that interval.
  - 12. A language translation relay system according to claim 1 wherein the target user is one of multiple target users participating in the call who speak the target language, and the output is configured to transmit the translation in the target language to the multiple target users.
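The split in claim 9 between partial and final recognition results can be illustrated with a short sketch. Everything in it is hypothetical: the `RecognitionResult` type, the `handle_results` function, and the two callbacks are names invented for the example, and the recognizer is replaced by a hand-written sequence of results.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class RecognitionResult:
    text: str
    is_final: bool  # False while the interval of speech activity is ongoing


def handle_results(
    results: Iterable[RecognitionResult],
    show_to_source: Callable[[str], None],
    translate: Callable[[str], str],
) -> List[str]:
    """Forward partial results to the source user; translate only final ones."""
    translations: List[str] = []
    for result in results:
        if result.is_final:
            # The speech interval has ended, so the translation is generated now.
            translations.append(translate(result.text))
        else:
            # Shown before any translation exists, so the speaker can react
            # to recognition errors while still talking.
            show_to_source(result.text)
    return translations
```

In this arrangement the translation step never sees a partial result, which matches the claim's wording that the translation uses the final results while the partial results only inform the source user.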

13. A method performed at a language translation relay system of a communication system, the communication system for effecting a voice or video call between at least a source user speaking a source language and a target user speaking a target language, the method comprising:
- receiving call audio of the call from a remote source user device of the source user via a communication network of the communication system, the call audio comprising speech of the source user in the source language;
  
  performing an automatic speech recognition procedure on the call audio;
  
  generating a translation of the source user's speech in the target language using the results of the speech recognition procedure, the translation comprising a translated synthetic speech audio version of the source user's speech in the target language for playing out at the target user device, the synthetic speech generated based on the results of the speech recognition procedure;
  
  mixing the synthetic speech with the source user's call audio and/or with translated audio of the target user's speech in the source language, thereby generating a mixed audio signal; and
  
  transmitting the mixed audio signal to a remote target user device of the target user via the communication network for outputting to at least the target user during the call.
- Dependent claims (14, 15, 16, 17, 18, 19)
  - 14. A method according to claim 13 in which users of the communication system are uniquely identified by associated user identifiers, the relay system holding computer code configured to implement a translator agent, the translator agent also being uniquely identified by an associated user identifier, thereby facilitating communication with the agent substantially as if it were another user of the communication system;
    - wherein the method comprises:
      
      receiving a translation request requesting that the translator agent participate in the call; and
      
      responsive to receiving the request, including an instance of the translator agent as a participant in the call, wherein the translator agent instance is configured when thus included to effect the speech recognition procedure and the generation of the translation.
  - 15. A method according to claim 13 wherein the step of generating further comprises:
    - generating a translated text version of the source user's speech in the target language; and transmitting the translated text to the target user device for displaying at the target user device and/or for converting to synthetic speech at the target user device.
  - 16. A method according to claim 13 wherein the language translation relay system is embodied by one or more servers of the communication network.
  - 17. A method according to claim 13 comprising receiving further call audio of the call from the target user device via the network, the further call audio comprising speech of the target user in the target language;
    - wherein the call audio and the further call audio are received as separate audio signals and the method comprises generating, separately from the translation of the source user's speech, a further translation of the target user's speech in the source language to be transmitted to the source user.
  - 18. A method according to claim 17 wherein the call has at least a third user speaking a third language as an additional participant, and the method comprises generating, separately from the translations of the source and target users' speech, a third translation of the third user's speech in the source language to be transmitted to at least the source user and/or a fourth translation of the third user's speech in the target language to be transmitted to at least the target user.
  - 19. A method according to claim 13 comprising transmitting information pertaining to the results of the speech recognition procedure to the source user device of the source user and/or to the target user device of the target user.
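Claims 17 and 18 describe generating a separate translation of each participant's speech for listeners who speak a different language. One way to picture this is as a per-speaker routing plan. The sketch below is an assumption-laden illustration, not the claimed method: the `plan_translations` function, the participant names, and the two-letter language codes are all invented for the example.

```python
from typing import Dict, List, Tuple


def plan_translations(
    speaker: str, languages: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """List (listener, from_language, to_language) for one speaker's audio.

    `languages` maps each call participant to the language they speak.
    Listeners who share the speaker's language need no translation.
    """
    source_language = languages[speaker]
    return [
        (listener, source_language, language)
        for listener, language in languages.items()
        if listener != speaker and language != source_language
    ]


# Hypothetical three-language call with two German speakers.
call = {"alice": "en", "bob": "de", "carol": "fr", "dave": "de"}
for listener, src, dst in plan_translations("carol", call):
    print(f"translate carol's speech {src} -> {dst} for {listener}")
```

Because the plan is computed per speaker, each participant's audio stays a separate signal until it has been translated, in line with the claims' requirement that the call audio streams are received separately.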

20. A computer program product comprising computer code stored on a computer readable storage medium for execution on a language translation relay system of a communication system, the communication system for effecting a voice or video call between at least a source user speaking a source language and a target user speaking a target language, the code configured when executed to cause operations of:
- receiving call audio of the call from a remote source user device of the source user via a communication network of the communication system, the call audio comprising speech of the source user in the source language;
  
  performing an automatic speech recognition procedure on the call audio;
  
  generating a translation of the source user's speech in the target language using the results of the speech recognition procedure, the translation comprising a translated synthetic speech audio version of the source user's speech in the target language for playing out at the target user device, the synthetic speech audio version generated based on the results of the speech recognition procedure;
  
  mixing the synthetic speech with the source user's call audio and/or with translated audio of the target user's speech in the source language, thereby generating a mixed audio signal; and
  
  transmitting the mixed audio signal to at least a remote target user device of the target user via the communication network for outputting to the target user during the call.


Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Inventors: Aue, Anthony; Menezes, Arul A.; Lindblom, Jonas Nils; Furesj, Fredrik; Greborio, Pierre P.N.
Application Number: US14/620,142
Publication Number: US 20150347399A1
Field of Search
US Class Current

1/1
CPC Class Codes

G06F 40/58   Use of machine translation,...

H04M 11/10   with dictation recording an...

H04M 2201/39   using speech synthesis spee...

H04M 2201/40   using speech recognition sp...

H04M 2203/2061   Language aspects

H04M 2242/12   Language recognition, selec...

H04M 3/42   Systems providing special s...

H04M 3/4936   Speech interaction details ...

H04W 4/14   Short messaging services, e...

H04W 4/18   Information format or conte...
