Auto-translation for multi user audio and video
Abstract
The disclosed subject matter provides a system, a computer-readable storage medium, and a method for providing an audio and textual transcript of a communication. A conferencing service may receive audio or audiovisual signals from a plurality of different devices that accept voice communications from participants in a communication, such as a chat or teleconference. The audio signals represent voice (speech) communications input into the respective devices by the participants. A translation services server may receive, over a separate communication channel, the audio signals for translation into a second language. As managed by the translation services server, the audio signals may be converted into textual data. The textual data may be translated into text of different languages based on the language preferences of the end user devices in the teleconference. The translated text may then be converted back into audio signals.
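The pipeline the abstract describes (speech to text, text translation per device preference, then synthesis back to audio) can be sketched as follows. This is a minimal sketch only: `recognize_speech`, `translate_text`, and `synthesize_speech` are hypothetical stand-ins, not an API from the patent, and here they merely simulate the real ASR/MT/TTS stages.

```python
# Minimal sketch of the abstract's pipeline. The three helpers are
# trivial stand-ins that simulate recognition, translation, and synthesis.

def recognize_speech(audio: bytes, lang: str) -> str:
    # Stand-in: pretend the audio decodes directly to a transcript.
    return audio.decode("utf-8")

def translate_text(text: str, src: str, dst: str) -> str:
    # Stand-in: tag the text instead of really translating it.
    return f"[{src}->{dst}] {text}"

def synthesize_speech(text: str, lang: str) -> bytes:
    # Stand-in: pretend the text encodes directly back to audio.
    return text.encode("utf-8")

def route_audio(audio: bytes, source_lang: str,
                preferences: dict[str, str]) -> dict[str, bytes]:
    """Return audio per device, translated only where the preference differs."""
    text = recognize_speech(audio, source_lang)
    out = {}
    for device, preferred in preferences.items():
        if preferred == source_lang:
            out[device] = audio  # same language: pass through unchanged
        else:
            out[device] = synthesize_speech(
                translate_text(text, source_lang, preferred), preferred)
    return out
```

A device whose preference matches the speaker's language receives the original signal untouched; only mismatched devices incur the translate-and-resynthesize path.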
12 Claims
1. A system comprising:
a communication server configured to communicate with two or more end user devices, each including an audio input device and an audio output device, and the communication server is further configured to perform operations comprising:
receiving audio data signals representing communication in a first spoken language from a first end user device; and
transmitting the audio data signals to a second end user device via a first communication channel, the audio data signals when received by the second end user device causing the second end user device to embed a language preference for the second end user device into the audio data signals; and
a translation services server configured to communicate with the end user devices and perform operations comprising:
after the second end user device embeds the language preference into the audio data signals, receiving the audio data signals from the second end user device via a second communication channel that is separate from the first communication channel between the second end user device and the communication server;
executing a language recognition process to recognize the first spoken language of the communication represented by the audio data signals;
determining whether the first spoken language of the communication represented by the audio data signals corresponds to the language preference of the second end user device;
in response to determining that the first spoken language does not correspond to the language preference of the second end user device, translating the audio data signals in the first spoken language into audio data signals in a second spoken language, the second spoken language corresponding to the language preference of the second end user device; and
transmitting the translated audio data signals to the second end user device.
Dependent claims: 2, 3, 4
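The translation-server operations recited in claim 1 (recognize the spoken language, compare it with the embedded preference, translate only on mismatch) reduce to a single conditional. A minimal sketch, assuming a toy signal format in which a `lang:payload` prefix stands in for the claimed language recognition process; `detect_language` and `translate_audio` are hypothetical names, not from the patent:

```python
# Sketch of claim 1's translation-server branch: translate the audio
# only when the recognized language differs from the embedded preference.

def detect_language(audio: bytes) -> str:
    # Stand-in for the claimed language recognition process:
    # assume the signal carries a "lang:payload" prefix (toy format).
    return audio.split(b":", 1)[0].decode()

def translate_audio(audio: bytes, dst: str) -> bytes:
    # Stand-in for speech-to-speech translation: relabel the payload.
    payload = audio.split(b":", 1)[1]
    return dst.encode() + b":" + payload

def serve_translation(audio: bytes, preference: str) -> bytes:
    recognized = detect_language(audio)
    if recognized == preference:
        return audio  # languages correspond: no translation needed
    return translate_audio(audio, preference)
```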
5. A method comprising:
receiving, at a translation services server, audio data signals from a first end user device via a first communication channel, the audio data signals representing communication in a first spoken language and having an embedding of a language preference for the first end user device, wherein the first end user device is configured to:
receive the audio data signals from a communication server via a second communication channel that is separate from the first communication channel between the first end user device and the translation services server, the communication server receiving the audio data signals representing the communication in the first spoken language from a second end user device; and
embed the language preference for the first end user device into the audio data signals received from the communication server;
executing, by the translation services server, a language recognition process to recognize the first spoken language of the communication represented by the audio data signals;
determining, by the translation services server, whether the first spoken language of the communication represented by the audio data signals corresponds to the language preference of the first end user device;
in response to determining that the first spoken language does not correspond to the language preference of the first end user device, translating, by the translation services server, the audio data signals in the first spoken language into audio data signals in a second spoken language, the second spoken language corresponding to the language preference of the first end user device; and
transmitting the translated audio data signals from the translation services server to the first end user device.
Dependent claims: 6, 7, 8
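Claim 5's device-side step, embedding the device's language preference into the audio data signals before forwarding them to the translation services server, could be modeled as prepending a small metadata header to the payload. The framing below (a 4-byte length prefix plus a JSON header) is purely an illustrative assumption; the patent does not specify an embedding format:

```python
import json
import struct

# Illustrative framing (assumed, not from the patent): a 4-byte
# big-endian length prefix, a JSON header carrying the language
# preference, then the raw audio payload.

def embed_preference(audio: bytes, preference: str) -> bytes:
    """Embed a language preference into an audio data signal."""
    header = json.dumps({"lang_pref": preference}).encode("utf-8")
    return struct.pack(">I", len(header)) + header + audio

def extract_preference(signal: bytes) -> tuple[str, bytes]:
    """Recover the embedded preference and the original audio payload."""
    (hlen,) = struct.unpack(">I", signal[:4])
    header = json.loads(signal[4:4 + hlen])
    return header["lang_pref"], signal[4 + hlen:]
```

Round-tripping through `embed_preference` and `extract_preference` returns the preference and the untouched payload, which is the property the claim relies on: the translation server can read the preference without the communication server having participated.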
9. A system comprising:
data processing hardware; and
memory hardware in communication with the data processing hardware, the memory hardware storing instructions that when executed on the data processing hardware cause the data processing hardware to perform operations comprising executing a communication service and a translation service;
wherein the communication service is configured to communicate with two or more end user devices each including an audio input device and an audio output device, and the communication service is further configured to perform operations comprising:
receiving audio data signals representing communication in a first spoken language from a first end user device; and
transmitting the audio data signals to a second end user device via a first communication channel, the audio data signals when received by the second end user device causing the second end user device to embed a language preference for the second end user device into the audio data signals; and
wherein the translation service is configured to communicate with the end user devices and perform operations comprising:
after the second end user device embeds the language preference into the audio data signals, receiving the audio data signals from the second end user device via a second communication channel that is separate from the first communication channel between the second end user device and the communication service;
executing a language recognition process to recognize the first spoken language of the communication represented by the audio data signals;
determining whether the first spoken language of the communication represented by the audio data signals corresponds to the language preference of the second end user device;
in response to determining that the first spoken language does not correspond to the language preference of the second end user device, translating the audio data signals in the first spoken language into audio data signals in a second spoken language, the second spoken language corresponding to the language preference of the second end user device; and
transmitting the translated audio data signals to the second end user device.
Dependent claims: 10, 11, 12
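Claim 9 differs from claim 1 in placing both services on shared data processing hardware. That colocation can be sketched as two methods on one object, with the same trivial stand-ins as above; the class and method names are illustrative assumptions, not terminology from the patent:

```python
class ConferencingHost:
    """Both services on one host, per claim 9 (illustrative stand-ins only)."""

    def communication_service(self, audio: bytes,
                              recipients: list[str]) -> dict[str, bytes]:
        # First channel: fan the untranslated audio out to each recipient.
        return {device: audio for device in recipients}

    def translation_service(self, audio: bytes, recognized: str,
                            preference: str) -> bytes:
        # Second channel: translate only when the recognized language
        # and the embedded preference differ (translation simulated by a tag).
        if recognized == preference:
            return audio
        return b"[" + recognized.encode() + b"->" + preference.encode() + b"]" + audio
```

The two channels of the claim map to the two method calls: devices receive raw audio from `communication_service`, then independently submit it to `translation_service` on the same hardware.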
Specification