AUTO-TRANSLATION FOR MULTI USER AUDIO AND VIDEO

US 20190332679A1
Filed: 07/09/2019
Published: 10/31/2019
Est. Priority Date: 12/12/2011
Status: Active Grant

Abstract

The disclosed subject matter provides a system, a computer-readable storage medium, and a method for providing an audio and textual transcript of a communication. A conferencing service may receive audio or audiovisual signals from a plurality of different devices that receive voice communications from participants in a communication, such as a chat or teleconference. The audio signals represent voice (speech) communications input into the respective devices by the participants. A translation services server may receive the audio signals over a separate communication channel for translation into a second language. As managed by the translation services server, the audio signals may be converted into textual data. The textual data may be translated into text of different languages based on the language preferences of the end user devices in the teleconference. The translated text may be further converted into audio signals.
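
As an illustration of translating one utterance "into text of different languages based on the language preferences of the end user devices", the sketch below groups devices by preferred language so that each language is translated only once. It is not taken from the patent: the function name `fan_out_translations`, the `preferences` mapping, and the `translate` callback are all hypothetical, and the grouping optimization is an assumption about how such a server might be built.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def fan_out_translations(
    text: str,
    source_lang: str,
    preferences: Dict[str, str],
    translate: Callable[[str, str, str], str],
) -> Dict[str, str]:
    """Return a mapping of device id to text in that device's preferred language.

    `preferences` maps a (hypothetical) device id to a language code, and
    `translate(text, source_lang, target_lang)` stands in for any machine
    translation backend.
    """
    # Group devices by preferred language so each language is translated once.
    devices_by_lang: Dict[str, List[str]] = defaultdict(list)
    for device_id, lang in preferences.items():
        devices_by_lang[lang].append(device_id)

    results: Dict[str, str] = {}
    for lang, device_ids in devices_by_lang.items():
        # Devices already using the speaker's language receive the original text.
        if lang == source_lang:
            translated = text
        else:
            translated = translate(text, source_lang, lang)
        for device_id in device_ids:
            results[device_id] = translated
    return results
```

With three participants, two of whom prefer the same language, the `translate` callback is invoked once for that shared language rather than once per device.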

20 Claims

1. A method comprising:
- receiving, at data processing hardware, an output data stream from a user device, the output data stream comprising a language preference indicator and first audio signals representing speech in a first language, the language preference indicator comprising a target language specified by a user of the user device for translating the speech in the first language;
- converting, by the data processing hardware, the first audio signals into text in the first language;
- translating, by the data processing hardware, the text in the first language into text in the target language using the language preference indicator;
- converting, by the data processing hardware, the text in the target language into second audio signals representing a spoken version of the text in the target language; and
- transmitting, by the data processing hardware, the second audio signals representing the speech in the target language to the user device.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
  - 2. The method of claim 1, wherein translating the text in the first language into text in the target language comprises using statistical machine translation to produce a translation in the target language from the text in the first language.
  - 3. The method of claim 1, wherein translating the text in the first language into text in the target language comprises using rules-based machine translation to produce a translation in the target language from the text in the first language.
  - 4. The method of claim 1, wherein the data processing hardware resides on a remote platform in communication with the user device via a wide area network.
  - 5. The method of claim 1, wherein the user device is configured to audibly output the received second audio signals as synthesized speech in the target language.
  - 6. The method of claim 1, wherein the user device is configured to:
    - establish a communication channel with a remote server implementing the data processing hardware; and
    - transmit the output data stream over the communication channel to the remote server.
  - 7. The method of claim 1, further comprising, after converting the first audio signals into the text in the first language, transmitting, by the data processing hardware, the text in the first language to the user device.
  - 8. The method of claim 7, wherein the user device is configured to display the text in the first language on a display of the user device.
  - 9. The method of claim 1, further comprising, when transmitting the second audio signals representing the speech in the target language to the user device, transmitting at least one of the text in the first language or the text in the target language to the user device.
  - 10. The method of claim 1, wherein the user device comprises a microphone configured to capture speech spoken by the user and a speaker configured to output audio.
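
The steps recited in claim 1 (receive, speech-to-text, translate, text-to-speech, transmit) can be sketched as a small pipeline. This is a minimal sketch under stated assumptions, not the patented implementation: `OutputDataStream`, `TranslationResult`, and `translate_speech` are hypothetical names, and the three callbacks are placeholders for whatever speech recognition, machine translation (statistical or rules-based, per claims 2 and 3), and speech synthesis engines are actually used.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class OutputDataStream:
    """Hypothetical container for what the user device sends."""
    target_language: str  # the language preference indicator
    audio: bytes          # first audio signals (speech in the first language)


@dataclass(frozen=True)
class TranslationResult:
    source_text: str  # text in the first language (see claims 7 and 9)
    target_text: str  # text in the target language
    audio: bytes      # second audio signals (synthesized target-language speech)


def translate_speech(
    stream: OutputDataStream,
    speech_to_text: Callable[[bytes], str],
    translate_text: Callable[[str, str], str],
    text_to_speech: Callable[[str, str], bytes],
) -> TranslationResult:
    """Run the receive -> transcribe -> translate -> synthesize steps in order."""
    # Convert the first audio signals into text in the first language.
    source_text = speech_to_text(stream.audio)
    # Translate using the target language carried by the preference indicator.
    target_text = translate_text(source_text, stream.target_language)
    # Convert the translated text into second audio signals.
    audio = text_to_speech(target_text, stream.target_language)
    # The caller would transmit `audio` (and optionally the texts) to the device.
    return TranslationResult(source_text, target_text, audio)
```

Returning both texts alongside the audio keeps the optional transcript transmissions of claims 7 and 9 available to the caller without rerunning any step.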

11. A system comprising:
- data processing hardware; and
- memory hardware in communication with the data processing hardware and storing instructions that when executed by the data processing hardware cause the data processing hardware to perform operations comprising:
  - receiving an output data stream from a user device, the output data stream comprising a language preference indicator and first audio signals representing speech in a first language, the language preference indicator comprising a target language specified by a user of the user device for translating the speech in the first language;
  - converting the first audio signals into text in the first language;
  - translating the text in the first language into text in the target language using the language preference indicator;
  - converting the text in the target language into second audio signals representing a spoken version of the text in the target language; and
  - transmitting the second audio signals representing the speech in the target language to the user device.
- Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19, 20)
  - 12. The system of claim 11, wherein translating the text in the first language into text in the target language comprises using statistical machine translation to produce a translation in the target language from the text in the first language.
  - 13. The system of claim 11, wherein translating the text in the first language into text in the target language comprises using rules-based machine translation to produce a translation in the target language from the text in the first language.
  - 14. The system of claim 11, wherein the data processing hardware resides on a remote platform in communication with the user device via a wide area network.
  - 15. The system of claim 11, wherein the user device is configured to audibly output the received second audio signals as synthesized speech in the target language.
  - 16. The system of claim 11, wherein the user device is configured to:
    - establish a communication channel with a remote server implementing the data processing hardware; and
    - transmit the output data stream over the communication channel to the remote server.
  - 17. The system of claim 11, wherein the operations further comprise, after converting the first audio signals into the text in the first language, transmitting the text in the first language to the user device.
  - 18. The system of claim 17, wherein the user device is configured to display the text in the first language on a display of the user device.
  - 19. The system of claim 11, wherein the operations further comprise, when transmitting the second audio signals representing the speech in the target language to the user device, transmitting at least one of the text in the first language or the text in the target language to the user device.
  - 20. The system of claim 11, wherein the user device comprises a microphone configured to capture speech spoken by the user and a speaker configured to output audio.
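
Claims 6 and 16 have the user device transmit an output data stream that carries both the language preference indicator and the audio. The claims do not specify any wire format, so the framing below is purely an assumption made for illustration: a hypothetical header holding a 1-byte language-tag length and a 4-byte audio length in network byte order, followed by the UTF-8 language tag and the raw audio bytes. The names `encode_frame` and `decode_frame` are likewise hypothetical.

```python
import struct
from typing import Tuple

# Hypothetical header: 1-byte tag length, 4-byte audio length, network byte order.
_HEADER = "!BI"
_HEADER_SIZE = struct.calcsize(_HEADER)


def encode_frame(target_language: str, audio: bytes) -> bytes:
    """Pack a language preference indicator and audio into a single frame."""
    tag = target_language.encode("utf-8")
    if len(tag) > 255:
        raise ValueError("language tag too long for a 1-byte length field")
    return struct.pack(_HEADER, len(tag), len(audio)) + tag + audio


def decode_frame(frame: bytes) -> Tuple[str, bytes]:
    """Unpack a frame produced by encode_frame; raise ValueError if truncated."""
    if len(frame) < _HEADER_SIZE:
        raise ValueError("frame shorter than header")
    tag_len, audio_len = struct.unpack_from(_HEADER, frame, 0)
    tag_end = _HEADER_SIZE + tag_len
    tag_bytes = frame[tag_end - tag_len:tag_end]
    audio = frame[tag_end:tag_end + audio_len]
    # Reject frames whose payload is shorter than the header promised.
    if len(tag_bytes) != tag_len or len(audio) != audio_len:
        raise ValueError("truncated frame")
    return tag_bytes.decode("utf-8"), audio
```

Length-prefixing both fields lets the receiver validate a frame before handing the audio to speech recognition; a real deployment would more likely rely on an established streaming or RPC protocol than on a hand-rolled format like this one.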

Current Assignee: Google LLC (Alphabet Inc.)
Original Assignee: Google LLC (Alphabet Inc.)
Inventors: Kristjansson, Trausti; Huang, John; Lin, Yu-Kuan; Tyan, Hung-ying; Uszkoreit, Jakob David; Estelle, Joshua James; Wang, Chung-yi; Buryak, Kirill; Konishi, Yusuke

Granted Patent: US 10,614,173 B2
CPC Class Codes

G06F 40/58   Use of machine translation,...

G10L 13/00   Speech synthesis; Text to s...

G10L 15/26   Speech to text systems G10L...

H04M 2203/2061   Language aspects

H04M 2242/12   Language recognition, selec...

H04M 3/56   Arrangements for connecting...

H04M 3/568   audio processing specific t...

H04N 7/15   Conference systems

H04N 7/152   Multipoint control units th...
