Auto-translation for multi user audio and video
Abstract
The disclosed subject matter provides a system, a computer-readable storage medium, and a method for providing an audio and textual transcript of a communication. A conferencing service may receive audio or audiovisual signals from a plurality of different devices that accept voice communications from participants in a communication, such as a chat or teleconference. The audio signals represent voice (speech) communications input into the respective devices by the participants. A translation services server may receive, over a separate communication channel, the audio signals for translation into a second language. As managed by the translation services server, the audio signals may be converted into textual data. The textual data may be translated into text of different languages based on the language preferences of the end user devices in the teleconference. The translated text may then be converted back into audio signals.
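The pipeline the abstract describes (speech to text, text translation per device preference, then synthesis back to audio) can be sketched as follows. This is a minimal sketch only: `recognize_speech`, `translate_text`, and `synthesize_speech` are hypothetical stand-ins, not an API from the patent, and here they merely simulate the real ASR/MT/TTS stages.

```python
# Minimal sketch of the abstract's pipeline. The three helpers are
# trivial stand-ins that simulate recognition, translation, and synthesis.

def recognize_speech(audio: bytes, lang: str) -> str:
    # Stand-in: pretend the audio decodes directly to a transcript.
    return audio.decode("utf-8")

def translate_text(text: str, src: str, dst: str) -> str:
    # Stand-in: tag the text instead of really translating it.
    return f"[{src}->{dst}] {text}"

def synthesize_speech(text: str, lang: str) -> bytes:
    # Stand-in: pretend the text encodes directly back to audio.
    return text.encode("utf-8")

def route_audio(audio: bytes, source_lang: str,
                preferences: dict[str, str]) -> dict[str, bytes]:
    """Return audio per device, translated only where the preference differs."""
    text = recognize_speech(audio, source_lang)
    out = {}
    for device, preferred in preferences.items():
        if preferred == source_lang:
            out[device] = audio  # same language: pass through unchanged
        else:
            out[device] = synthesize_speech(
                translate_text(text, source_lang, preferred), preferred)
    return out
```

A device whose preference matches the speaker's language receives the original signal untouched; only mismatched devices incur the translate-and-resynthesize path.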
12 Claims
1. A system comprising:
a communication server configured to communicate with two or more end user devices, each including an audio input device and an audio output device, and the communication server is further configured to perform operations comprising:
receiving audio data signals representing communication in a first spoken language from a first end user device; and
transmitting the audio data signals to a second end user device via a first communication channel, the audio data signals when received by the second end user device causing the second end user device to embed a language preference for the second end user device into the audio data signals; and
a translation services server configured to communicate with the end user devices and perform operations comprising:
after the second end user device embeds the language preference into the audio data signals, receiving the audio data signals from the second end user device via a second communication channel that is separate from the first communication channel between the second end user device and the communication server;
executing a language recognition process to recognize the first spoken language of the communication represented by the audio data signals;
determining whether the first spoken language of the communication represented by the audio data signals corresponds to the language preference of the second end user device;
in response to determining that the first spoken language does not correspond to the language preference of the second end user device, translating the audio data signals in the first spoken language into audio data signals in a second spoken language, the second spoken language corresponding to the language preference of the second end user device; and
transmitting the translated audio data signals to the second end user device.
Dependent claims: 2, 3, 4
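The translation-server operations recited in claim 1 (recognize the spoken language, compare it with the embedded preference, translate only on mismatch) reduce to a single conditional. A minimal sketch, assuming a toy signal format in which a `lang:payload` prefix stands in for the claimed language recognition process; `detect_language` and `translate_audio` are hypothetical names, not from the patent:

```python
# Sketch of claim 1's translation-server branch: translate the audio
# only when the recognized language differs from the embedded preference.

def detect_language(audio: bytes) -> str:
    # Stand-in for the claimed language recognition process:
    # assume the signal carries a "lang:payload" prefix (toy format).
    return audio.split(b":", 1)[0].decode()

def translate_audio(audio: bytes, dst: str) -> bytes:
    # Stand-in for speech-to-speech translation: relabel the payload.
    payload = audio.split(b":", 1)[1]
    return dst.encode() + b":" + payload

def serve_translation(audio: bytes, preference: str) -> bytes:
    recognized = detect_language(audio)
    if recognized == preference:
        return audio  # languages correspond: no translation needed
    return translate_audio(audio, preference)
```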
5. A method comprising:
receiving, at a translation services server, audio data signals from a first end user device via a first communication channel, the audio data signals representing communication in a first spoken language and having an embedding of a language preference for the first end user device, wherein the first end user device is configured to:
receive the audio data signals from a communication server via a second communication channel that is separate from the first communication channel between the first end user device and the translation services server, the communication server receiving the audio data signals representing the communication in the first spoken language from a second end user device; and
embed the language preference for the first end user device into the audio data signals received from the communication server;
executing, by the translation services server, a language recognition process to recognize the first spoken language of the communication represented by the audio data signals;
determining, by the translation services server, whether the first spoken language of the communication represented by the audio data signals corresponds to the language preference of the first end user device;
in response to determining that the first spoken language does not correspond to the language preference of the first end user device, translating, by the translation services server, the audio data signals in the first spoken language into audio data signals in a second spoken language, the second spoken language corresponding to the language preference of the first end user device; and
transmitting the translated audio data signals from the translation services server to the first end user device.
Dependent claims: 6, 7, 8
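Claim 5's device-side step, embedding the device's language preference into the audio data signals before forwarding them to the translation services server, could be modeled as prepending a small metadata header to the payload. The framing below (a 4-byte length prefix plus a JSON header) is purely an illustrative assumption; the patent does not specify an embedding format:

```python
import json
import struct

# Illustrative framing (assumed, not from the patent): a 4-byte
# big-endian length prefix, a JSON header carrying the language
# preference, then the raw audio payload.

def embed_preference(audio: bytes, preference: str) -> bytes:
    """Embed a language preference into an audio data signal."""
    header = json.dumps({"lang_pref": preference}).encode("utf-8")
    return struct.pack(">I", len(header)) + header + audio

def extract_preference(signal: bytes) -> tuple[str, bytes]:
    """Recover the embedded preference and the original audio payload."""
    (hlen,) = struct.unpack(">I", signal[:4])
    header = json.loads(signal[4:4 + hlen])
    return header["lang_pref"], signal[4 + hlen:]
```

Round-tripping through `embed_preference` and `extract_preference` returns the preference and the untouched payload, which is the property the claim relies on: the translation server can read the preference without the communication server having participated.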
9. A system comprising:
data processing hardware; and
memory hardware in communication with the data processing hardware, the memory hardware storing instructions that when executed on the data processing hardware cause the data processing hardware to perform operations comprising executing a communication service and a translation service;
wherein the communication service is configured to communicate with two or more end user devices each including an audio input device and an audio output device, and the communication service is further configured to perform operations comprising:
receiving audio data signals representing communication in a first spoken language from a first end user device; and
transmitting the audio data signals to a second end user device via a first communication channel, the audio data signals when received by the second end user device causing the second end user device to embed a language preference for the second end user device into the audio data signals; and
wherein the translation service is configured to communicate with the end user devices and perform operations comprising:
after the second end user device embeds the language preference into the audio data signals, receiving the audio data signals from the second end user device via a second communication channel that is separate from the first communication channel between the second end user device and the communication service;
executing a language recognition process to recognize the first spoken language of the communication represented by the audio data signals;
determining whether the first spoken language of the communication represented by the audio data signals corresponds to the language preference of the second end user device;
in response to determining that the first spoken language does not correspond to the language preference of the second end user device, translating the audio data signals in the first spoken language into audio data signals in a second spoken language, the second spoken language corresponding to the language preference of the second end user device; and
transmitting the translated audio data signals to the second end user device.
Dependent claims: 10, 11, 12
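Claim 9 differs from claim 1 in placing both services on shared data processing hardware. That colocation can be sketched as two methods on one object, with the same trivial stand-ins as above; the class and method names are illustrative assumptions, not terminology from the patent:

```python
class ConferencingHost:
    """Both services on one host, per claim 9 (illustrative stand-ins only)."""

    def communication_service(self, audio: bytes,
                              recipients: list[str]) -> dict[str, bytes]:
        # First channel: fan the untranslated audio out to each recipient.
        return {device: audio for device in recipients}

    def translation_service(self, audio: bytes, recognized: str,
                            preference: str) -> bytes:
        # Second channel: translate only when the recognized language
        # and the embedded preference differ (translation simulated by a tag).
        if recognized == preference:
            return audio
        return b"[" + recognized.encode() + b"->" + preference.encode() + b"]" + audio
```

The two channels of the claim map to the two method calls: devices receive raw audio from `communication_service`, then independently submit it to `translation_service` on the same hardware.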
Specification