Augmented Reality Language Translation
First Claim
1. A translation system for converting an utterance to text, the system comprising:
a camera for capturing one or more frames comprising one or more potential speakers;
a microphone for capturing an utterance; and
a processor configured to detect the position of the one or more potential speakers with respect to a user and assign one of the potential speakers to the utterance;
wherein the processor is further configured to convert the captured utterance to text and transmit the converted text to a display for superimposing the converted text over the user's field of view at a position relative to the assigned speaker's position within the user's field of view.
Abstract
Described herein are systems, devices, and methods for translating an utterance into text for display to a user. The approximate location of one or more potential speakers can be determined and a detected utterance can be assigned to one of the potential speakers based, at least in part, on a temporal relationship between the commencement of lip movement by one of the potential speakers and the reception of the utterance. The utterance can be converted to text and, if necessary, translated from a source language to a destination language. The converted text can then be displayed to the user in an augmented reality environment such that the user can intuitively appreciate to which of the potential speakers the converted text should be attributed.
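The speaker-assignment step described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the method claimed: the `LipOnset` record, the `assign_speaker` function, and the 0.5-second `max_gap_s` tolerance are hypothetical names and values introduced here, and the rule "pick the speaker whose lip movement began closest in time to the utterance" is one simple reading of the "temporal relationship" the abstract mentions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class LipOnset:
    """Hypothetical record: when a tracked face was first seen moving its lips."""
    speaker_id: str   # illustrative identifier for a potential speaker
    onset_s: float    # time of first observed lip movement, in seconds


def assign_speaker(
    utterance_start_s: float,
    lip_onsets: Sequence[LipOnset],
    max_gap_s: float = 0.5,  # assumed tolerance; the source gives no value
) -> Optional[str]:
    """Return the speaker whose lip-movement onset is nearest the utterance start.

    Returns None when no onset lies within max_gap_s of the utterance start.
    """
    best_id: Optional[str] = None
    best_gap: Optional[float] = None
    for onset in lip_onsets:
        gap = abs(utterance_start_s - onset.onset_s)
        if gap <= max_gap_s and (best_gap is None or gap < best_gap):
            best_id, best_gap = onset.speaker_id, gap
    return best_id


if __name__ == "__main__":
    onsets = [LipOnset("alice", 10.02), LipOnset("bob", 7.40)]
    print(assign_speaker(10.10, onsets))
```

Because the abstract says the assignment is based "at least in part" on this timing, a fuller system would presumably combine it with other cues; the sketch covers the timing cue alone.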
51 Citations
20 Claims
1. A translation system for converting an utterance to text, the system comprising:
a camera for capturing one or more frames comprising one or more potential speakers;
a microphone for capturing an utterance; and
a processor configured to detect the position of the one or more potential speakers with respect to a user and assign one of the potential speakers to the utterance;
wherein the processor is further configured to convert the captured utterance to text and transmit the converted text to a display for superimposing the converted text over the user's field of view at a position relative to the assigned speaker's position within the user's field of view.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A translation system for presenting translated text to a user, the system comprising:
a camera configured to capture one or more images;
a microphone configured to capture an utterance;
a processor configured to detect the position of a face within the one or more images and translate the utterance from a source language to a destination language text; and
a display configured to display the one or more images and the destination language text, the destination language text being positioned relative to the detected face.
Dependent claims: 9, 10, 11, 12, 13
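Positioning text relative to a detected face, as recited in claim 8, might look like the sketch below. It assumes the face detector reports an axis-aligned bounding box in pixel coordinates; the `Box` type, the `caption_anchor` function, the 8-pixel margin, and the choice to place the caption beneath the face (falling back to above it near the bottom edge) are all illustrative assumptions, since the claim only requires a position relative to the face.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical face bounding box in pixels; (x, y) is the top-left corner."""
    x: int
    y: int
    width: int
    height: int


def caption_anchor(
    face: Box,
    text_w: int,
    text_h: int,
    frame_w: int,
    frame_h: int,
    margin: int = 8,  # assumed spacing between face and caption
) -> Tuple[int, int]:
    """Return the top-left corner at which to draw the caption.

    The caption is centred horizontally on the face and placed below it.
    If it would run off the bottom of the frame it is placed above the
    face instead, and the result is clamped to stay inside the frame.
    """
    x = face.x + (face.width - text_w) // 2
    y = face.y + face.height + margin
    if y + text_h > frame_h:
        y = face.y - margin - text_h
    x = max(0, min(x, frame_w - text_w))
    y = max(0, min(y, frame_h - text_h))
    return x, y


if __name__ == "__main__":
    print(caption_anchor(Box(100, 50, 80, 80), 120, 20, 640, 480))
```

Keeping the caption anchored to the face box, rather than to a fixed region of the screen, is what lets a viewer attribute each line of text to the right speaker when several faces are in view.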
14. A non-transitory, computer-readable medium containing instructions that, when executed by a processor, perform a method comprising:
receiving video comprising one or more potential speakers;
receiving a first utterance made by one of the potential speakers;
assigning the first utterance to a first speaker of the potential speakers;
converting the first utterance to first text;
transmitting the video for display to a user; and
transmitting the first text for display to the user such that the first text is superimposed over the video at a position relative to the position of the first speaker within the video.
Dependent claims: 15, 16, 17, 18, 19, 20
Specification