Language translation of visual and audio input

US 8,645,121 B2
Filed: 12/28/2012
Issued: 02/04/2014
Est. Priority Date: 03/29/2007
Status: Active Grant

Abstract

The present translation system translates visual input and/or audio input from one language into another language. Some implementations incorporate a context-based translation that uses information obtained from visual input or audio input to aid in the translation of the other input. Other implementations combine the visual and audio translation. The translation system includes visual components and/or audio components. The visual components analyze visual input to identify a textual element and translate the textual element into a translated textual element. The visual image represents a captured image of a target scene. The visual components may further substitute the translated textual element for the textual element in the captured image. The audio components convert audio input into translated audio.
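
As an illustration of the substitution step described in the abstract, the following is a minimal sketch, assuming the textual elements have already been located in the captured image by some text-recognition step. The `TextElement` type, the `substitute_translations` function, and the glossary are hypothetical names introduced for this example and do not come from the patent.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TextElement:
    """A textual element found in a captured image (hypothetical type)."""
    box: Tuple[int, int, int, int]  # (x, y, width, height) within the image
    text: str


def substitute_translations(
    elements: List[TextElement], glossary: Dict[str, str]
) -> List[TextElement]:
    """Return new elements whose text is translated and whose position is kept.

    Text without a glossary entry is left unchanged. The glossary stands in
    for a real translation engine and is an assumption of this sketch.
    """
    return [
        replace(e, text=glossary.get(e.text.lower(), e.text))
        for e in elements
    ]


# Example: one sign recognized in the target scene, translated in place.
signs = [TextElement(box=(10, 20, 80, 30), text="Exit")]
translated = substitute_translations(signs, {"exit": "Salida"})
print(translated[0])
```

Keeping the bounding box unchanged is what lets a renderer draw the translated text over the original text in the captured image.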

20 Claims

1. A method comprising:
- receiving audio input;
- receiving visual input comprising a captured image of a target scene; and
- translating the audio input from a first language to a second language based upon a contextual hint, not indicative of the first language, determined based upon a non-textual element identified based upon the visual input.

Dependent claims:
  - 2. The method of claim 1, the method performed at least in part by glasses.
  - 3. The method of claim 1, the second language different than the first language.
  - 4. The method of claim 1, comprising extracting the non-textual element using a scale invariant feature transformation.
  - 5. The method of claim 1, comprising analyzing the visual input to identify a textual element and translating the textual element.
  - 6. The method of claim 1, the method performed at least in part by a camera.
  - 7. The method of claim 1, comprising translating the audio input based upon a second contextual hint determined based upon a second non-textual element identified based upon received second audio input.

8. A system comprising:
- one or more processing units; and
- memory comprising instructions that when executed by at least one of the one or more processing units, perform a method comprising:
  - receiving audio input;
  - receiving visual input comprising a captured image; and
  - translating the audio input from a first language to a second language based upon a contextual hint, not indicative of the first language, determined based upon a non-textual element identified based upon the visual input.

Dependent claims:
  - 9. The system of claim 8, the system comprised at least partially in glasses.
  - 10. The system of claim 8, the second language different than the first language.
  - 11. The system of claim 8, the method comprising extracting the non-textual element using a scale invariant feature transformation.
  - 12. The system of claim 8, the method comprising analyzing the visual input to identify a textual element.
  - 13. The system of claim 12, the method comprising translating the textual element.
  - 14. The system of claim 8, the visual input corresponding to a target scene.

15. A computer-readable storage medium comprising instructions which when executed perform actions, comprising:
- receiving audio input;
- receiving visual input comprising a captured image of a target scene;
- analyzing the visual input to identify a non-textual element; and
- translating the audio input from a first language to a second language based upon a contextual hint, not indicative of the first language, determined based upon the non-textual element.

Dependent claims:
  - 16. The computer-readable storage medium of claim 15, the actions performed at least in part by glasses.
  - 17. The computer-readable storage medium of claim 15, the second language different than the first language.
  - 18. The computer-readable storage medium of claim 15, the actions comprising extracting the non-textual element using a scale invariant feature transformation.
  - 19. The computer-readable storage medium of claim 15, the actions comprising analyzing the visual input to identify a textual element.
  - 20. The computer-readable storage medium of claim 19, the actions comprising translating the textual element.
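
The steps recited in the independent claims (receive audio, receive an image, identify a non-textual element, derive a contextual hint, translate) can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the audio has already been transcribed into words and that the non-textual elements have already been detected and labeled (for example by a feature-matching step such as the one named in claims 4, 11, and 18). The tables `ELEMENT_TO_HINT` and `LEXICON` and both function names are hypothetical.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical mapping from a detected non-textual scene element to a topic
# hint. The hint describes the scene, not the language being spoken.
ELEMENT_TO_HINT: Dict[str, str] = {"boat": "waterside", "atm": "finance"}

# Hypothetical toy English-to-Spanish lexicon. Each word maps a hint to a
# sense; the None key holds the default sense used when no hint applies.
LEXICON: Dict[str, Dict[Optional[str], str]] = {
    "bank": {"waterside": "orilla", "finance": "banco", None: "banco"},
    "the": {None: "la"},
}


def contextual_hint(elements: Iterable[str]) -> Optional[str]:
    """Return the hint for the first recognized scene element, if any."""
    for element in elements:
        hint = ELEMENT_TO_HINT.get(element)
        if hint is not None:
            return hint
    return None


def translate(words: List[str], hint: Optional[str]) -> List[str]:
    """Translate word by word, letting the hint choose among senses."""
    output = []
    for word in words:
        senses = LEXICON.get(word.lower())
        if senses is None:
            output.append(word)  # unknown words pass through unchanged
        else:
            output.append(senses.get(hint, senses[None]))
    return output


# The same spoken word is rendered differently depending on the scene.
print(translate(["the", "bank"], contextual_hint(["boat"])))
print(translate(["the", "bank"], contextual_hint(["atm"])))
```

The hint is deliberately a topic label rather than a language identifier, mirroring the claim language that the contextual hint is "not indicative of the first language".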

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: Microsoft Corporation
Inventors: Pathak, Binay; Boyd, Jonathan
Primary Examiner(s): SPOONER, LAMONT M
Application Number: US13/729,921
Publication Number: US 20130185052A1
Time in Patent Office: 403 Days
Field of Search: 704/2-8, 704/231, 704/251, 715/264
US Class Current: 704/2
CPC Class Codes:
G06F 40/40 Processing or translation o...
G06F 40/58 Use of machine translation,...
