Systems and methods for displaying foreign character sets and their translations in real time on resource-constrained mobile devices

US 8,761,513 B1
Filed: 03/12/2014
Issued: 06/24/2014
Est. Priority Date: 03/15/2013
Status: Expired due to Fees

Abstract

The present invention is related to systems and methods for translating language text on a mobile camera device offline without access to the Internet. More specifically, the present invention relates to systems and methods for displaying text of a first language and a translation of the first language text into a second language text which is displayed in real time in augmented reality on the mobile device. The processing can use a single line or a multiline algorithm designed with a plurality of processing innovations to ensure accurate real-time translations without motion jitter. The invention may be used to help travelers in a foreign country with difficulties in reading and understanding text written in the local language of that country. The present invention may be utilized with wearable computers or glasses, producing seamless augmented reality foreign language translations. Some embodiments are particularly useful in translations from Asian languages to English.

84 Citations

30 Claims

1. A method for translating a video feed in real-time augmented reality from a first language to a second language using a mobile device comprising a video camera, a processor, a memory, and a display, the method comprising the steps of:
- (a) capturing a frame in real-time from the video feed of one or more words in the first language which need to be translated using the video camera to produce a captured frame;
  
  (b) cropping the captured frame to fit inside an image processing bounding box to produce a cropped frame;
  
  (c) pre-processing the cropped frame to produce a pre-processed frame;
  
  (d) performing character segment recognition on the pre-processed frame to produce a plurality of character segments;
  
  (e) performing character merging on the character segments to produce a plurality of merged character segments;
  
  (f) performing character recognition on the merged character segments to produce a recognized frame having a plurality of recognized characters;
  
  (g) processing the recognized frame through a translation engine to produce a translation of the recognized characters in the first language into one or more words of the second language to produce a translated frame, while also calculating a translation quality representing how well the recognized characters have been translated for each translated frame;
  
  (h) storing the translated frame to the memory as a current translated frame, wherein a previous translated frame and a previous translation quality is also stored in the memory;
  
  (i) checking that the bounding box has stayed on a same set of characters for the current translated frame and the previous translated frame by determining a fraction of similar characters that are overlapping between the current translated frame and the previous translated frame, wherein a higher fraction indicates that the bounding box has stayed on the same set of characters for the current translated frame and the previous translated frame;
  
  (j) comparing the translation quality determined by the translation engine for the current translated frame to the previous translation quality for the previous translated frame;
  
  (k) selecting one of the previous translated frame and the current translated frame to be removed from the memory based on a frame having a lower translation quality; and
  
  (l) displaying an optimal translated frame from the previous translated frame and the current translated frame, the optimal translated frame having a higher translation quality, wherein the words of the second language are overlaid over or next to the words in the first language which is being translated in an augmented reality on the display of the mobile device.
- Dependent claims 2 to 14:
  - 2. The method of claim 1, wherein the first language is selected from the group consisting of Chinese, Korean, Japanese, Vietnamese, Khmer, Lao, Thai, English, French, Spanish, German, Italian, Portuguese, Russian, Hindi, Greek, Hebrew, and Arabic.
  - 3. The method of claim 1, wherein the first language is Chinese and the second language is English.
  - 4. The method of claim 1, further comprising:
    - utilizing a conversion table for converting dialects of the first language into a smaller number of dialects of the first language before translating the first language into the second language.
  - 5. The method of claim 1, further comprising:
    - utilizing a conversion table for converting traditional Chinese characters to simplified Chinese characters before translating the first language into the second language.
  - 6. The method of claim 1, wherein the second language is selected from the group consisting of Chinese, Korean, Japanese, Vietnamese, Khmer, Lao, Thai, English, French, Spanish, German, Italian, Portuguese, Russian, Hindi, Greek, Hebrew, and Arabic.
  - 7. The method of claim 1, further comprising:
    - selecting between a single line of the first language and multiple lines of the first language for translation into the second language by changing a text selection box size on the mobile device which displays the video feed of the first language.
  - 8. The method of claim 1, wherein a single line of the first language is translated into a single line of the second language.
  - 9. The method of claim 1, wherein multiple lines of the first language are translated into multiple lines of the second language.
  - 10. The method of claim 1, further comprising:
    - moving a second language translation when the mobile device is moved without recalculating the translation.
  - 11. The method of claim 1, further comprising:
    - pausing the translation which is displayed on the mobile device to allow a movement of the mobile device without changing displayed language translation.
  - 12. The method of claim 1, further comprising:
    - storing a paused language translation frame comprising the first language and the second language in the memory for later review.
  - 13. The method of claim 1, further comprising:
    - displaying a phonetic pronunciation of the one or more words of the first language being translated.
  - 14. The method of claim 1, wherein the translation quality is determined by how many and how well the one or more words of the first language are translated.
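
Steps (h) through (l) of claim 1 describe a two-frame stabilization loop: keep the previous and current translated frames, check that both cover the same text, and display whichever has the higher translation quality. The following is a minimal sketch of that logic, not the patented implementation. The `TranslatedFrame` type, the multiset-based `overlap_fraction`, the `same_text_threshold` value of 0.5, and the choice to switch to the current frame when overlap is low are all assumptions made for illustration; the claim does not specify them.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class TranslatedFrame:
    recognized: str   # recognized first-language characters, step (f)
    translation: str  # second-language text, step (g)
    quality: float    # translation quality reported by the engine, step (g)


def overlap_fraction(current: str, previous: str) -> float:
    """Step (i): fraction of characters shared by both frames (multiset overlap)."""
    if not current or not previous:
        return 0.0
    shared = sum((Counter(current) & Counter(previous)).values())
    return shared / max(len(current), len(previous))


def select_frame(previous: Optional[TranslatedFrame],
                 current: TranslatedFrame,
                 same_text_threshold: float = 0.5) -> TranslatedFrame:
    """Steps (j)-(l): return the frame to keep and display; the other is dropped."""
    if previous is None:
        return current
    if overlap_fraction(current.recognized, previous.recognized) < same_text_threshold:
        # Assumed behavior: the bounding box moved to different text,
        # so the previous frame no longer applies.
        return current
    # Same text in view: keep whichever translation scored higher.
    return current if current.quality > previous.quality else previous


prev = TranslatedFrame("禁止吸烟", "No smoking", 0.9)
curr = TranslatedFrame("禁止吸姻", "Prohibit", 0.4)  # one misrecognized character
print(select_frame(prev, curr).translation)  # No smoking
```

Because a noisy frame with a lower quality score never replaces a better one while the same text is in view, the displayed translation does not flicker between recognition results.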

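Claims 4 and 5 recite a conversion table applied before translation, for example from traditional to simplified Chinese characters, which lets a single dictionary serve both scripts. A minimal sketch of such a table follows, assuming a plain character-by-character mapping. The six entries and the `to_simplified` helper are illustrative only; a practical table would cover thousands of characters and may need context-dependent exceptions.

```python
# Illustrative excerpt only; not the table used by the patent holder.
TRADITIONAL_TO_SIMPLIFIED = str.maketrans({
    "國": "国",
    "語": "语",
    "書": "书",
    "門": "门",
    "車": "车",
    "廳": "厅",
})


def to_simplified(text: str) -> str:
    """Replace each traditional character that has a table entry; leave the rest."""
    return text.translate(TRADITIONAL_TO_SIMPLIFIED)


print(to_simplified("餐廳"))    # 餐厅
print(to_simplified("國語書"))  # 国语书
```
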
15. A mobile device for translating a video feed in real-time from a first language to a second language, the mobile device comprising:
- a video camera for capturing the video feed of one or more words in the first language which need translation;
  
  a display for displaying the words of the first language and the words of the second language in augmented reality;
  
  a processor for processing program code; and
  
  at least one memory operatively connected to the processor for storing the program code and one or more frames, which program code when executed by the processor causes the processor to execute a process to:
  
  (a) capture a frame in real-time from the video feed of one or more words in the first language which need to be translated using the video camera to produce a captured frame;
  
  (b) crop the captured frame to fit inside an image processing bounding box to produce a cropped frame;
  
  (c) pre-process the cropped frame to produce a pre-processed frame;
  
  (d) perform character segment recognition on the pre-processed frame to produce a plurality of character segments;
  
  (e) perform character merging on the character segments to produce a plurality of merged character segments;
  
  (f) perform character recognition on the merged character segments to produce a recognized frame having a plurality of recognized characters;
  
  (g) process the recognized frame through a translation engine to produce a translation of the recognized characters in the first language into one or more words of the second language to produce a translated frame, while also calculating a translation quality representing how well the recognized characters have been translated for each translated frame;
  
  (h) store the translated frame to the memory as a current translated frame, wherein a previous translated frame and a previous translation quality is also stored in the memory;
  
  (i) check that the bounding box has stayed on a same set of characters for the current translated frame and the previous translated frame by determining a fraction of similar characters that are overlapping between the current translated frame and the previous translated frame, wherein a higher fraction indicates that the bounding box has stayed on the same set of characters for the current translated frame and the previous translated frame;
  
  (j) compare the translation quality determined by the translation engine for the current translated frame to the previous translation quality for the previous translated frame;
  
  (k) select one of the previous translated frame and the current translated frame to be removed from the memory based on a frame having a lower translation quality; and
  
  (l) display an optimal translated frame from the previous translated frame and the current translated frame, the optimal translated frame having a higher translation quality, wherein the words of the second language are overlaid over or next to the words in the first language which is being translated in an augmented reality on the display of the mobile device.
- Dependent claims 16 to 27:
  - 16. The mobile device of claim 15, wherein the mobile device is a smartphone.
  - 17. The mobile device of claim 15, wherein the mobile device is a tablet computer.
  - 18. The mobile device of claim 15, wherein the mobile device is a wearable computer.
  - 19. The mobile device of claim 15, wherein the mobile device is a wearable eye glass.
  - 20. The mobile device of claim 15, wherein the mobile device is a laptop computer.
  - 21. The mobile device of claim 15, wherein the first language is selected from the group consisting of Chinese, Korean, Japanese, Vietnamese, Khmer, Lao, Thai, English, French, Spanish, German, Italian, Portuguese, Russian, Hindi, Greek, Hebrew, and Arabic.
  - 22. The mobile device of claim 15, wherein the first language is Chinese and the second language is English.
  - 23. The mobile device of claim 15, wherein the memory comprises additional program code, which when executed by the processor causes the processor to:
    - utilize a conversion table for converting traditional Chinese characters to simplified Chinese characters before translating the first language into the second language.
  - 24. The mobile device of claim 15, wherein the second language is selected from the group consisting of Chinese, Korean, Japanese, Vietnamese, Khmer, Lao, Thai, English, French, Spanish, German, Italian, Portuguese, Russian, Hindi, Greek, Hebrew, and Arabic.
  - 25. The mobile device of claim 15, wherein the memory comprises additional program code, which when executed by the processor causes the processor to:
    - select between a single line of the first language and multiple lines of the first language for translation into the second language by changing a text selection box size on the mobile device which displays the video feed of the first language.
  - 26. The mobile device of claim 15, wherein the memory comprises additional program code, which when executed by the processor causes the processor to:
    - move the second language translation when the mobile device is moved without recalculating the translation.
  - 27. The mobile device of claim 15, wherein the translation quality is determined by how many and how well the one or more words of the first language are translated.
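
Claims 14 and 27 state that translation quality depends on "how many and how well" the first-language words are translated, without giving a formula. One hypothetical way to turn that into a number is sketched below: a greedy longest-match pass over the recognized characters, where multi-character dictionary matches count in full and single-character matches count half. The `translation_quality` function, its weighting, and the toy dictionary are assumptions for illustration, not the scoring used in the patent.

```python
from typing import Dict


def translation_quality(recognized: str, dictionary: Dict[str, str]) -> float:
    """Hypothetical score in [0, 1] based on dictionary coverage of the text."""
    if not recognized:
        return 0.0
    max_len = max((len(key) for key in dictionary), default=1)
    i, score = 0, 0.0
    while i < len(recognized):
        # Greedy longest match starting at position i.
        for n in range(min(max_len, len(recognized) - i), 0, -1):
            if recognized[i:i + n] in dictionary:
                # "How well": whole words count in full, lone characters half.
                score += n if n > 1 else 0.5
                i += n
                break
        else:
            i += 1  # no entry found: this character adds nothing
    return score / len(recognized)


toy_dictionary = {"禁止": "prohibit", "吸烟": "smoke", "吸": "inhale"}
print(translation_quality("禁止吸烟", toy_dictionary))  # 1.0
print(translation_quality("禁止吸姻", toy_dictionary))  # 0.625
```

A frame with a misrecognized character scores lower than a clean one, which is the property the frame-comparison steps rely on.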

28. A non-transitory, computer-readable storage medium for storing program code for translating a video feed in real-time from a first language to a second language, the program code, when executed by a processor causes the processor to execute a translation process comprising:
- (a) a step for capturing a frame in real-time from the video feed of one or more words in the first language which need to be translated using a video camera to produce a captured frame;
  
  (b) a step for cropping the captured frame to fit inside an image processing bounding box to produce a cropped frame;
  
  (c) a step for pre-processing the cropped frame to produce a pre-processed frame;
  
  (d) a step for performing character segment recognition on the pre-processed frame to produce a plurality of character segments;
  
  (e) a step for performing character merging on the character segments to produce a plurality of merged character segments;
  
  (f) a step for performing character recognition on the merged character segments to produce a recognized frame having a plurality of recognized characters;
  
  (g) a step for processing the recognized frame through a translation engine to produce a translation of the recognized characters in the first language into one or more words of the second language to produce a translated frame, while also calculating a translation quality representing how well the recognized characters have been translated for each translated frame;
  
  (h) a step for storing the translated frame to a memory as a current translated frame, wherein a previous translated frame and a previous translation quality is also stored in the memory;
  
  (i) a step for checking that the bounding box has stayed on a same set of characters for the current translated frame and the previous translated frame by determining a fraction of similar characters that are overlapping between the current translated frame and the previous translated frame, wherein a higher fraction indicates that the bounding box has stayed on the same set of characters for the current translated frame and the previous translated frame;
  
  (j) a step for comparing the translation quality determined by the translation engine for the current translated frame to the previous translation quality for the previous translated frame;
  
  (k) a step for selecting one of the previous translated frame and the current translated frame to be removed from the memory based on a frame having a lower translation quality; and
  
  (l) a step for displaying an optimal translated frame from the previous translated frame and the current translated frame, the optimal translated frame having a higher translation quality, wherein the words of the second language are overlaid over or next to the words in the first language which is being translated in an augmented reality on a display.
- Dependent claims 29 and 30:
  - 29. The storage medium of claim 28, wherein the first language is selected from the group consisting of Chinese, Korean, Japanese, Vietnamese, Khmer, Lao, Thai, English, French, Spanish, German, Italian, Portuguese, Russian, Hindi, Greek, Hebrew, and Arabic.
  - 30. The storage medium device of claim 28, wherein the second language is selected from the group consisting of Chinese, Korean, Japanese, Vietnamese, Khmer, Lao, Thai, English, French, Spanish, German, Italian, Portuguese, Russian, Hindi, Greek, Hebrew, and Arabic.
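
Steps (a) through (g) of the independent claims form a linear pipeline in which each step consumes the output of the previous one. The sketch below shows that chaining for steps (b) and (c) only, under stated assumptions: the frame is a list of grayscale pixel rows, `make_crop` stands in for the bounding-box crop, and `binarize` is one possible pre-processing step chosen for illustration. The claims do not say what pre-processing consists of, and the later stages (segmentation, merging, recognition, translation) are omitted.

```python
from typing import Callable, List, Sequence

Frame = List[List[int]]  # grayscale pixel rows, 0-255 (assumed representation)
Stage = Callable[[Frame], Frame]


def make_crop(left: int, top: int, width: int, height: int) -> Stage:
    """Step (b): build a stage that keeps only the pixels inside the bounding box."""
    def crop(frame: Frame) -> Frame:
        return [row[left:left + width] for row in frame[top:top + height]]
    return crop


def binarize(frame: Frame, threshold: int = 128) -> Frame:
    """Step (c), assumed form: map each pixel to 0 or 1 around a fixed threshold."""
    return [[1 if px >= threshold else 0 for px in row] for row in frame]


def run_stages(frame: Frame, stages: Sequence[Stage]) -> Frame:
    """Feed the output of each stage into the next."""
    for stage in stages:
        frame = stage(frame)
    return frame


captured = [
    [10, 20, 200, 210],
    [15, 25, 220, 30],
    [12, 240, 250, 35],
]
print(run_stages(captured, [make_crop(1, 0, 2, 3), binarize]))
# [[0, 1], [0, 1], [1, 1]]
```

Cropping before any other work keeps the per-frame cost proportional to the bounding box rather than the full camera frame, which matters on a resource-constrained device.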

Current Assignee: Translate Abroad, Inc.
Original Assignee: Translate Abroad, Inc.
Inventors: Rogowski, Ryan Leon; Wu, Huan-Yu; Clark, Kevin Anthony
Primary Examiner(s): Li, Ruiping
Application Number: US14/207,155
Time in Patent Office: 104 Days
Field of Search: 382/181, 382/182, 382/135, 382/187
US Class Current: 382/181
CPC Class Codes:
G06F 18/00   Pattern recognition
G06F 40/263   Language identification
G06F 40/51   Translation evaluation
G06F 40/53   Processing of non-Latin tex...
G06F 40/58   Use of machine translation,...
G06V 10/28   Quantising the image, e.g. ...
G06V 20/20   in augmented reality scenes
G06V 30/287   of Kanji, Hiragana or Katak...
G09G 5/00   Control arrangements or cir...
G09G 5/246   of ideographic or arabic-li...
