Apparatus, systems and methods for presenting text identified in a video image

US 8,704,948 B2
Filed: 01/18/2012
Issued: 04/22/2014
Est. Priority Date: 01/18/2012
Status: Active Grant

Abstract

Systems and methods are operable to present text identified in a presented video image of a media content event. An exemplary embodiment receives a complete video frame that is associated with a presented video image of a video content event, wherein the presented video image includes a region of text; finds the text in the complete video frame; uses an optical character recognition (OCR) algorithm to translate the found text; and presents the translated text. The translated text may be presented on a display concurrently with the video image that is presented on the display. Alternatively, or additionally, the translated text may be presented as audible speech emitted from at least one speaker.
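
The four steps in the abstract (receive a frame, find text, run OCR, present the result) can be sketched as a small pipeline. This is a minimal illustration, not the patented implementation: `present_text_in_frame`, the `Frame` and `Box` types, and the pluggable `find_text`, `ocr`, and `presenters` callables are all hypothetical names introduced here. No real OCR library is assumed; the text finder and the recognizer are passed in by the caller.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# Hypothetical types: a grayscale frame as rows of pixel values, and a
# region given as (left, top, width, height) in pixels.
Frame = List[List[int]]
Box = Tuple[int, int, int, int]


def present_text_in_frame(
    frame: Frame,
    find_text: Callable[[Frame], Iterable[Box]],
    ocr: Callable[[Frame, Box], str],
    presenters: Sequence[Callable[[str], None]],
) -> List[str]:
    """Find text regions in one complete frame, OCR them, and present them.

    Each presenter stands in for an output path named in the abstract,
    for example an on-screen overlay or a text-to-speech engine.
    """
    presented = []
    for region in find_text(frame):
        text = ocr(frame, region)
        if not text:
            continue  # nothing recognized in this region
        for present in presenters:
            present(text)
        presented.append(text)
    return presented
```

Passing two presenters models the "alternatively, or additionally" wording: the same recognized text can go to the display and to a speaker in one call.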

20 Claims

1. A method of presenting text identified in a presented video image of a media content event, the method comprising:
- receiving a complete video frame that is associated with a presented video image of a captured scene of a video content event, wherein the presented video image includes text disposed on an object that has been captured in the scene;
  
  finding the text on the object that is part of the captured scene in the complete video frame;
  
  using an optical character recognition (OCR) algorithm to translate the found text on the object into translated text; and
  
  presenting the translated text associated with the text on the object that is part of the captured scene.
- Dependent claims (8, 9, 10, 12, 13, 14, 15, 16)
  - 8. The method of claim 1, further comprising:
    - communicating the complete video frame from a media device to a remote device, wherein finding the text in the complete video frame and using the OCR algorithm to translate the found text occurs at the remote device; and
      
      communicating the translated text from the remote device to the media device, wherein the media device presents the translated text.
  - 9. The method of claim 1, wherein presenting the translated text comprises:
    - presenting the translated text on a display concurrently with the video image.
  - 10. The method of claim 9, wherein presenting the translated text on the display comprises:
    - presenting the translated text in a banner that is presented on the display.
  - 12. The method of claim 1, wherein presenting the translated text comprises:
    - presenting the translated text as audible speech emitted from at least one speaker.
  - 13. The method of claim 1, wherein the found text is in a first language, and further comprising:
    - translating the translated text from the first language to a second language, wherein the translated text is presented in the second language.
  - 14. The method of claim 1, wherein finding the text in the complete video frame comprises:
    - determining a focus of the found text; and
      
      comparing the focus of the found text with a focus threshold, wherein the found text is translated by the OCR algorithm only if the focus of the found text exceeds the focus threshold.
  - 15. The method of claim 1, wherein finding the text in the complete video frame comprises:
    - determining a character height of a character of the found text; and
      
      comparing the character height of the character of the found text with a character height threshold, wherein the found text is translated by the OCR algorithm only if the character height of the character exceeds the character height threshold.
  - 16. The method of claim 1, wherein finding the text in the complete video frame comprises:
    - determining an orientation of the found text; and
      
      comparing the orientation of the found text with an orientation threshold, wherein the found text is translated by the OCR algorithm only if the orientation of the found text is less than the orientation threshold.
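
Claims 14 to 16 each gate the OCR step on a measured property of the found text: focus must exceed a threshold, character height must exceed a threshold, and orientation must be below a threshold. The sketch below combines the three checks in one function, which is an assumption made here for illustration (each claim recites its check separately). The claims do not say how focus is measured; variance of a 4-neighbour Laplacian is a common sharpness measure and is swapped in as a stand-in. `GateThresholds`, `laplacian_variance`, and `passes_ocr_gates` are hypothetical names, and treating orientation as an absolute tilt in degrees is also an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GateThresholds:
    min_focus: float            # focus must exceed this (claim 14)
    min_char_height_px: float   # character height must exceed this (claim 15)
    max_orientation_deg: float  # orientation must be less than this (claim 16)


def laplacian_variance(patch: List[List[int]]) -> float:
    """Assumed focus measure: variance of the 4-neighbour Laplacian.

    Sharp edges give large positive and negative responses, so a
    well-focused patch has a high variance; a flat or blurred one is low.
    """
    height, width = len(patch), len(patch[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                patch[y - 1][x] + patch[y + 1][x]
                + patch[y][x - 1] + patch[y][x + 1]
                - 4 * patch[y][x]
            )
    if not responses:
        return 0.0  # patch too small to have interior pixels
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def passes_ocr_gates(focus: float, char_height_px: float,
                     orientation_deg: float, t: GateThresholds) -> bool:
    """Return True only if all three threshold comparisons allow OCR."""
    return (
        focus > t.min_focus
        and char_height_px > t.min_char_height_px
        and abs(orientation_deg) < t.max_orientation_deg
    )
```

Skipping OCR on blurred, tiny, or steeply tilted text avoids spending recognition time on regions that are unlikely to yield a usable result.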

2. A method of presenting text identified in a presented video image of a media content event, the method comprising:
- receiving a complete video frame that is associated with a presented video image of a captured scene of a video content event, wherein the presented video image includes text that has been captured in the scene;
  
  finding the text in the complete video frame based on a text search region of the presented video image, wherein a location of the text search region is user specified based on a received signal from a remote control that initiates presentation of a pointer icon, wherein a location of the pointer icon defines a location of the text search region on the presented video image;
  
  using an optical character recognition (OCR) algorithm to translate the found text; and
  
  presenting the translated text.
- Dependent claims (3, 4, 5, 6, 7)
  - 3. The method of claim 2, further comprising:
    - receiving a first signal from the remote control that initiates presentation of the pointer icon on a display that is concurrently presenting the presented video image, wherein the location of the pointer icon is associated with the location of the text search region on the presented video image; and
      
      receiving a second signal from the remote control that adjusts the location of the pointer icon, wherein the adjusted location of the pointer icon adjusts the location of the text search region.
  - 4. The method of claim 3, wherein the pointer icon is presented at a point location on the display and overlays the presented video image, and wherein the location of the text search region on the presented video image is centered about the location of the pointer icon.
  - 5. The method of claim 3, wherein the pointer icon is presented at a region on the display and overlays the presented video image, and wherein a location of the region of the text search region on the presented video image is the same as the location of the pointer icon.
  - 6. The method of claim 2, wherein the remote control that initiates presentation of the pointer icon is a laser-based pointer device, and further comprising:
    - receiving a signal from the laser-based pointer device that emits a laser beam, wherein a path of the laser beam over the presented video image identifies a boundary of the text search region.
  - 7. The method of claim 2, wherein the remote control that initiates presentation of the pointer icon is a laser-based pointer device, and further comprising:
    - receiving a signal from the laser-based pointer device that emits a laser beam, wherein a point of the laser beam on the presented video image identifies the location of the text search region.
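
Claims 3 to 7 tie the text search region to a pointer: the region may be centered on a point-like pointer icon (claim 4), follow the pointer as it moves (claim 3), or be bounded by the path a laser pointer traces (claim 6). The geometry can be sketched as below. The function names, the `(left, top, width, height)` box layout, and the clamping of the region to the frame edges are assumptions made for this illustration; the claims do not specify a region size or what happens near the frame border.

```python
from typing import Iterable, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height) in pixels
Point = Tuple[int, int]          # (x, y) in pixels


def region_centered_on(pointer: Point, region_w: int, region_h: int,
                       frame_w: int, frame_h: int) -> Box:
    """Center a fixed-size search region on the pointer (claim 4).

    The region is shifted to stay inside the frame, so it is only
    approximately centered when the pointer is near an edge.
    """
    x, y = pointer
    left = max(0, min(x - region_w // 2, frame_w - region_w))
    top = max(0, min(y - region_h // 2, frame_h - region_h))
    return (left, top, region_w, region_h)


def move_pointer(pointer: Point, dx: int, dy: int,
                 frame_w: int, frame_h: int) -> Point:
    """Apply a second remote-control signal that adjusts the pointer (claim 3)."""
    x, y = pointer
    return (max(0, min(x + dx, frame_w - 1)), max(0, min(y + dy, frame_h - 1)))


def region_from_path(path: Iterable[Point]) -> Box:
    """Bounding box of a traced laser path, used as the region boundary (claim 6)."""
    points = list(path)
    if not points:
        raise ValueError("path must contain at least one point")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))
```

Because the region is recomputed from the pointer location, moving the pointer and calling `region_centered_on` again is enough to model the adjusted search region.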

11. A method of presenting text identified in a presented video image of a media content event, the method comprising:
- receiving a complete video frame that is associated with the presented video image of a captured scene of a video content event, wherein the presented video image includes text that has been captured in the scene;
  
  finding the text in the complete video frame;
  
  using an optical character recognition (OCR) algorithm to translate the found text; and
  
  presenting the translated text on a display concurrently with the video image in a text balloon on the display at a location that overlays the presented video image, wherein a pointer portion of the text balloon indicates a location of the found text in the presented video image.
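
Claim 11 presents the recognized text in a text balloon whose pointer portion indicates where the text was found. One possible layout rule is sketched below: put the balloon above the found text when there is room, otherwise below it, and aim the pointer tip at the nearest edge of the text box. The `Balloon` type, `place_balloon`, the `gap` parameter, and the above-or-below rule are all assumptions introduced here; the claim only requires that the balloon overlay the video image and that its pointer indicate the text location.

```python
from dataclasses import dataclass
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height) in pixels


@dataclass(frozen=True)
class Balloon:
    box: Box                      # where the balloon body is drawn
    pointer_tip: Tuple[int, int]  # point on the found text that the pointer targets


def place_balloon(text_box: Box, balloon_w: int, balloon_h: int,
                  frame_w: int, frame_h: int, gap: int = 10) -> Balloon:
    """Place a balloon near the found text without leaving the frame."""
    left, top, width, height = text_box
    center_x = left + width // 2
    # Center horizontally on the text, shifted as needed to stay on screen.
    balloon_left = max(0, min(center_x - balloon_w // 2, frame_w - balloon_w))
    if top - gap - balloon_h >= 0:
        # Enough room above: pointer targets the top edge of the text.
        balloon_top = top - gap - balloon_h
        tip = (center_x, top)
    else:
        # Otherwise go below: pointer targets the bottom edge of the text.
        balloon_top = min(top + height + gap, frame_h - balloon_h)
        tip = (center_x, top + height)
    return Balloon((balloon_left, balloon_top, balloon_w, balloon_h), tip)
```

Keeping a small gap between the balloon and the text leaves the original on-screen text visible while the pointer makes the association explicit.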

17. A media device, comprising:
- a media content stream interface that receives a media content event comprising a stream of video frames that are serially presented, wherein each video frame includes a video image of an object that is part of a captured scene of the media content event, wherein the object that is part of the captured scene includes text thereon;
  
  a presentation device interface that communicates the stream of video frames to a display of a media presentation device; and
  
  a processor system communicatively coupled to the media content stream interface and the presentation device interface, wherein the processor system is configured to:
  
  select a complete video frame from the received stream of video frames;
  
  find the text on the object in the video image of the captured scene of the selected complete video frame;
  
  translate the found text on the object using an optical character recognition (OCR) algorithm into translated text; and
  
  communicate the translated text to the display via the presentation device interface, wherein the translated text associated with the text on the object that is part of the captured scene is presented on the display.

18. A media device, comprising:
- a media content stream interface that receives a media content event comprising a stream of video frames that are serially presented, wherein each video frame includes a video image of a captured scene;
  
  a presentation device interface that communicates the stream of video frames to a display of a media presentation device;
  
  a remote interface that receives a signal from at least one of a remote control and a remote device; and
  
  a processor system communicatively coupled to the media content stream interface, the remote interface and the presentation device interface, wherein the processor system is configured to:
  
  select a complete video frame from the received stream of video frames;
  
  initiate presentation of a pointer icon in response to receiving a first signal from the remote control or the remote device, wherein a location of the pointer icon is associated with a location of a text search region on the presented video image;
  
  find text in the text search region of the video image of the selected complete video frame;
  
  translate the found text using an optical character recognition (OCR) algorithm;
  
  communicate the translated text to the display via the presentation device interface; and
  
  adjust the location of the pointer icon in response to receiving a second signal from the remote control or the remote device, wherein the adjusted location of the pointer icon adjusts location of the text search region.

19. A method of operating a media device, the method comprising:
- presenting a video image of a captured scene of a media content event, wherein the captured scene includes an object that is part of the captured scene, wherein the object has visible text thereon;
  
  receiving an input that activates a text translation mode of operation of the media device;
  
  selecting a complete video frame corresponding to the video image in response to activation of the text translation mode of operation;
  
  finding the text on the object in the complete video frame based on a text search region of the presented video image, wherein the text search region encompasses at least part of the text;
  
  using an optical character recognition (OCR) algorithm to translate the found text on the object into translated text; and
  
  presenting the translated text associated with the text on the object that is part of the captured scene.
- Dependent claims (20)
  - 20. The method of claim 19, further comprising:
    - receiving a user input from a remote device; and
      
      determining a location of the text search region of the presented video image based upon the user input.

Current Assignee: Dish Technologies LLC (Dish Network Corporation)
Original Assignee: Eldon Technology Limited (Dish Network Corporation)
Inventors: Mountain, Dale
Primary Examiner(s): KOSTAK, VICTOR R
Application Number: US13/353,160
Publication Number: US 20130182182A1
Time in Patent Office: 825 Days
Field of Search: 348/553, 348/569, 348/564, 348/734, 348/62, 348/63, 348/576, 345/672, 704/2, 704/3, 704/8, 704/9, 704/276, 704/7, 704/277, 704/260
US Class Current: 348/564
CPC Class Codes:

G06V 20/20   in augmented reality scenes

G06V 30/10   Character recognition

H04N 21/44008   involving operations for an...

H04N 21/4728   for selecting a Region Of I...