Image-based character recognition

US 9,390,340 B2
Filed: 05/26/2015
Issued: 07/12/2016
Est. Priority Date: 11/29/2012
Status: Active Grant

Abstract

Various embodiments enable a device to perform tasks such as processing an image to recognize and locate text in the image, and providing the recognized text to an application executing on the device for performing a function (e.g., calling a number, opening an internet browser, etc.) associated with the recognized text. In at least one embodiment, processing the image includes substantially simultaneously or concurrently processing the image with at least two recognition engines, such as at least two optical character recognition (OCR) engines, running in a multithreaded mode. In at least one embodiment, the recognition engines can be tuned so that their respective processing speeds are roughly the same. Utilizing multiple recognition engines enables processing latency to be close to that of using only one recognition engine.
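
The concurrent, multithreaded processing described in the abstract can be illustrated with a minimal sketch. It assumes two hypothetical stand-in engines (`engine_a`, `engine_b`) that only sleep and return fixed strings; neither the names nor the timings come from the patent. When both engines take roughly the same time, the wall-clock latency of the pair stays close to that of a single engine.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def engine_a(region):
    """Hypothetical stand-in for a first OCR engine (not a real library)."""
    time.sleep(0.05)  # simulated recognition time
    return "call 555-0100"


def engine_b(region):
    """Hypothetical stand-in for a second, different OCR engine."""
    time.sleep(0.05)  # assumed to be tuned to a similar speed
    return "ca11 555-0100"


def recognize_concurrently(region, engines):
    """Run every engine on the same region, each in its own thread.

    Results are returned in the same order as `engines`.
    """
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = [pool.submit(engine, region) for engine in engines]
        return [future.result() for future in futures]


if __name__ == "__main__":
    start = time.perf_counter()
    results = recognize_concurrently(None, [engine_a, engine_b])
    elapsed = time.perf_counter() - start
    print(results, f"{elapsed:.3f}s")
```

Threads suit this sketch because the stand-ins only sleep; a real CPU-bound engine written in Python might need processes or native code that releases the interpreter lock.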

17 Claims

1. A computer-implemented method, comprising:
- obtaining an image captured by a camera of a computing device;
- analyzing the image to locate a region of text represented in the image;
- binarizing the region of text to generate a binarized region;
- recognizing text of the binarized region with a first optical character recognition (OCR) engine and a second OCR engine to identify first recognized text and second recognized text;
- tuning a first processing speed associated with the first OCR engine and a second processing speed associated with the second OCR engine such that the first processing speed and the second processing speed are equal to within a predefined deviation;
- determining a first confidence score for the first recognized text and a second confidence score for the second recognized text by:
  - searching a database for matching words within the first recognized text and the second recognized text;
  - increasing the first confidence score for the first recognized text based on matching a string of characters in the first recognized text or the second recognized text to at least one first word in the database; and
  - increasing the second confidence score for the second recognized text based on matching a second string of characters in the second recognized text to at least one of the first word or a second word in the database; and
- applying a combination function of the first confidence score and the second confidence score to generate a consensus string of text, the consensus string of text comprising at least a portion of at least one of the first recognized text or the second recognized text.
- Dependent claims (2, 3, 4):
  - 2. The computer-implemented method of claim 1, wherein the first processing speed and the second processing speed are tuned to reduce a processing latency associated with a combination of the first OCR engine and the second OCR engine.
  - 3. The computer-implemented method of claim 1, further comprising:
    - determining that the combination function of the first confidence score and the second confidence score is below a threshold; and
    - processing the binarized region with a third OCR engine.
  - 4. The computer-implemented method of claim 1, further comprising:
    - applying a first bounding box to the binarized region;
    - applying a second bounding box to the binarized region; and
    - aligning the first bounding box of the first recognized text from the first OCR engine with the second bounding box of the second recognized text from the second OCR engine.
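
The confidence-scoring and combination steps of claim 1 can be sketched as follows. This is only one possible reading, under stated assumptions: the "database" is a plain Python set of lowercase words, each dictionary hit adds a fixed bonus, and the combination function is a word-by-word vote that falls back to the higher-scoring engine. The function names `confidence` and `consensus` are hypothetical, and the claim does not prescribe any particular combination function.

```python
def confidence(text, dictionary, bonus=1.0):
    """Start at zero and add `bonus` for every word found in the dictionary."""
    score = 0.0
    for word in text.lower().split():
        if word in dictionary:
            score += bonus
    return score


def consensus(first, second, dictionary):
    """Combine two recognized strings word by word (assumed rule).

    Where exactly one engine produced a dictionary word, that word wins.
    Otherwise the word from the engine with the higher overall confidence
    is kept. If the word counts differ, the whole higher-scoring string
    is returned.
    """
    s1 = confidence(first, dictionary)
    s2 = confidence(second, dictionary)
    prefer_first = s1 >= s2
    words1, words2 = first.split(), second.split()
    if len(words1) != len(words2):
        return first if prefer_first else second
    merged = []
    for a, b in zip(words1, words2):
        a_known = a.lower() in dictionary
        b_known = b.lower() in dictionary
        if a_known != b_known:
            merged.append(a if a_known else b)
        else:
            merged.append(a if prefer_first else b)
    return " ".join(merged)


if __name__ == "__main__":
    words = {"open", "daily", "until", "late"}
    print(consensus("open dai1y until late", "0pen daily until 1ate", words))
```

With the sample inputs, each engine misreads different words, and the merged result draws a portion from each recognized text, which is the property the consensus string is required to have.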

5. A computer-implemented method, comprising:
- analyzing an image to locate a region of text in the image;
- selecting a subset of the image associated with the region of text represented in the image;
- binarizing the subset of the image to generate a binarized region;
- tuning a first processing speed associated with a first optical character recognition (OCR) engine and a second processing speed associated with a second OCR engine such that the first processing speed and the second processing speed are equal to within a predefined deviation;
- recognizing the text represented in the image with the first OCR engine to yield a first recognized text;
- recognizing the text represented in the image with the second OCR engine to yield a second recognized text, the first OCR engine being different relative to the second OCR engine;
- determining a first confidence score for the first recognized text and a second confidence score for the second recognized text by:
  - searching a database for matching words within the first recognized text and the second recognized text;
  - increasing the first confidence score for the first recognized text based on matching a string of characters in the first recognized text or the second recognized text to at least one first word in the database; and
  - increasing the second confidence score for the second recognized text based on matching a second string of characters in the second recognized text to at least one of the first word or a second word in the database; and
- applying a combination function of the first confidence score and the second confidence score to generate a consensus string of text comprising at least a portion of at least one of the first recognized text from the first OCR engine or the second recognized text from the second OCR engine.
- Dependent claims (6, 7, 8, 9, 10, 11):
  - 6. The computer-implemented method of claim 5, further comprising:
    - recognizing second text represented in a second image with the first OCR engine to yield a third recognized text;
    - recognizing the second text in the second image with the second OCR engine to yield fourth recognized text;
    - determining a third confidence score for the third recognized text from the first OCR engine and a fourth confidence score for the fourth recognized text from the second OCR engine; and
    - applying a combination function of the first confidence score, the second confidence score, the third confidence score, and the fourth confidence score to generate the consensus string of text comprising at least a portion of at least one of the first recognized text, the second recognized text, the third recognized text, or the fourth recognized text.
  - 7. The computer-implemented method of claim 5, wherein the first processing speed and the second processing speed are tuned to reduce a processing latency associated with a combination of the first OCR engine and the second OCR engine.
  - 8. The computer-implemented method of claim 5, further comprising:
    - determining that the combination function of the first confidence score and the second confidence score is below a threshold; and
    - processing the binarized region with a third OCR engine.
  - 9. The computer-implemented method of claim 5, further comprising:
    - applying a first bounding box to the binarized region;
    - applying a second bounding box to the binarized region; and
    - aligning the first bounding box of the first recognized text from the first OCR engine with the second bounding box of the second recognized text from the second OCR engine.
  - 10. The computer-implemented method of claim 5, wherein the image is captured by at least one camera of a portable computing device.
  - 11. The computer-implemented method of claim 5, further comprising:
    - communicating the binarized region to the first OCR engine and the second OCR engine.
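
The bounding-box alignment of claim 9 can be sketched as follows. The claim does not say how boxes are aligned; this sketch assumes that each engine returns `(text, box)` pairs with `box = (x0, y0, x1, y1)` and pairs boxes greedily by intersection over union (IoU). The names `iou` and `align`, and the 0.5 cut-off, are illustrative choices rather than anything stated in the patent.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def align(first, second, min_iou=0.5):
    """Pair each word from the first engine with the best-overlapping
    unused word from the second engine (greedy, assumed strategy).

    Returns a list of (first_text, second_text) pairs; words without a
    sufficiently overlapping partner are left out.
    """
    pairs = []
    used = set()
    for text_a, box_a in first:
        best_index, best_score = None, min_iou
        for index, (_, box_b) in enumerate(second):
            if index in used:
                continue
            score = iou(box_a, box_b)
            if score >= best_score:
                best_index, best_score = index, score
        if best_index is not None:
            used.add(best_index)
            pairs.append((text_a, second[best_index][0]))
    return pairs


if __name__ == "__main__":
    first = [("OPEN", (0, 0, 40, 10)), ("LATE", (50, 0, 90, 10))]
    second = [("LATE", (51, 0, 91, 10)), ("0PEN", (1, 0, 41, 10))]
    print(align(first, second))
```

Pairing by position rather than by reading order means the two engines can emit words in different orders and still be compared word for word.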

12. A computing device, comprising:
- a processor;
- a display screen; and
- memory including instructions that, when executed by the processor, cause the computing device to:
  - analyze an image to locate a region of text in the image;
  - select a subset of the image associated with the region of text represented in the image;
  - binarize the subset of the image to generate a binarized region;
  - tune a first processing speed associated with a first OCR engine and a second processing speed associated with a second OCR engine such that the first processing speed and the second processing speed are equal to within a predefined deviation;
  - recognize the text represented in the image with the first OCR engine to yield a first recognized text;
  - recognize the text represented in the image with the second OCR engine to yield a second recognized text, the first OCR engine being different relative to the second OCR engine;
  - determine a first confidence score for the first recognized text and a second confidence score for the second recognized text by:
    - searching a database for matching words within the first recognized text and the second recognized text;
    - increasing the first confidence score for the first recognized text based on matching a string of characters in the first recognized text or the second recognized text to at least one first word in the database; and
    - increasing the second confidence score for the second recognized text based on matching a second string of characters in the second recognized text to at least one of the first word or a second word in the database; and
  - apply a combination function of the first confidence score and the second confidence score to generate a consensus string of text comprising at least a portion of at least one of the first recognized text from the first OCR engine or the second recognized text from the second OCR engine.
- Dependent claims (13, 14, 15, 16, 17):
  - 13. The computing device of claim 12, wherein the memory includes instructions that, when executed by the processor, further cause the computing device to:
    - recognize second text represented in a second image with the first OCR engine to yield a third recognized text;
    - recognize the second text represented in the second image with the second OCR engine to yield fourth recognized text;
    - determine a third confidence score for the third recognized text from the first OCR engine and a fourth confidence score for the fourth recognized text from the second OCR engine; and
    - apply a combination function of the first confidence score, the second confidence score, the third confidence score, and the fourth confidence score to generate the consensus string of text comprising at least a portion of at least one of the first recognized text, the second recognized text, the third recognized text, or the fourth recognized text.
  - 14. The computing device of claim 12, wherein the first processing speed and the second processing speed are tuned to reduce a processing latency associated with a combination of the first OCR engine and the second OCR engine.
  - 15. The computing device of claim 12, wherein the memory includes instructions that, when executed by the processor, further cause the computing device to:
    - determine that the combination function of the first confidence score and the second confidence score is below a threshold; and
    - process the binarized region with a third OCR engine.
  - 16. The computing device of claim 12, wherein the memory includes instructions that, when executed by the processor, further cause the computing device to:
    - apply a first bounding box to the binarized region;
    - apply a second bounding box to the binarized region; and
    - align the first bounding box of the first recognized text from the first OCR engine with the second bounding box of the second recognized text from the second OCR engine.
  - 17. The computing device of claim 12, further comprising at least one camera that captures the image in a capture mode, the capture mode including at least one of a single image capture mode, a multiple image capture mode, a periodic imaging capture mode, a continuous image capturing mode, and an image streaming capture mode.
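
The threshold test of claim 15 (and of claims 3 and 8) can be sketched as follows. The claims say only that a third OCR engine processes the binarized region when the combination function falls below a threshold. Everything else here is an assumption made for illustration: engines are modelled as callables returning `(text, score)`, the combination function is the mean of the two scores, the threshold is 0.6, and the highest-scoring text is returned at the end.

```python
def combine(score1, score2):
    """Assumed combination function: the mean of the two confidence scores."""
    return (score1 + score2) / 2.0


def recognize_with_fallback(region, first, second, third, threshold=0.6):
    """Run two engines and consult a third only when combined confidence is low.

    Each engine is a callable taking the binarized region and returning
    (text, score). How the third result is used is not specified by the
    claims; this sketch simply returns the highest-scoring text.
    """
    text1, score1 = first(region)
    text2, score2 = second(region)
    if combine(score1, score2) >= threshold:
        return text1 if score1 >= score2 else text2
    text3, score3 = third(region)
    candidates = [(score1, text1), (score2, text2), (score3, text3)]
    return max(candidates, key=lambda pair: pair[0])[1]


if __name__ == "__main__":
    # Hypothetical engines returning fixed results.
    first = lambda region: ("he1lo", 0.4)
    second = lambda region: ("hel1o", 0.5)
    third = lambda region: ("hello", 0.9)
    print(recognize_with_fallback(None, first, second, third))
```

Deferring the third engine until the first two disagree or score poorly keeps the common case at two-engine cost.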

Current Assignee: A9.com (Amazon.com, Inc.)
Original Assignee: A9.com (Amazon.com, Inc.)
Inventors: Lin, Xiaofan; Dhua, Arnab Sanat Kumar; Gray, Douglas Ryan; Lou, Yu
Primary Examiner(s): Ly, Anh
Application Number: US14/721,696
Publication Number: US 20150254507A1
Time in Patent Office: 413 Days
Field of Search: 707/758, 707/728, 707/723, 707/E17.014, 382/257, 382/276, 382/181, 382/176, 382/168, 382/229, 382/182, 382/309, 382/217, 382/138, 382/107, 382/100, 382/305, 382/203, 382/154, 382/187
US Class Current: 1/1
CPC Class Codes:
- G06F 16/5846   using extracted text
- G06F 18/254   of classification results, ...
- G06V 10/809   of classification results, ...
- G06V 20/62   Text, e.g. of license plate...
- G06V 30/10   Character recognition
- G06V 30/224   of printed characters havin...
