Leveraging image context for improved glyph classification
First Claim
1. A method, comprising:
identifying a first region of an image;
identifying a second region of the image;
extracting contextual features from the first region of the image, the contextual features comprising at least one of gradient or intensity patterns;
aggregating the contextual features;
quantizing the aggregated extracted features as quantized contextual features of the first region;
determining, using a first classifier, that the quantized contextual features are consistent with image data comprising a glyph;
determining that the first region contains a glyph;
determining that the second region does not contain a glyph;
stopping further processing of the second region;
identifying a candidate glyph corresponding to a maximally stable extremal region (MSER) in the first region of the image;
determining a plurality of glyph feature descriptors based on the candidate glyph, including one or more of determining the candidate glyph's aspect ratio, compactness, solidity, stroke-width to width, stroke-width to height, convexity, raw compactness, or a number of holes included in the candidate glyph;
determining that the candidate glyph comprises a first glyph, using the determined glyph feature descriptors, the quantized contextual features, and a first model; and
performing optical character recognition (OCR) on the first glyph.
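Several of the claimed glyph feature descriptors can be computed directly from a binary candidate-glyph mask. The sketch below is illustrative only (not the patented implementation): it derives aspect ratio, compactness, and hole count with pure NumPy, using a crude 4-neighbour perimeter estimate and a flood fill for holes; the function name and formulas are assumptions.

```python
import numpy as np

def glyph_descriptors(mask):
    """Compute a few of the claimed descriptors from a binary mask
    (nonzero = candidate-glyph pixel). Illustrative sketch only."""
    m = mask.astype(bool)
    ys, xs = np.nonzero(m)
    height = ys.max() - ys.min() + 1
    width = xs.max() - xs.min() + 1
    area = int(m.sum())

    # Crude perimeter: glyph pixels with at least one 4-neighbour background pixel.
    p = np.pad(m, 1)
    interior = p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
    perimeter = int((m & ~interior).sum())

    # Holes: 4-connected background components that never touch the padded border.
    H, W = p.shape
    seen = np.zeros(p.shape, dtype=bool)
    holes = 0
    for sy, sx in zip(*np.nonzero(~p)):
        if seen[sy, sx]:
            continue
        stack, touches = [(sy, sx)], False
        seen[sy, sx] = True
        while stack:
            y, x = stack.pop()
            touches |= y in (0, H - 1) or x in (0, W - 1)
            for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                if 0 <= ny < H and 0 <= nx < W and not p[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    stack.append((ny, nx))
        holes += not touches

    return {
        "aspect_ratio": width / height,
        "compactness": 4 * np.pi * area / perimeter ** 2,  # 1.0 for an ideal disc
        "num_holes": holes,
    }
```

A hollow square (an "O"-like glyph) yields one hole and unit aspect ratio, while a solid bar yields zero holes; descriptors such as solidity or stroke-width ratios would need additional machinery (e.g. a convex hull or a distance transform) not shown here.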
Abstract
A system to recognize text or symbols contained in a captured image using machine learning models leverages context information about the image to improve accuracy. Contextual information is determined for the entire image, or for spatial regions of the image, and is provided to a machine learning model when determining whether a region does or does not contain text or symbols. Associating features related to the larger context with features extracted from regions potentially containing text or symbolic content incrementally improves the results obtained using machine learning techniques.
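One plausible reading of the "extract, aggregate, quantize" sequence for contextual features (gradient or intensity patterns) is a quantized orientation histogram over a region's gradients. This is a sketch of that reading, not necessarily the patented method; the function name and bin count are assumptions.

```python
import numpy as np

def quantized_contextual_features(region, n_bins=8):
    """Aggregate gradient orientations over a grayscale region into a
    magnitude-weighted, quantized histogram. Illustrative sketch only."""
    gy, gx = np.gradient(region.astype(float))   # intensity gradients
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % np.pi             # unsigned orientation in [0, pi)
    # Quantize each orientation into one of n_bins bins.
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist   # normalized; zeros if flat region
```

A classifier (the claims' "first classifier") could then be trained on such vectors to decide whether a region's context is consistent with glyph-bearing image data.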
19 Claims
1. A method, comprising:
identifying a first region of an image;
identifying a second region of the image;
extracting contextual features from the first region of the image, the contextual features comprising at least one of gradient or intensity patterns;
aggregating the contextual features;
quantizing the aggregated extracted features as quantized contextual features of the first region;
determining, using a first classifier, that the quantized contextual features are consistent with image data comprising a glyph;
determining that the first region contains a glyph;
determining that the second region does not contain a glyph;
stopping further processing of the second region;
identifying a candidate glyph corresponding to a maximally stable extremal region (MSER) in the first region of the image;
determining a plurality of glyph feature descriptors based on the candidate glyph, including one or more of determining the candidate glyph's aspect ratio, compactness, solidity, stroke-width to width, stroke-width to height, convexity, raw compactness, or a number of holes included in the candidate glyph;
determining that the candidate glyph comprises a first glyph, using the determined glyph feature descriptors, the quantized contextual features, and a first model; and
performing optical character recognition (OCR) on the first glyph.
Dependent claims: 2, 3.
4. A computing device comprising:
at least one processor;
a memory including instructions operable to be executed by the at least one processor to perform a set of actions to configure the at least one processor to:
identify a first region of an image;
identify a second region of the image;
extract first contextual features from the first region, the first contextual features relating to a context of the first region;
process the extracted first contextual features using a first classifier to determine that the extracted first contextual features are consistent with image data comprising a glyph;
determine that the first region contains a glyph;
determine that the second region does not contain a glyph; and
stop further processing of the second region in response to determining the second region does not contain a glyph;
identify candidate locations of the first region, the candidate locations comprising a first candidate location having a first local pixel pattern;
extract second contextual features from the first local pixel pattern;
process, using a second classifier, the first local pixel pattern to determine a first feature descriptor, wherein the first feature descriptor is based on spatial relationships between the first local pixel pattern within the first region;
process, using the second classifier, the first local pixel pattern to determine a second feature descriptor, wherein the second feature descriptor relates to content of the first candidate location; and
determine that the first candidate location contains a glyph using the first feature descriptor and the second feature descriptor.
Dependent claims: 5, 6, 7, 8, 9, 10, 11.
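The device claim describes a two-stage cascade: a cheap region-level classifier gates whole regions (glyph-free regions are dropped immediately), and a second classifier then scores each candidate location using a spatial descriptor and a content descriptor. The control flow can be sketched as below; every callable here is a trivial stand-in for a trained model, and all names are invented for illustration.

```python
def detect_glyphs(regions, first_clf, propose, second_clf):
    """regions: list of (region_id, region_pixels).
    first_clf(region_pixels) -> bool                 # region-level gate
    propose(region_pixels)   -> [(loc, pattern), ...]  # candidate locations
    second_clf(d1, d2)       -> bool                 # per-candidate decision
    All callables are hypothetical stand-ins, not the claimed models."""
    hits = []
    for region_id, pixels in regions:
        if not first_clf(pixels):
            continue  # "stop further processing" of a glyph-free region
        for loc, pattern in propose(pixels):
            d1 = loc                              # spatial descriptor: where it sits
            d2 = sum(pattern) / len(pattern)      # content descriptor: what it holds
            if second_clf(d1, d2):
                hits.append((region_id, loc))
    return hits
```

The point of the gate is cost: per-candidate description and classification run only inside regions the cheap classifier has already judged plausible.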
12. A non-transitory computer-readable storage medium storing processor-executable instructions to configure a computing device to:
identify a first region of an image;
identify a second region of the image;
extract first contextual features from the first region, the first contextual features relating to a context of the first region;
process the extracted first contextual features using a first classifier to determine that the extracted first contextual features are consistent with image data comprising a glyph;
determine that the first region contains a glyph;
determine that the second region does not contain a glyph;
stop further processing of the second region in response to determining the second region does not contain a glyph;
identify candidate locations of the first region, the candidate locations comprising a first candidate location having a first local pixel pattern;
extract second contextual features from the first local pixel pattern;
process, using a second classifier, the first local pixel pattern to determine a first feature descriptor, wherein the first feature descriptor is based on spatial relationships between the first local pixel pattern within the first region;
process, using the second classifier, the first local pixel pattern to determine a second feature descriptor, wherein the second feature descriptor relates to content of the first candidate location; and
determine that the first candidate location contains a glyph using the first feature descriptor and the second feature descriptor.
Dependent claims: 13, 14, 15, 16, 17, 18, 19.
Specification