Image processing using multiple aspect ratios

US 9,418,283 B1
Filed: 08/20/2014
Issued: 08/16/2016
Est. Priority Date: 08/20/2014
Status: Expired due to Fees

Abstract

A system to recognize text, objects, or symbols in a captured image using machine learning models reduces computational overhead by generating a plurality of thumbnail versions of the image at different downscaled resolutions and aspect ratios, and then processing the downscaled images instead of the entire image or sections of the entire image. The downscaled images are processed to produce a combined feature vector characterizing the overall image. The combined feature vector is processed using the machine learning model.
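As a rough illustration of the thumbnail step described in the abstract, the sketch below downscales one grayscale image into several thumbnails whose resolutions and aspect ratios differ. The block-averaging method, the function names, and the sizes in `THUMBNAIL_SIZES` are assumptions made for the example; the record does not specify them.

```python
def downscale(image, out_w, out_h):
    """Downscale a grayscale image (list of rows) by averaging pixel blocks.

    The output aspect ratio is whatever out_w / out_h implies, so the
    source may be stretched or squeezed. That distortion is intentional.
    """
    in_h, in_w = len(image), len(image[0])
    out = []
    for oy in range(out_h):
        y0 = oy * in_h // out_h
        y1 = max(y0 + 1, (oy + 1) * in_h // out_h)
        row = []
        for ox in range(out_w):
            x0 = ox * in_w // out_w
            x1 = max(x0 + 1, (ox + 1) * in_w // out_w)
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Hypothetical (width, height) pairs; not taken from the patent record.
THUMBNAIL_SIZES = [(16, 16), (32, 8), (8, 32)]


def make_thumbnails(image, sizes=THUMBNAIL_SIZES):
    """Return one thumbnail per requested size, each with its own aspect ratio."""
    return [downscale(image, w, h) for (w, h) in sizes]
```

Each thumbnail covers the whole image, so features computed on it describe global context at a small fraction of the pixel count of the original.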

17 Claims

1. A method, comprising:
- obtaining an image;
- identifying a candidate location in the image that likely contains a text character;
- determining candidate features for the candidate location;
- downscaling the image into a first downscaled image having a first resolution and a first aspect ratio;
- downscaling the image into a second downscaled image having a second resolution and a second aspect ratio, wherein the first resolution is different from the second resolution, and the first aspect ratio is different than the second aspect ratio;
- generating first contextual image gradient features for the first downscaled image comprising first magnitude values and first angles;
- generating second contextual image gradient features for the second downscaled image comprising second magnitude values and second angles;
- normalizing the first magnitude values features to a uniform scale, producing normalized first contextual image gradient features;
- normalizing the second magnitude values to the uniform scale, producing normalized second contextual image gradient features;
- combining the normalized first contextual image gradient features, the normalized second contextual image gradient features, and the candidate features; and
- determining that the candidate location contains at least one text character, using the combined features and at least one classifier model, wherein the first and second magnitude values are approximated, and the first and second angles are quantized.
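The normalizing, combining, and classifying steps of claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions: each gradient feature is represented as a `(magnitude, quantized_angle)` tuple, the "uniform scale" is taken to be [0, 1] obtained by dividing by the largest magnitude, and the classifier is any callable supplied by the caller. None of these choices is specified in the claim.

```python
def normalize_magnitudes(features):
    """Scale magnitudes into [0, 1] (assumed uniform scale); angles are unchanged."""
    peak = max((m for m, _ in features), default=0.0)
    if peak == 0:
        return [(0.0, a) for _, a in features]
    return [(m / peak, a) for m, a in features]


def flatten(features):
    """Turn [(magnitude, angle), ...] into a flat list of floats."""
    out = []
    for m, a in features:
        out.extend((m, float(a)))
    return out


def combine(first, second, candidate_features):
    """Concatenate both normalized contextual feature sets with the candidate features."""
    return (
        flatten(normalize_magnitudes(first))
        + flatten(normalize_magnitudes(second))
        + list(candidate_features)
    )


def contains_text(combined, classifier):
    """Apply a caller-supplied classifier (hypothetical interface) to the combined vector."""
    return bool(classifier(combined))
```

Normalizing each thumbnail separately keeps a high-contrast thumbnail from dominating the combined vector, which is one plausible reason for the uniform scale.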

2. A computing device comprising:
- at least one processor;
- a memory including instruction operable to be executed by the at least one processor to perform a set of actions to configure the at least one processor to:
  - downscale an image into a first downscaled image having a first resolution and a first aspect ratio;
  - downscale the image into a second downscaled image having a second resolution that is different than the first resolution and a second aspect ratio that is different than the first aspect ratio;
  - generate first image gradient features for the first downscaled image, wherein the instructions to generate the first image gradient features include instructions to:
    - determine an X gradient for a pixel of the first downscaled image,
    - determine a Y gradient for the pixel;
    - approximate a magnitude of a gradient vector associated with the pixel within the first downscaled image based on the X gradient and the Y gradient, and
    - assign the gradient vector a quantized angle value;
  - generate second image gradient features for the second downscaled image;
  - concatenate the first image gradient features and the second image gradient features; and
  - process the concatenated image gradient features to identify an object in the image or determine a characteristic of the image.
- Dependent claims (3, 4, 5, 6, 7, 8, 9):
  - 3. The computing device of claim 2, wherein the instructions to assign the gradient vector the quantized angle value include instructions to:
    - determine an angle for the X and Y gradients; and
    - convert the angle into the quantized angle value.
  - 4. The computing device of claim 2, wherein the instructions to assign the quantized angle value to the gradient vector include instructions to:
    - assign the quantized angle value for the gradient vector based on a comparison of the X gradient with the Y gradient.
  - 5. The computing device of claim 2, wherein the characteristic of the image is a likelihood that at least a portion of the image does not comprise a text character, glyph, or object.
  - 6. The computing device of claim 2, wherein the instructions to process the concatenated image gradient features use a classifier system to identify the object or determine the characteristic.
  - 7. The computing device of claim 2, wherein the instructions further configure the at least one processor to:
    - identify a sub-region of the image as being likely to contain a text character, glyph, or object;
    - generate third image gradient features for the sub-region; and
    - concatenate the third image gradient features with the first and second image gradient features, wherein the instructions to process the concatenated image gradient features process the third image gradient features concatenated with the first and second image gradient features.
  - 8. The computing device of claim 7, wherein the sub-region is identified as a maximally stable extremal region (MSER).
  - 9. The computing device of claim 2, wherein the object comprises a text character or a glyph.
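The per-pixel gradient steps in claims 2 through 4 can be illustrated with the sketch below. It assumes central-difference gradients, the common "max plus half of min" approximation to the Euclidean magnitude, and eight angle bins of 45 degrees each; the claims name none of these specifics, so treat every function name and constant here as hypothetical. `quantize_by_angle` follows the claim 3 route (compute an angle, then convert it), while `quantize_by_comparison` follows the claim 4 route (compare the X and Y gradients without any trigonometry).

```python
import math


def pixel_gradients(image, x, y):
    """Central-difference X and Y gradients at (x, y), clamped at the borders."""
    h, w = len(image), len(image[0])
    gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
    gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
    return gx, gy


def approx_magnitude(gx, gy):
    """Approximate sqrt(gx^2 + gy^2) without a square root (assumed approximation)."""
    ax, ay = abs(gx), abs(gy)
    return max(ax, ay) + 0.5 * min(ax, ay)


def quantize_by_angle(gx, gy, bins=8):
    """Claim 3 style: compute the angle, then map it to one of `bins` equal sectors."""
    angle = math.atan2(gy, gx) % (2 * math.pi)
    return int(angle / (2 * math.pi / bins)) % bins


def quantize_by_comparison(gx, gy):
    """Claim 4 style: pick one of 8 octants from signs and an |gx| vs |gy| comparison."""
    ax, ay = abs(gx), abs(gy)
    if gx >= 0 and gy >= 0:
        return 0 if ay < ax else 1
    if gx < 0 and gy >= 0:
        return 2 if ay > ax else 3
    if gx < 0 and gy < 0:
        return 4 if ay < ax else 5
    return 6 if ay > ax else 7


def gradient_features(image, quantize=quantize_by_comparison):
    """Return one (approximate magnitude, quantized angle) pair per pixel."""
    feats = []
    for y in range(len(image)):
        for x in range(len(image[0])):
            gx, gy = pixel_gradients(image, x, y)
            feats.append((approx_magnitude(gx, gy), quantize(gx, gy)))
    return feats
```

The two quantizers agree away from octant boundaries; exactly on a boundary (for example a purely vertical gradient) their tie-breaking differs. Angles are measured in image coordinates, with Y increasing downward.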

10. A non-transitory computer-readable storage medium storing processor-executable instructions for controlling a computing device, comprising program code to configure the computing device to:
- downscale an image into a first downscaled image having a first resolution and a first aspect ratio;
- downscale the image into a second downscaled image having a second resolution that is different than the first resolution and a second aspect ratio that is different than the first aspect ratio;
- generate first image gradient features for the first downscaled image, wherein the program code to generate the first image gradient features further configuring the computing device to:
  - determine an X gradient for a pixel of the first downscaled image,
  - determine a Y gradient for the pixel;
  - approximate a magnitude of a gradient vector associated with the pixel within the first downscaled image based on the X gradient and the Y gradient, and
  - assign the gradient vector a quantized angle value;
- generate second image gradient features for the second downscaled image;
- concatenate the first image gradient features and the second image gradient features; and
- process the concatenated image gradient features to identify an object in the image or determine a characteristic of the image.
- Dependent claims (11, 12, 13, 14, 15, 16, 17):
  - 11. The non-transitory computer-readable storage medium of claim 10, wherein the program code to assign the gradient vector the quantized angle value further configures the computing device to:
    - determine an angle for the X and Y gradients; and
    - convert the angle into the quantized angle value.
  - 12. The non-transitory computer-readable storage medium of claim 10, wherein the program code to assign the quantized angle value to the gradient vector include instructions to:
    - assign the quantized angle value for the gradient vector based on a comparison of the X gradient with the Y gradient.
  - 13. The non-transitory computer-readable storage medium of claim 10, wherein the characteristic of the image is a likelihood that at least a portion of the image does not comprise a text character, glyph, or object.
  - 14. The non-transitory computer-readable storage medium of claim 10, wherein the program code to process the concatenated image gradient features uses classifier system to identify the object or determine the characteristic.
  - 15. The non-transitory computer-readable storage medium of claim 10, wherein the program code further configures the computing device to:
    - identify a sub-region of the image as being likely to contain a text character, glyph, or object;
    - generate third image gradient features for the sub-region; and
    - concatenate the third image gradient features with the first and second image gradient features, wherein the instruction to process the concatenated image gradient features process the third image gradient features concatenated with the first and second image gradient features.
  - 16. The non-transitory computer-readable storage medium of claim 15, wherein the sub-region is identified as a maximally stable extremal region (MSER).
  - 17. The non-transitory computer-readable storage medium of claim 10, wherein the object comprises a text character or glyph.
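Claims 7 and 15 add features from a sub-region to the features from the two downscaled images. A minimal sketch of that concatenation is shown below. It assumes the sub-region has already been found (for instance by an MSER detector, which is not implemented here) and is given as a bounding box `(x0, y0, x1, y1)`; the box format, the function names, and the pluggable `feature_fn` are hypothetical.

```python
def crop(image, box):
    """Cut the sub-region out of a grayscale image stored as a list of rows."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def concatenate_with_subregion(image, box, thumbnails, feature_fn):
    """Build one vector: thumbnail features first, then sub-region features.

    feature_fn is any callable mapping an image to a flat list of numbers,
    such as a gradient feature extractor.
    """
    vector = []
    for thumb in thumbnails:
        vector.extend(feature_fn(thumb))
    vector.extend(feature_fn(crop(image, box)))
    return vector
```

The thumbnail features supply global context, while the sub-region features describe the candidate itself at its original resolution.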

Current Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Original Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Inventors: Natarajan, Pradeep; Sikka, Avnish; Prasad, Rohit
Primary Examiner(s): Koziol, Stephen R
Assistant Examiner(s): Sun, Jiangeng
Application Number: US14/463,961
Time in Patent Office: 727 Days
Field of Search: None
US Class Current: 1/1

CPC Class Codes
G06T 3/40   Scaling of whole images or ...
G06V 10/32   Normalisation of the patter...
G06V 20/63   Scene text, e.g. street names
G06V 30/2504   Coarse or fine approaches, ...
G06V 30/414   Extracting the geometrical ...
