Leveraging image context for improved glyph classification
First Claim
1. A method, comprising:
identifying a first region of an image;
identifying a second region of the image;
extracting contextual features from the first region of the image, the contextual features comprising at least one of gradient or intensity patterns;
aggregating the contextual features;
quantizing the aggregated extracted features as quantized contextual features of the first region;
determining, using a first classifier, that the quantized contextual features are consistent with image data comprising a glyph;
determining that the first region contains a glyph;
determining that the second region does not contain a glyph;
stopping further processing of the second region;
identifying a candidate glyph corresponding to a maximally stable extremal region (MSER) in the first region of the image;
determining a plurality of glyph feature descriptors based on the candidate glyph, including one or more of determining the candidate glyph's aspect ratio, compactness, solidity, stroke-width to width, stroke-width to height, convexity, raw compactness, or a number of holes included in the candidate glyph;
determining that the candidate glyph comprises a first glyph, using the determined glyph feature descriptors, the quantized contextual features, and a first model; and
performing optical character recognition (OCR) on the first glyph.
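Several of the claimed glyph feature descriptors can be computed directly from a binary candidate-glyph mask. The sketch below is illustrative only (not the patented implementation): it derives aspect ratio, compactness, and hole count with pure NumPy, using a crude 4-neighbour perimeter estimate and a flood fill for holes; the function name and formulas are assumptions.

```python
import numpy as np

def glyph_descriptors(mask):
    """Compute a few of the claimed descriptors from a binary mask
    (nonzero = candidate-glyph pixel). Illustrative sketch only."""
    m = mask.astype(bool)
    ys, xs = np.nonzero(m)
    height = ys.max() - ys.min() + 1
    width = xs.max() - xs.min() + 1
    area = int(m.sum())

    # Crude perimeter: glyph pixels with at least one 4-neighbour background pixel.
    p = np.pad(m, 1)
    interior = p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
    perimeter = int((m & ~interior).sum())

    # Holes: 4-connected background components that never touch the padded border.
    H, W = p.shape
    seen = np.zeros(p.shape, dtype=bool)
    holes = 0
    for sy, sx in zip(*np.nonzero(~p)):
        if seen[sy, sx]:
            continue
        stack, touches = [(sy, sx)], False
        seen[sy, sx] = True
        while stack:
            y, x = stack.pop()
            touches |= y in (0, H - 1) or x in (0, W - 1)
            for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                if 0 <= ny < H and 0 <= nx < W and not p[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    stack.append((ny, nx))
        holes += not touches

    return {
        "aspect_ratio": width / height,
        "compactness": 4 * np.pi * area / perimeter ** 2,  # 1.0 for an ideal disc
        "num_holes": holes,
    }
```

A hollow square (an "O"-like glyph) yields one hole and unit aspect ratio, while a solid bar yields zero holes; descriptors such as solidity or stroke-width ratios would need additional machinery (e.g. a convex hull or a distance transform) not shown here.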
Abstract
A system to recognize text or symbols contained in a captured image using machine learning models leverages context information about the image to improve accuracy. Contextual information is determined for the entire image, or for spatial regions of the image, and is provided to a machine learning model when determining whether a region does or does not contain text or symbols. Associating features related to the larger context with features extracted from regions potentially containing text or symbolic content incrementally improves the results obtained using machine learning techniques.
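One plausible reading of the "extract, aggregate, quantize" sequence for contextual features (gradient or intensity patterns) is a quantized orientation histogram over a region's gradients. This is a sketch of that reading, not necessarily the patented method; the function name and bin count are assumptions.

```python
import numpy as np

def quantized_contextual_features(region, n_bins=8):
    """Aggregate gradient orientations over a grayscale region into a
    magnitude-weighted, quantized histogram. Illustrative sketch only."""
    gy, gx = np.gradient(region.astype(float))   # intensity gradients
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % np.pi             # unsigned orientation in [0, pi)
    # Quantize each orientation into one of n_bins bins.
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist   # normalized; zeros if flat region
```

A classifier (the claims' "first classifier") could then be trained on such vectors to decide whether a region's context is consistent with glyph-bearing image data.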
19 Claims
1. A method, comprising:
identifying a first region of an image;
identifying a second region of the image;
extracting contextual features from the first region of the image, the contextual features comprising at least one of gradient or intensity patterns;
aggregating the contextual features;
quantizing the aggregated extracted features as quantized contextual features of the first region;
determining, using a first classifier, that the quantized contextual features are consistent with image data comprising a glyph;
determining that the first region contains a glyph;
determining that the second region does not contain a glyph;
stopping further processing of the second region;
identifying a candidate glyph corresponding to a maximally stable extremal region (MSER) in the first region of the image;
determining a plurality of glyph feature descriptors based on the candidate glyph, including one or more of determining the candidate glyph's aspect ratio, compactness, solidity, stroke-width to width, stroke-width to height, convexity, raw compactness, or a number of holes included in the candidate glyph;
determining that the candidate glyph comprises a first glyph, using the determined glyph feature descriptors, the quantized contextual features, and a first model; and
performing optical character recognition (OCR) on the first glyph.
Dependent claims: 2, 3.
4. A computing device comprising:
at least one processor;
a memory including instructions operable to be executed by the at least one processor to perform a set of actions to configure the at least one processor to:
identify a first region of an image;
identify a second region of the image;
extract first contextual features from the first region, the first contextual features relating to a context of the first region;
process the extracted first contextual features using a first classifier to determine that the extracted first contextual features are consistent with image data comprising a glyph;
determine that the first region contains a glyph;
determine that the second region does not contain a glyph; and
stop further processing of the second region in response to determining the second region does not contain a glyph;
identify candidate locations of the first region, the candidate locations comprising a first candidate location having a first local pixel pattern;
extract second contextual features from the first local pixel pattern;
process, using a second classifier, the first local pixel pattern to determine a first feature descriptor, wherein the first feature descriptor is based on spatial relationships between the first local pixel pattern within the first region;
process, using the second classifier, the first local pixel pattern to determine a second feature descriptor, wherein the second feature descriptor relates to content of the first candidate location; and
determine that the first candidate location contains a glyph using the first feature descriptor and the second feature descriptor.
Dependent claims: 5, 6, 7, 8, 9, 10, 11.
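The device claim describes a two-stage cascade: a cheap region-level classifier gates whole regions (glyph-free regions are dropped immediately), and a second classifier then scores each candidate location using a spatial descriptor and a content descriptor. The control flow can be sketched as below; every callable here is a trivial stand-in for a trained model, and all names are invented for illustration.

```python
def detect_glyphs(regions, first_clf, propose, second_clf):
    """regions: list of (region_id, region_pixels).
    first_clf(region_pixels) -> bool                 # region-level gate
    propose(region_pixels)   -> [(loc, pattern), ...]  # candidate locations
    second_clf(d1, d2)       -> bool                 # per-candidate decision
    All callables are hypothetical stand-ins, not the claimed models."""
    hits = []
    for region_id, pixels in regions:
        if not first_clf(pixels):
            continue  # "stop further processing" of a glyph-free region
        for loc, pattern in propose(pixels):
            d1 = loc                              # spatial descriptor: where it sits
            d2 = sum(pattern) / len(pattern)      # content descriptor: what it holds
            if second_clf(d1, d2):
                hits.append((region_id, loc))
    return hits
```

The point of the gate is cost: per-candidate description and classification run only inside regions the cheap classifier has already judged plausible.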
12. A non-transitory computer-readable storage medium storing processor-executable instructions to configure a computing device to:
identify a first region of an image;
identify a second region of the image;
extract first contextual features from the first region, the first contextual features relating to a context of the first region;
process the extracted first contextual features using a first classifier to determine that the extracted first contextual features are consistent with image data comprising a glyph;
determine that the first region contains a glyph;
determine that the second region does not contain a glyph;
stop further processing of the second region in response to determining the second region does not contain a glyph;
identify candidate locations of the first region, the candidate locations comprising a first candidate location having a first local pixel pattern;
extract second contextual features from the first local pixel pattern;
process, using a second classifier, the first local pixel pattern to determine a first feature descriptor, wherein the first feature descriptor is based on spatial relationships between the first local pixel pattern within the first region;
process, using the second classifier, the first local pixel pattern to determine a second feature descriptor, wherein the second feature descriptor relates to content of the first candidate location; and
determine that the first candidate location contains a glyph using the first feature descriptor and the second feature descriptor.
Dependent claims: 13, 14, 15, 16, 17, 18, 19.
Specification