Systems and methods for classifying objects in digital images captured using mobile devices

US 9,311,531 B2
Filed: 03/13/2014
Issued: 04/12/2016
Est. Priority Date: 03/13/2013
Status: Active Grant

Abstract

A method includes receiving or capturing a digital image using a mobile device, and using a processor of the mobile device to: determine whether an object depicted in the digital image belongs to a particular object class among a plurality of object classes; determine one or more object features of the object based at least in part on the particular object class at least partially in response to determining the object belongs to the particular object class; build or select an extraction model based at least in part on the one or more determined object features; and extract data from the digital image using the extraction model. The extraction model excludes, and/or the extraction process does not utilize, optical character recognition (OCR) techniques. Related systems and computer program products are also disclosed.
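
The flow described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the image representation (a list of pixel rows), the `Box` and `ExtractionModel` types, and the `classify` and `features_by_class` parameters are all hypothetical names introduced here. The extraction step simply crops labeled regions, in keeping with the abstract's statement that extraction need not use OCR.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Image = List[List[int]]           # hypothetical: rows of grayscale pixel values
Box = Tuple[int, int, int, int]   # hypothetical: (top, left, bottom, right), end-exclusive


@dataclass
class ExtractionModel:
    """Hypothetical model: maps a field label to the region holding that field."""
    regions: Dict[str, Box]


def crop(image: Image, box: Box) -> Image:
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def extract(
    image: Image,
    classify: Callable[[Image], Optional[str]],
    features_by_class: Dict[str, Dict[str, Box]],
) -> Optional[Dict[str, Image]]:
    # Step 1: decide which object class (if any) the depicted object belongs to.
    object_class = classify(image)
    if object_class is None:
        return None
    # Step 2: determine object features based on the class.
    features = features_by_class[object_class]
    # Step 3: build (here: trivially construct) an extraction model from the features.
    model = ExtractionModel(regions=dict(features))
    # Step 4: extract data using the model; no OCR is involved in this sketch.
    return {label: crop(image, box) for label, box in model.regions.items()}


image = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
fields = extract(image, lambda img: "check", {"check": {"amount": (0, 2, 2, 4)}})
```

Here the class name `"check"` and the field label `"amount"` are placeholders; a real classifier and feature set would replace the lambda and the dictionary.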

22 Claims

1. A method, comprising:
- receiving or capturing a digital image using a mobile device;
  
  using a processor of the mobile device to:
  
  determine whether an object depicted in the digital image belongs to a particular object class among a plurality of object classes based on feature-space discrimination wherein the feature space discrimination utilizes one or more of support-vector-machine (SVM) techniques, transductive classification techniques, and maximum entropy discrimination (MED) techniques;
  
  determine one or more object features of the object based at least in part on the particular object class at least partially in response to determining the object belongs to the particular object class;
  
  build or select an extraction model based at least in part on the one or more determined object features; and
  
  extract data from the digital image using the extraction model, the extracting comprising detecting one or more lines of text in the object, and the detecting comprising:
  
  projecting the digital image onto a single dimension;
  
  projecting each color channel of the digital image onto a single channel along the single dimension;
  
  determining a distribution of light and dark areas along the projection;
  
  determining a plurality of dark pixel densities, each dark pixel density corresponding to a position along the projection;
  
  determining whether each dark pixel density is greater than a probable text line threshold; and
  
  designating each position as a text line upon determining the corresponding dark pixel density is greater than the probable text line threshold.
  - 2. The method as recited in claim 1, wherein extracting the data using the extraction model further comprises performing optical character recognition (OCR), wherein the OCR is performed on a selected portion of the digital image excluding one or more portions of the received or captured image.
  - 3. The method as recited in claim 1, wherein the feature space discrimination utilizes support-vector-machine (SVM) techniques.
  - 4. The method as recited in claim 1, wherein the feature space discrimination comprises identifying a hyperplane separating the plurality of object classes in an N-dimensional feature space.
  - 5. The method as recited in claim 1, wherein the extraction model is built, wherein building the extraction model comprises:
    - mapping one or more of a feature vector, a list of feature vectors and a feature matrix to one or more of the object features; and
      
      associating at least one metadata label with each mapped object feature.
  - 6. The method as recited in claim 1, further comprising:
    - training the extraction model based on one or more additional object features of at least one additional object belonging to the object class.
  - 7. The method as recited in claim 6, wherein the extraction model is trained using the processor of the mobile device according to a support vector machine (SVM) technique;
    - and the method further comprising storing and/or exporting the trained extraction model.
  - 8. The method as recited in claim 1, further comprising:
    - building a new extraction model based on some or all of the determined object features; and
      
      extracting data from the digital image using the new extraction model.
  - 9. The method as recited in claim 1, further comprising:
    - performing OCR on some or all of the extracted data.
  - 10. The method as recited in claim 1, further comprising:
    - associating a plurality of metadata labels with the digital image based on the particular object class, wherein each metadata label identifies one or more of:
      
      a type of data depicted in the digital image;
      
      location information; and
      
      relevance of data to one or more subsequent processing operations.
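
The text-line detection steps recited in claim 1 can be sketched as below. This is an illustrative reading only, not the patented implementation: the function name `detect_text_lines`, the use of a simple channel mean to collapse the color channels, and both default thresholds (`dark_cutoff`, `text_line_threshold`) are assumptions introduced here. The image is projected onto the vertical axis, so each "position along the projection" is one pixel row.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0-255


def detect_text_lines(
    image: List[List[Pixel]],
    dark_cutoff: float = 128.0,        # assumption: intensity below this counts as dark
    text_line_threshold: float = 0.2,  # assumption: the "probable text line threshold"
) -> List[int]:
    """Return the row indices whose dark pixel density exceeds the threshold."""
    lines = []
    for y, row in enumerate(image):
        if not row:
            continue
        # Collapse the color channels into one channel (simple mean; an assumption).
        gray = [(r + g + b) / 3 for r, g, b in row]
        # Dark pixel density at this position of the one-dimensional projection.
        density = sum(1 for v in gray if v < dark_cutoff) / len(gray)
        if density > text_line_threshold:
            lines.append(y)
    return lines


W, B = (255, 255, 255), (0, 0, 0)
page = [
    [W, W, W, W],  # density 0.0
    [B, W, B, W],  # density 0.5
    [W, W, W, W],  # density 0.0
    [B, B, B, W],  # density 0.75
]
rows = detect_text_lines(page)
```

With the default threshold, rows 1 and 3 of the toy `page` are designated as text lines; raising `text_line_threshold` to 0.6 leaves only row 3.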

11. A method, comprising:
- receiving or capturing a digital image using a mobile device;
  
  using a processor of the mobile device:
  
  determining whether an object depicted in the digital image belongs to a particular object class among a plurality of object classes;
  
  displaying the digital image on a display of the mobile device upon determining the object does not belong to any of the plurality of object classes;
  
  receiving user input via the display of the mobile device, the user input identifying one or more regions of interest in the object;
  
  building a feature vector based at least in part on the user input;
  
  building and/or selecting an extraction model based at least in part on the feature vector;
  
  extracting data from the digital image based at least in part on the extraction model; and
  
  detecting one or more lines of text in the digital image, the detecting comprising detecting a plurality of connected components non-background elements in the digital image, and determining a plurality of likely characters based on the plurality of connected components, wherein determining the plurality of likely characters comprises determining whether each of the plurality of connected components is characterized by a predetermined number of light-to-dark transitions in a predetermined direction.
  - 12. The method as recited in claim 11, wherein the extracting further comprises performing optical character recognition (OCR), wherein the OCR is performed on a selected portion of the digital image excluding one or more portions of the received or captured image.
  - 13. The method as recited in claim 11, wherein the extracting excludes performing optical character recognition (OCR), and the method further comprising performing optical character recognition on the extracted data.
  - 14. The method as recited in claim 11, further comprising generating a new object class based at least in part on the user input, and wherein the extraction model is configured to extract data from a plurality of objects belonging to the new object class.
  - 15. The method as recited in claim 11, further comprising validating the extracted data.
  - 16. The method as recited in claim 11, wherein building the extraction model comprises:
    - mapping one or more of a feature vector, a list of feature vectors and a feature matrix to one or more object features; and
      
      associating at least one metadata label with each mapped object feature, wherein the metadata label(s) are associated with the digital image.
  - 17. The method as recited in claim 11, further comprising:
    - training the extraction model based on one or more additional object features of at least one additional object belonging to the object class, and wherein the at least one additional object comprises at least four additional objects.
  - 18. The method as recited in claim 11, further comprising:
    - performing OCR on one or more regions of the digital image corresponding to one or more of the object features and/or other object features.
  - 19. The method as recited in claim 11, further comprising:
    - detecting one or more lines of text in the object, the detecting comprising:
      
      projecting the digital image onto a single dimension;
      
      determining a distribution of light and dark areas along the projection;
      
      determining a plurality of dark pixel densities, each dark pixel density corresponding to a position along the projection;
      
      determining whether each dark pixel density is greater than a probable text line threshold; and
      
      designating each position as a text line upon determining the corresponding dark pixel density is greater than the probable text line threshold.
  - 20. The method as recited in claim 11, wherein building the extraction model further comprises training the extraction model using one or more of support-vector-machine (SVM) techniques, transductive classification techniques, and maximum entropy discrimination (MED) techniques.
  - 21. The method as recited in claim 11, further comprising:
    - projecting the digital image onto a single dimension; and
      
      projecting each color channel of the digital image onto a single channel along the single dimension.
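
The connected-component test recited in claim 11 can be sketched as follows. This is one possible reading, not the patented implementation: the binarized input (1 for non-background, 0 for background), 4-connectivity, scanning left to right through the middle row of each component's bounding box, and the `allowed_counts` set standing in for the "predetermined number" of transitions are all assumptions introduced here.

```python
from collections import deque
from typing import FrozenSet, List, Set, Tuple

Point = Tuple[int, int]  # (row, col)


def connected_components(binary: List[List[int]]) -> List[Set[Point]]:
    """Group non-background (1) pixels into 4-connected components."""
    seen: Set[Point] = set()
    components: List[Set[Point]] = []
    for r, row in enumerate(binary):
        for c, value in enumerate(row):
            if value != 1 or (r, c) in seen:
                continue
            component: Set[Point] = set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                component.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < len(binary) and 0 <= nx < len(binary[ny])
                            and binary[ny][nx] == 1 and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            components.append(component)
    return components


def light_to_dark_transitions(component: Set[Point]) -> int:
    """Count light-to-dark transitions, scanning left to right through the
    middle row of the component's bounding box (scan line is an assumption)."""
    rows = [y for y, _ in component]
    cols = [x for _, x in component]
    mid = (min(rows) + max(rows)) // 2
    transitions, previous_dark = 0, False
    for x in range(min(cols), max(cols) + 1):
        dark = (mid, x) in component
        if dark and not previous_dark:
            transitions += 1
        previous_dark = dark
    return transitions


def likely_characters(
    binary: List[List[int]],
    allowed_counts: FrozenSet[int] = frozenset({1, 2}),  # hypothetical "predetermined number"
) -> List[Set[Point]]:
    return [comp for comp in connected_components(binary)
            if light_to_dark_transitions(comp) in allowed_counts]


binary = [
    [1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    [1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0],
]
characters = likely_characters(binary)
```

In the toy image, the ring-shaped component on the left has two transitions along its middle row and is kept, while the comb-shaped component on the right has four and is rejected.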

22. A computer program product comprising:
- non-transitory computer readable storage medium having program code embodied therewith, the program code readable/executable by a mobile device comprising a processor to:
  
  receive or capture a digital image using the mobile device;
  
  use the processor to:
  
  determine whether an object depicted in the digital image belongs to a particular object class among a plurality of object classes based on feature-space discrimination, wherein the feature space discrimination utilizes one or more of support-vector-machine (SVM) techniques, transductive classification techniques, and maximum entropy discrimination (MED) techniques;
  
  determine one or more object features of the object based at least in part on the particular object class and at least partially in response to determining the object belongs to the particular object class;
  
  build or select an extraction model based at least in part on the one or more determined object features; and
  
  extract data from the digital image using the extraction model, the extracting comprising detecting one or more lines of text in the object, and the detecting comprising:
  
  projecting the digital image onto a single dimension;
  
  projecting each color channel of the digital image onto a single channel along the single dimension;
  
  determining a distribution of light and dark areas along the projection;
  
  determining a plurality of dark pixel densities, each dark pixel density corresponding to a position along the projection;
  
  determining whether each dark pixel density is greater than a probable text line threshold; and
  
  designating each position as a text line upon determining the corresponding dark pixel density is greater than the probable text line threshold.
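
The feature-space discrimination recited in claims 1, 4 and 22 relies on hyperplanes separating object classes in an N-dimensional feature space. The sketch below shows only the decision step for hyperplanes that are already known; it does not train an SVM, and it is not the patented implementation. The one-hyperplane-per-class layout, the class names, and all weights and biases are hypothetical values chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

Hyperplane = Tuple[List[float], float]  # (weights w, bias b), defining w.x + b = 0


def margin(features: List[float], plane: Hyperplane) -> float:
    """Signed score of a feature vector relative to a hyperplane."""
    weights, bias = plane
    return sum(w * x for w, x in zip(weights, features)) + bias


def classify(features: List[float], planes: Dict[str, Hyperplane]) -> Optional[str]:
    """Return the class whose hyperplane gives the largest positive score,
    or None when the vector falls on the negative side of every hyperplane."""
    best_class, best_margin = None, 0.0
    for name, plane in planes.items():
        score = margin(features, plane)
        if score > best_margin:
            best_class, best_margin = name, score
    return best_class


# Hypothetical hyperplanes in a 2-dimensional feature space.
planes = {
    "check": ([1.0, -1.0], 0.0),
    "receipt": ([-1.0, 1.0], -0.5),
}
label_a = classify([2.0, 0.5], planes)
label_b = classify([0.0, 2.0], planes)
label_c = classify([1.0, 1.0], planes)
```

A `None` result corresponds to the case in claim 11 where the object does not belong to any of the known object classes.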

Current Assignee: Kofax Incorporated
Original Assignee: Kofax Incorporated
Inventors: Amtrup, Jan W.; Macciola, Anthony; Thompson, Stephen Michael; Ma, Jiyong
Primary Examiner(s): Dunphy, David F
Application Number: US14/209,825
Publication Number: US 20140270536A1
Time in Patent Office: 761 Days
Field of Search: None
US Class Current: 1/1
CPC Class Codes:
G06F 2218/08   Feature extraction
G06F 2218/12   Classification; Matching
G06T 2207/20104   Interactive definition of r...
G06T 7/11   Region-based segmentation
G06V 30/40   Document-oriented image-bas...
G06V 30/418   Document matching, e.g. of ...
