Optical character recognition of series of images

US 10,043,092 B2
Filed: 05/31/2016
Issued: 08/07/2018
Est. Priority Date: 05/13/2016
Status: Expired due to Fees

Abstract

Systems and methods for performing optical character recognition (OCR) are disclosed. An example method may include receiving a current image that overlaps with a previous image of a series of images of an original document; performing OCR of the current image to produce an OCR text; identifying a plurality of textual artifacts in the images that are each represented by a sequence of symbols having a frequency of occurrence within the OCR text falling below a threshold frequency; identifying corresponding base points that are each associated with a textual artifact; identifying parameters of a coordinate transformation converting coordinates of the previous image into coordinates of the current image; associating part of the OCR text with a cluster of symbol sequences, the symbol sequences being produced by processing previously received images; identifying a median string representing the cluster; and producing a resulting OCR text representing a portion of the original document.
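The textual-artifact step described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that a "sequence of symbols" is a whitespace-delimited word, that the OCR output is a list of `(text, bounding_box)` tuples, and that the threshold is a relative frequency; the function name `find_textual_artifacts` and the default threshold are hypothetical and are not taken from the patent.

```python
from collections import Counter


def find_textual_artifacts(words, threshold_frequency=0.05):
    """Return the words whose relative frequency in the OCR text is below the threshold.

    words: list of (text, box) tuples, where box is (x0, y0, x1, y1).
    This input format is an assumption made for the sketch.
    """
    if not words:
        return []
    counts = Counter(text for text, _ in words)
    total = len(words)
    # Rare sequences are kept because they are less likely to be matched
    # ambiguously between two overlapping images.
    return [(text, box) for text, box in words
            if counts[text] / total < threshold_frequency]


# Usage: ten recognized words, one of which is rare.
ocr_words = ([("the", (0, 0, 10, 5))] * 6
             + [("of", (12, 0, 18, 5))] * 3
             + [("Kalyuzhny", (20, 0, 60, 5))])
artifacts = find_textual_artifacts(ocr_words, threshold_frequency=0.2)
```

Frequent words such as "the" appear in many places on a page, so they make poor anchors for aligning two images; that is the intuition behind keeping only low-frequency sequences.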

20 Claims

1. A method, comprising:
- receiving, by a processing device, a current image of a series of images of an original document, wherein the current image at least partially overlaps with a previous image of the series of images;
  
  performing optical character recognition (OCR) of the current image to produce an OCR text and a corresponding text layout;
  
  identifying, using the OCR text and the corresponding text layout, a plurality of textual artifacts in each of the current image and the previous image, wherein each textual artifact is represented by a sequence of symbols that has a frequency of occurrence within the OCR text falling below a threshold frequency;
  
  identifying, in each of the current image and the previous image, a corresponding plurality of base points, wherein each base point is associated with at least one textual artifact of the plurality of textual artifacts;
  
  identifying, using coordinates of matching base points in the current image and the previous image, parameters of a coordinate transformation converting coordinates of the previous image into coordinates of the current image;
  
  associating, using the coordinate transformation, at least part of the OCR text with a cluster of a plurality of clusters of symbol sequences, wherein the OCR text is produced by processing the current image and wherein the symbol sequences are produced by processing one or more previously received images of the series of images;
  
  identifying, for each cluster, a median string representing the cluster of symbol sequences; and
  
  producing, using the median string, a resulting OCR text representing at least a portion of the original document.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12):
  - 2. The method of claim 1, wherein the current image and the previous image represent consecutive images of the series of images of the original document.
  - 3. The method of claim 1, wherein the current image and the previous image differ in at least one of:
    - image scale, a shooting angle, image brightness, or presence of an external object that is covering at least part of the original document.
  - 4. The method of claim 1, wherein identifying a base point further comprises determining a center of a minimum bounding rectangle of an associated textual artifact.
  - 5. The method of claim 1, further comprising filtering the identified base points using invariant geometric features of base point groupings.
  - 6. The method of claim 1, wherein the coordinate transformation is provided by a projective transformation.
  - 7. The method of claim 1, wherein the median string has a minimal sum of values of a pre-defined metric with respect to the cluster of symbol sequences.
  - 8. The method of claim 7, wherein the pre-defined metric represents an edit distance between the median string and a symbol sequence of the plurality of symbol sequences.
  - 9. The method of claim 1, wherein producing the median string comprises applying weight coefficients to each symbol sequence of the cluster of symbol sequences.
  - 10. The method of claim 1, wherein identifying the plurality of clusters of symbol sequences further comprises:
    - producing a graph comprising a plurality of nodes, wherein each node represents a symbol sequence, the graph further comprising a plurality of edges, wherein an edge connects a first symbol sequence produced by OCR of at least a part of a first image of the series of images and a second symbol sequence produced by OCR of a corresponding part of a second image of the series of images.
  - 11. The method of claim 1, further comprising:
    - identifying, in view of the text layout, an order of clusters of symbol sequences.
  - 12. The method of claim 1, wherein the OCR text is provided in a first natural language, the method further comprising:
    - translating the resulting OCR text to a second natural language.
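Claims 7 to 9 characterize the median string as the string minimizing a (possibly weighted) sum of edit distances to the cluster members. The sketch below is a simplification and not the patented procedure: it computes a set median, meaning the candidate is restricted to strings already in the cluster, whereas a general median string may be a string that no single image produced. The names `edit_distance` and `set_median` and the weighting scheme are assumptions made for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]


def set_median(cluster, weights=None):
    """Return the cluster member with the smallest weighted sum of edit distances."""
    if weights is None:
        weights = [1.0] * len(cluster)

    def cost(candidate):
        return sum(w * edit_distance(candidate, s)
                   for s, w in zip(cluster, weights))

    return min(cluster, key=cost)


# Usage: four OCR readings of the same word taken from four overlapping images.
readings = ["recognition", "recogniti0n", "recognition", "rec0gnition"]
best = set_median(readings)
```

Weights could, for example, favor readings from sharper images; how the weight coefficients are chosen is not specified in the claims, so any particular scheme is an assumption.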

13. A system, comprising:
- a memory;
  
  a processing device, coupled to the memory, the processing device configured to:
  
  receive a current image of a series of images of an original document, wherein the current image at least partially overlaps with a previous image of the series of images;
  
  perform optical character recognition (OCR) of the current image to produce an OCR text and a corresponding text layout;
  
  identify, using the OCR text and the corresponding text layout, a plurality of textual artifacts in each of the current image and the previous image, wherein each textual artifact is represented by a sequence of symbols that has a frequency of occurrence within the OCR text falling below a threshold frequency;
  
  identify, in each of the current image and the previous image, a corresponding plurality of base points, wherein each base point is associated with at least one textual artifact of the plurality of textual artifacts;
  
  identify, using coordinates of matching base points in the current image and the previous image, parameters of a coordinate transformation converting coordinates of the previous image into coordinates of the current image;
  
  associate, using the coordinate transformation, at least part of the OCR text with a cluster of a plurality of clusters of symbol sequences, wherein the OCR text is produced by processing the current image and wherein the symbol sequences are produced by processing one or more previously received images of the series of images;
  
  identify, for each cluster, a median string representing the cluster of symbol sequences; and
  
  produce, using the median string, a resulting OCR text representing at least a portion of the original document.
- Dependent claims (14, 15, 16):
  - 14. The system of claim 13, wherein identifying a base point further comprises determining a center of a minimum bounding rectangle of an associated textual artifact.
  - 15. The system of claim 13, wherein the median string has a minimal sum of values of a pre-defined metric with respect to the cluster of symbol sequences.
  - 16. The system of claim 13, wherein identifying the plurality of clusters of symbol sequences further comprises:
    - producing a graph comprising a plurality of nodes, wherein each node represents a symbol sequence, the graph further comprising a plurality of edges, wherein an edge connects a first symbol sequence produced by OCR of at least a part of a first image of the series of images and a second symbol sequence produced by OCR of a corresponding part of a second image of the series of images.
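The base-point and coordinate-transformation steps can be sketched as follows. The example assumes axis-aligned bounding rectangles given as `(x0, y0, x1, y1)`, a projective transformation with eight parameters (the form named in claim 6), and exactly four matched base points with no three collinear. All function names are hypothetical; with more than four matches a least-squares fit would typically replace the exact solve shown here.

```python
def base_point(box):
    """Center of an axis-aligned bounding rectangle (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def solve_linear(a, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def estimate_projective(src, dst):
    """Estimate parameters h0..h7 from exactly four point matches, where
    u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1) and
    v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b)


def apply_projective(h, point):
    """Map a point from previous-image to current-image coordinates."""
    x, y = point
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


# Usage: the current image is the previous one scaled by 2 and shifted by (3, 4).
previous_points = [(0, 0), (1, 0), (1, 1), (0, 1)]
current_points = [(3, 4), (5, 4), (5, 6), (3, 6)]
h = estimate_projective(previous_points, current_points)
mapped = apply_projective(h, (0.5, 0.5))
```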

17. A computer-readable non-transitory storage medium comprising executable instructions that, when executed by a processing device, cause the processing device to:
- receive a current image of a series of images of an original document, wherein the current image at least partially overlaps with a previous image of the series of images;
  
  perform optical character recognition (OCR) of the current image to produce an OCR text and a corresponding text layout;
  
  identify, using the OCR text and the corresponding text layout, a plurality of textual artifacts in each of the current image and the previous image, wherein each textual artifact is represented by a sequence of symbols that has a frequency of occurrence within the OCR text falling below a threshold frequency;
  
  identify, in each of the current image and the previous image, a corresponding plurality of base points, wherein each base point is associated with at least one textual artifact of the plurality of textual artifacts;
  
  identify, using coordinates of matching base points in the current image and the previous image, parameters of a coordinate transformation converting coordinates of the previous image into coordinates of the current image;
  
  associate, using the coordinate transformation, at least part of the OCR text with a cluster of a plurality of clusters of symbol sequences, wherein the OCR text is produced by processing the current image and wherein the symbol sequences are produced by processing one or more previously received images of the series of images;
  
  identify, for each cluster, a median string representing the cluster of symbol sequences; and
  
  produce, using the median string, a resulting OCR text representing at least a portion of the original document.
- Dependent claims (18, 19, 20):
  - 18. The computer-readable non-transitory storage medium of claim 17, wherein identifying a base point further comprises determining a center of a minimum bounding rectangle of an associated textual artifact.
  - 19. The computer-readable non-transitory storage medium of claim 17, wherein the median string has a minimal sum of values of a pre-defined metric with respect to the cluster of symbol sequences.
  - 20. The computer-readable non-transitory storage medium of claim 17, wherein identifying the plurality of clusters of symbol sequences further comprises:
    - producing a graph comprising a plurality of nodes, wherein each node represents a symbol sequence, the graph further comprising a plurality of edges, wherein an edge connects a first symbol sequence produced by OCR of at least a part of a first image of the series of images and a second symbol sequence produced by OCR of a corresponding part of a second image of the series of images.
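The graph described in claims 10, 16, and 20 has symbol sequences as nodes and edges between sequences recognized in corresponding parts of two images. One plausible way to read clusters off such a graph is to take its connected components; the claims do not state that this is how clusters are derived, so treat that choice, and the names below, as assumptions of this sketch.

```python
def connected_components(num_nodes, edges):
    """Group node indices into connected components using union-find."""
    parent = list(range(num_nodes))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps trees shallow
            i = parent[i]
        return i

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for i in range(num_nodes):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Usage: five symbol sequences from three images; each edge links two
# sequences that overlap after the coordinate transformation is applied.
sequences = ["invoice", "inv0ice", "total", "tota1", "invoice"]
edges = [(0, 1), (1, 4), (2, 3)]
clusters = [[sequences[i] for i in component]
            for component in connected_components(len(sequences), edges)]
```

Each resulting cluster collects the readings of one text fragment across images, which is the input a median-string step would then reduce to a single string.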

Current Assignee: ABBYY Development LLC
Original Assignee: ABBYY Development LLC
Inventors: Kalyuzhny, Aleksey
Primary Examiner(s): Cunningham, Gregory F

Application Number: US15/168,548
Publication Number: US 20170330049A1
Time in Patent Office: 798 Days
Field of Search: 382112
US Class Current
CPC Class Codes

G06F 18/23   Clustering techniques

G06F 18/2323   based on graph theory, e.g....

G06F 18/24   Classification techniques

G06T 3/40   Scaling of whole images or ...

G06V 10/7635   based on graphs, e.g. graph...

G06V 20/62   Text, e.g. of license plate...

G06V 30/10   Character recognition

G06V 30/1607   Correcting image deformatio...

G06V 30/184   by analysing segments inter...

G06V 30/19107   Clustering techniques

G06V 30/412   Layout analysis of document...

G06V 30/418   Document matching, e.g. of ...
