Document image binarization method

  • US 9,367,899 B1
  • Filed: 05/29/2015
  • Issued: 06/14/2016
  • Est. Priority Date: 05/29/2015
  • Status: Active Grant
First Claim

1. A method implemented in a data processing system which includes a processor and a memory, for binarizing a multi-bit document image, comprising:

  • (a) binarizing the document image a plurality of times, each time using one of a plurality of different binarization thresholds, to generate a plurality of corresponding binary images;

    for each of the binary images,

    (b) applying connected component analysis to the binary image to identify connected components in the binary image;

    (c) identifying all connected components in the binary image that are larger than a threshold size and have fill rates higher than a fill rate threshold and removing all connected components contained within bounding boxes of the identified connected components; and

    (d) counting a first number of connected components in the binary image that have sizes equal to or larger than a first threshold size, and counting a second number of connected components in the binary image that have sizes equal to or smaller than a second threshold size;

    (e) based on the first number and the second number of each binary image, selecting one of the binary images as the optimum binary image; and

    (f) outputting the optimum binary image.
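Steps (a) through (f) can be illustrated with a minimal pure-Python sketch. It is not the patented implementation. The claim leaves several details open, so the following are assumptions made only for illustration: "size" is taken as pixel count, "fill rate" as pixel count divided by bounding-box area, components are 4-connected, foreground means darker than the threshold, the counts in step (d) are taken after the removal in step (c), and the selection rule in step (e) is a hypothetical score (first number minus second number). All function names and default parameter values are likewise hypothetical.

```python
from collections import deque

def binarize(gray, threshold):
    """Step (a): 1 = foreground (pixel darker than threshold), 0 = background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def connected_components(binary):
    """Step (b): 4-connected foreground components, each a list of (row, col)."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for r in range(h):
        for c in range(w):
            if binary[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, pixels = deque([(r, c)]), []
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                components.append(pixels)
    return components

def bounding_box(pixels):
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return min(rows), min(cols), max(rows), max(cols)

def remove_filled_regions(components, min_size, min_fill):
    """Step (c): drop every component lying inside the bounding box of a
    large, densely filled component (assumed: size = pixel count)."""
    boxes = []
    for comp in components:
        top, left, bottom, right = bounding_box(comp)
        area = (bottom - top + 1) * (right - left + 1)
        if len(comp) > min_size and len(comp) / area > min_fill:
            boxes.append((top, left, bottom, right))

    def inside_any_box(comp):
        t, l, b, r = bounding_box(comp)
        return any(T <= t and L <= l and b <= B and r <= R for T, L, B, R in boxes)

    return [comp for comp in components if not inside_any_box(comp)]

def count_components(components, first_size, second_size):
    """Step (d): count components at least first_size and at most second_size."""
    first = sum(1 for comp in components if len(comp) >= first_size)
    second = sum(1 for comp in components if len(comp) <= second_size)
    return first, second

def select_optimum(gray, thresholds, min_size=50, min_fill=0.9,
                   first_size=4, second_size=1):
    """Steps (e) and (f): return (threshold, binary image) with the best score.
    The score first - second is a hypothetical rule, not taken from the claim."""
    best = None
    for t in thresholds:
        binary = binarize(gray, t)
        comps = remove_filled_regions(connected_components(binary), min_size, min_fill)
        first, second = count_components(comps, first_size, second_size)
        score = first - second
        if best is None or score > best[0]:
            best = (score, t, binary)
    return best[1], best[2]
```

In this sketch `select_optimum` returns the thresholded image as produced in step (a); whether the output should instead reflect the removal in step (c) is not settled by the claim text and is left as an assumption. The intuition behind the hypothetical score is that a good threshold yields many text-sized components and few isolated specks.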
