System and method for segmenting text lines in documents

  • US 8,768,057 B2
  • Filed: 11/15/2012
  • Issued: 07/01/2014
  • Est. Priority Date: 07/10/2009
  • Status: Expired
First Claim

1. A method of classifying marking types on images of a document, the method comprising:

  • supplying the document containing the images to a segmenter;

  • segmenting the images received by the segmenter including identifying neatly written or printed text by grouping selected feature points along predetermined orientations, the feature points including local extrema of bounding contours of connected components, and subtracting enclosing boundary boxes of text lines from remaining document material to fragment connected components that are part of the text lines and part of extraneous markings;

  • supplying the fragments to a classifier, the classifier providing a category score to each fragment, wherein the classifier is trained from groundtruth images whose pixels are labeled according to known marking types; and

  • assigning a same label to all pixels in a fragment when the fragment is classified by the classifier.
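
The fragmenting and labeling steps of the claim can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the text-line bounding boxes have already been found (the feature-point grouping step is not shown), and it stands in for the trained classifier with a precomputed dictionary of category scores. All function names, the box format `(top, left, bottom, right)`, and the category names are hypothetical.

```python
from collections import deque


def connected_components(mask):
    """Label 4-connected foreground pixels. Returns (label_map, count)."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not labels[y][x]:
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:  # breadth-first flood fill
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def fragment_by_text_boxes(mask, boxes):
    """Split foreground at text-line box boundaries (hypothetical helper).

    Pixels inside any box and pixels outside all boxes are labeled
    separately, so a component that straddles a box edge is broken
    into a text-line fragment and an extraneous-marking fragment.
    """
    h, w = len(mask), len(mask[0])
    inside = [[False] * w for _ in range(h)]
    for top, left, bottom, right in boxes:  # inclusive coordinates (assumed)
        for y in range(max(top, 0), min(bottom, h - 1) + 1):
            for x in range(max(left, 0), min(right, w - 1) + 1):
                inside[y][x] = True
    text_mask = [[bool(mask[y][x]) and inside[y][x] for x in range(w)]
                 for y in range(h)]
    rest_mask = [[bool(mask[y][x]) and not inside[y][x] for x in range(w)]
                 for y in range(h)]
    text_labels, n_text = connected_components(text_mask)
    rest_labels, n_rest = connected_components(rest_mask)
    # Merge the two label maps; offset the "rest" ids so ids stay unique.
    fragments = [
        [text_labels[y][x] or (rest_labels[y][x] + n_text
                               if rest_labels[y][x] else 0)
         for x in range(w)]
        for y in range(h)
    ]
    return fragments, n_text + n_rest


def label_pixels(fragments, scores):
    """Give every pixel of a fragment that fragment's top-scoring category.

    `scores` maps fragment id -> {category: score}; it stands in for the
    output of a trained classifier. Background pixels get None.
    """
    best = {fid: max(cats, key=cats.get) for fid, cats in scores.items()}
    return [[best.get(fid) if fid else None for fid in row]
            for row in fragments]
```

For example, a single stroke that runs through a text line and continues below it is split into two fragments at the box boundary, and each fragment then receives one label for all of its pixels, regardless of which connected component it originally belonged to.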
