Document image processing method and apparatus

  • US 20120045129A1
  • Filed: 05/18/2011
  • Published: 02/23/2012
  • Est. Priority Date: 08/17/2010
  • Status: Active Grant
First Claim

1. A method for processing a document image, comprising:

  • performing horizontal text line extraction on the document image, to obtain horizontal text lines, the number of rows of the horizontal text lines being represented by Nh;

    performing vertical text line extraction on the document image, to obtain vertical text lines, the number of columns of the vertical text lines being represented by Nv;

    providing an overlapping matrix represented by MO with Nh rows and Nv columns, a value of an element represented by MO(i, j) of the ith row and the jth column of the overlapping matrix MO indicating an overlapping relation between the ith row of horizontal text lines and the jth column of vertical text lines, where 1 ≤ i ≤ Nh and 1 ≤ j ≤ Nv;

    merging the overlapping matrix MO in the vertical direction, so that a value of an element of the overlapping matrix MO indicating an overlapping relation between a column of vertical text lines and each of a plurality of rows of horizontal text lines is set as a same value if the column of vertical text lines overlaps with the plurality of rows of horizontal text lines simultaneously;

    merging the overlapping matrix MO in the horizontal direction, so that a value of an element of the overlapping matrix MO indicating an overlapping relation between a row of horizontal text lines and each of a plurality of columns of vertical text lines is set as a same value if the row of horizontal text lines overlaps with the plurality of columns of vertical text lines simultaneously;

    determining one or more text overlapping regions in the document image, based on the values of the elements of the merged overlapping matrix MO;

    counting the total number of strokes or pixel points in the horizontal text lines and in the vertical text lines, respectively, within one of the one or more text overlapping regions; and

    determining an orientation of the one of the one or more text overlapping regions is a horizontal orientation if the total number of strokes or pixel points in the horizontal text lines is larger than that in the vertical text lines, otherwise, determining the orientation of the one of the one or more text overlapping regions is a vertical orientation.
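
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not say how overlap is tested, what values MO holds, or how text lines are stored. The sketch assumes each text line is a dict with a hypothetical "box" field (x0, y0, x1, y1) and a hypothetical "pixels" count, treats two lines as overlapping when their boxes intersect, gives each overlap a unique integer label, and repeats the vertical and horizontal merges until the labels stop changing, so that every connected group of overlaps ends up with one shared value.

```python
from collections import defaultdict


def boxes_overlap(a, b):
    """Return True if two (x0, y0, x1, y1) boxes intersect with positive area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def build_overlap_matrix(h_lines, v_lines):
    """Build MO with Nh rows and Nv columns; a nonzero entry means 'overlaps'."""
    mo = [[0] * len(v_lines) for _ in h_lines]
    next_label = 1
    for i, h in enumerate(h_lines):
        for j, v in enumerate(v_lines):
            if boxes_overlap(h["box"], v["box"]):
                mo[i][j] = next_label  # assumption: one unique label per overlap
                next_label += 1
    return mo


def merge_overlap_matrix(mo):
    """Merge labels column-wise, then row-wise, until nothing changes."""
    mo = [row[:] for row in mo]  # work on a copy
    nh = len(mo)
    nv = len(mo[0]) if mo else 0
    changed = True
    while changed:  # assumption: iterate to a fixed point
        changed = False
        for j in range(nv):  # vertical merge: one value per column
            labels = [mo[i][j] for i in range(nh) if mo[i][j]]
            if not labels:
                continue
            smallest = min(labels)
            for i in range(nh):
                if mo[i][j] and mo[i][j] != smallest:
                    mo[i][j] = smallest
                    changed = True
        for i in range(nh):  # horizontal merge: one value per row
            labels = [x for x in mo[i] if x]
            if not labels:
                continue
            smallest = min(labels)
            for j in range(nv):
                if mo[i][j] and mo[i][j] != smallest:
                    mo[i][j] = smallest
                    changed = True
    return mo


def classify_regions(mo, h_lines, v_lines):
    """Map each region label to 'horizontal' or 'vertical' by pixel count."""
    rows, cols = defaultdict(set), defaultdict(set)
    for i, row in enumerate(mo):
        for j, label in enumerate(row):
            if label:
                rows[label].add(i)
                cols[label].add(j)
    result = {}
    for label in rows:
        h_total = sum(h_lines[i]["pixels"] for i in rows[label])
        v_total = sum(v_lines[j]["pixels"] for j in cols[label])
        result[label] = "horizontal" if h_total > v_total else "vertical"
    return result


# Made-up example: two overlapping regions, one of each orientation.
h_lines = [
    {"box": (0, 0, 100, 10), "pixels": 300},
    {"box": (0, 12, 100, 22), "pixels": 280},
    {"box": (200, 0, 300, 10), "pixels": 50},
]
v_lines = [
    {"box": (0, 0, 10, 22), "pixels": 120},
    {"box": (12, 0, 22, 22), "pixels": 110},
    {"box": (200, 0, 210, 60), "pixels": 400},
    {"box": (212, 0, 222, 60), "pixels": 380},
]
merged = merge_overlap_matrix(build_overlap_matrix(h_lines, v_lines))
print(classify_regions(merged, h_lines, v_lines))
# prints {1: 'horizontal', 5: 'vertical'}
```

In the example, the first two horizontal lines and first two vertical lines form one region whose horizontal lines hold more pixels (580 versus 230), so it is labeled horizontal; the second region is labeled vertical (50 versus 780). Stroke counts could be substituted for pixel counts without changing the structure.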
