Document image processing apparatus, document image processing method, and computer-readable recording medium having recorded document image processing program

  • US 8,611,666 B2
  • Filed: 03/19/2010
  • Issued: 12/17/2013
  • Est. Priority Date: 03/27/2009
  • Status: Expired due to Fees
First Claim

1. A document image processing apparatus comprising:

  • a memory for storing a document image; and

  • a controller for controlling extraction of an index region from said document image, wherein said controller is configured to

    i) classify a plurality of character string element regions constituting said document image into small regions and large regions,

    ii) determine each small region positioned just before said large region according to a reading order as a first candidate, as a first determining process,

    iii) determine at least one part of said first candidates as a first index, by performing an evaluating process to evaluate whether or not each said first candidate is an index, based on a difference in feature from the related large region, with respect to each said first candidate, as a first evaluating process,

    iv) determine each small region positioned just before said first index according to the reading order as a second candidate, as a second determining process,

    v) determine at least one part of said second candidates as a second index, by performing an evaluating process to evaluate whether or not said second candidate is the index, based on a difference in feature from the related first index, with respect to each said second candidate, as a second evaluating process, and

    vi) extract the small regions determined as said first index and said second index, as said index region

    wherein

    in said first evaluating process, said controller

      sets a first feature section for each said first candidate as for a style type different in feature from a corresponding related large region that represents said related large region corresponding to the intended first candidate among a plurality of style types, said first feature section including a feature of said intended first candidate region but not including a feature of said corresponding related large region,

      groups into region groups at least one or both of the related large regions and the first candidates having the feature included in said set first feature section,

      calculates a first index evaluation degree, based on a number of members of each region group with respect to each said first candidate, and

      determines whether or not a logical element of each said first candidate is the index, based on said calculated first index evaluation degree, and

    in said second evaluating process, the controller

      sets a second feature section for each said second candidate as for a style type different in feature from a corresponding related first index that represents said related first index corresponding to the intended second candidate among said plurality of style types, said second feature section including a feature of said intended second candidate region but not including a feature of said corresponding related first index,

      groups into region groups at least one or both of the related first indexes and the second candidates having the feature included in said set second feature section,

      calculates a second index evaluation degree, based on a number of members of each region group with respect to each said second candidate, and

      determines whether or not a logical element of each said second candidate is the index, based on said calculated second index evaluation degree.
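
The two-pass flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and the claim leaves most specifics open, so the following are assumptions made only for illustration: the `Region` type and its fields, the rule that a region is "small" when it has at most one text line, the reduction of a "feature section" to the set of (style type, value) pairs in which a candidate differs from its related region, the choice to group candidates only (the claim also permits grouping the related regions), the use of the largest group size as the evaluation degree, and the `min_members` threshold.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical character string element region, listed in reading order."""
    text: str
    lines: int    # number of text lines in the region
    style: dict   # style type -> value, e.g. {"font_size": 14, "bold": True}


def is_small(region, max_lines=1):
    # Assumed classification rule: at most `max_lines` lines counts as small.
    return region.lines <= max_lines


def candidates_before(regions, is_target):
    """Return (candidate, related) index pairs: each small region that sits
    immediately before a target region in reading order."""
    return [(i - 1, i) for i in range(1, len(regions))
            if is_target(i) and is_small(regions[i - 1])]


def evaluate(regions, pairs, min_members=2):
    """Keep the candidates whose distinguishing style is shared by enough
    other candidates (assumed form of the 'index evaluation degree')."""
    # Feature section: features of the candidate that its related region lacks.
    sections = {
        cand: {(k, v) for k, v in regions[cand].style.items()
               if regions[rel].style.get(k) != v}
        for cand, rel in pairs
    }
    # Region groups: candidates sharing a feature; count the members of each.
    group_size = Counter(f for sec in sections.values() for f in sec)
    return {cand for cand, sec in sections.items()
            if max((group_size[f] for f in sec), default=0) >= min_members}


def extract_index_regions(regions, min_members=2):
    # First pass: small regions just before large regions.
    first_pairs = candidates_before(regions, lambda i: not is_small(regions[i]))
    first_index = evaluate(regions, first_pairs, min_members)
    # Second pass: small regions just before a first index.
    second_pairs = candidates_before(regions, lambda i: i in first_index)
    second_index = evaluate(regions, second_pairs, min_members)
    return sorted(first_index | second_index)


body = {"font_size": 10, "bold": False}
doc = [
    Region("Chapter 1", 1, {"font_size": 18, "bold": True}),
    Region("1.1 Introduction", 1, {"font_size": 14, "bold": True}),
    Region("Body text ...", 5, body),
    Region("Chapter 2", 1, {"font_size": 18, "bold": True}),
    Region("2.1 Method", 1, {"font_size": 14, "bold": True}),
    Region("Body text ...", 6, body),
]
print(extract_index_regions(doc))  # → [0, 1, 3, 4]
```

In this sample the section headings are found in the first pass because they precede body text and differ from it in font size and weight. The chapter headings are found in the second pass because they precede those section headings and differ from them in font size. A one-line paragraph styled like body text would yield an empty feature section and be rejected.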
