IMAGE PROCESSING APPARATUS AND IMAGE PROCESSING METHOD

  • US 20120011429A1
  • Filed: 07/06/2011
  • Published: 01/12/2012
  • Est. Priority Date: 07/08/2010
  • Status: Abandoned Application
First Claim

1. An image processing apparatus comprising:

    an input unit configured to input a document including a plurality of page images;

    a region segmentation unit configured to divide each page image input by the input unit into attribute regions;

    a character recognition unit configured to execute character recognition processing on the regions divided by the region segmentation unit;

    a first detection unit configured to detect a first anchor expression constituted by a specific character string from a result of the character recognition processing executed by the character recognition unit on a text attribute region in the page image;

    a first identifier allocation unit configured to allocate a first link identifier to the first anchor expression detected by the first detection unit;

    a first graphic data generation unit configured to generate graphic data to be used to identify the first anchor expression detected by the first detection unit and associate the generated graphic data with the first link identifier allocated by the first identifier allocation unit;

    a first table updating unit configured to register the first link identifier and the first anchor expression in a link configuration management table while associating them with each other and, if an anchor expression similar to the first anchor expression is already registered in the link configuration management table, configured to update the link configuration management table in such a way as to mutually associate the link identifiers of the same anchor expression;

    a second detection unit configured to detect a second anchor expression constituted by a specific character string from a result of the character recognition processing executed by the character recognition unit on a caption region accompanying an object in the page image;

    a second identifier allocation unit configured to allocate a second link identifier to the object accompanied by the caption region where the second anchor expression is detected;

    a second graphic data generation unit configured to generate graphic data to be used to identify the object accompanied by the caption region where the second anchor expression is detected and associate the generated graphic data with the second link identifier allocated by the second identifier allocation unit;

    a second table updating unit configured to register the second link identifier and the second anchor expression in the link configuration management table while associating them with each other and, if an anchor expression similar to the second anchor expression is already registered in the link configuration management table, configured to update the link configuration management table in such a way as to mutually associate the link identifiers of the same anchor expression;

    a page data generation unit configured to generate page data of an electronic document for the page image, using the first link identifier, the first graphic data, the second link identifier, and the second graphic data;

    a first transmission unit configured to transmit the page data of the electronic document generated by the page data generation unit;

    a control unit configured to successively designate each page of the page image input by the input unit as a processing target and control processing repetitively executed by the region segmentation unit, the character recognition unit, the first detection unit, the first identifier allocation unit, the first graphic data generation unit, the first table updating unit, the second detection unit, the second identifier allocation unit, the second graphic data generation unit, the second table updating unit, the page data generation unit, and the first transmission unit; and

    a second transmission unit configured to generate link configuration information to be used to link the first link identifier with the second link identifier included in the electronic document based on the link configuration management table updated by the first table updating unit and the second table updating unit, and configured to transmit the generated link configuration information.
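
The page-by-page flow recited in the claim can be illustrated with a minimal sketch. It is not the applicant's implementation: it assumes region segmentation and character recognition have already produced a body text and a list of caption strings per page, it omits graphic data generation and transmission, and the names `ANCHOR_RE`, `normalize`, `LinkTable`, `process_page`, and `build_document` are hypothetical. The regular expression is one possible definition of a "specific character string"; the claim does not fix one.

```python
import itertools
import re
from dataclasses import dataclass, field

# Hypothetical anchor pattern: matches "Fig. 1", "Figure 1", "Table 2", etc.
ANCHOR_RE = re.compile(r"\b(Fig\.|Figure|Table)\s*(\d+)", re.IGNORECASE)


def normalize(match):
    """Map similar anchor expressions ("Fig. 1", "Figure 1") to one key."""
    kind = match.group(1).lower().rstrip(".")
    if kind.startswith("fig"):
        kind = "fig"
    return f"{kind}{match.group(2)}"


@dataclass
class LinkTable:
    """Stand-in for the link configuration management table."""
    text_ids: dict = field(default_factory=dict)    # anchor -> first link identifiers
    object_ids: dict = field(default_factory=dict)  # anchor -> second link identifiers
    _counter: itertools.count = field(default_factory=lambda: itertools.count(1))

    def register(self, anchor, is_object):
        # Allocate a new link identifier and associate it with any
        # identifiers already registered under the same anchor.
        n = next(self._counter)
        link_id = f"obj_{n}" if is_object else f"text_{n}"
        side = self.object_ids if is_object else self.text_ids
        side.setdefault(anchor, []).append(link_id)
        return link_id

    def link_configuration(self):
        # Only anchors seen in both body text and a caption can be linked.
        shared = sorted(self.text_ids.keys() & self.object_ids.keys())
        return {a: {"text": self.text_ids[a], "objects": self.object_ids[a]}
                for a in shared}


def process_page(page, table):
    """Handle one page: detect anchors, allocate identifiers, update the table."""
    page_data = []
    for m in ANCHOR_RE.finditer(page["body"]):   # first detection unit
        anchor = normalize(m)
        page_data.append((table.register(anchor, is_object=False), anchor))
    for caption in page["captions"]:             # second detection unit
        m = ANCHOR_RE.search(caption)
        if m:
            anchor = normalize(m)
            page_data.append((table.register(anchor, is_object=True), anchor))
    return page_data


def build_document(pages):
    """Process pages in order, then derive link configuration from the table."""
    table = LinkTable()
    all_page_data = [process_page(p, table) for p in pages]
    return table, all_page_data


pages = [
    {"body": "As shown in Fig. 1, the unit reads pages.", "captions": []},
    {"body": "See Table 2 and Figure 1.", "captions": ["Fig. 1 System overview"]},
]
table, all_page_data = build_document(pages)
print(table.link_configuration())
```

In this sketch each page's data is complete as soon as that page is processed, while the link configuration is derived only once, after the last page, from the accumulated table. That mirrors the split in the claim between the first transmission unit (per page) and the second transmission unit (link configuration information).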
