Document image generation apparatus, document image generation method and recording medium

  • US 8,503,786 B2
  • Filed: 11/05/2010
  • Issued: 08/06/2013
  • Est. Priority Date: 11/06/2009
  • Status: Active Grant
First Claim

1. A document image generation apparatus that generates, on the basis of an image representing a document including plural lines, an image representing a supplementary annotation added document in which a supplementary annotation corresponding to a word or a phrase composed of plural words included in the document is added, comprising:

  • an original document image obtaining component configured to obtain an original document image representing a document, wherein the original document image obtaining component is configured to obtain the original document image from a scanner, and wherein further the document is a text document;

    a character recognizing component including a memory and processor configured to recognize a character included in the original document image obtained by the original document image obtaining component and identifies a position of the character in the original document image;

    a supplementary annotation obtaining component including a memory and processor configured to determine a meaning of a word or a phrase included in the document constructed of a plurality of the recognized characters by the character recognizing component through a natural language processing performed on the document, and obtains a supplementary annotation corresponding to the meaning of each word or phrase;

    a position determining component including a memory and processor configured to determine, as a position at which the obtained supplementary annotation corresponding to each word or phrase should be placed in a document, a position in an interline space near a word or a phrase in an original document image on the basis of a position of the character recognized by the character recognizing component, wherein the position determining component further comprises, a phrase judging component configured to judge whether a phrase for which a supplementary annotation is obtained is a discontinuous phrase in which plural words included in the phrase are discontinuously placed in the document; and

    an annotation arrangement position determining component configured to determine, as a position at which the supplementary annotation should be placed in a document, a position in an interline space in the original document image near any one of a head word in a discontinuous phrase, a continuous word string included in the discontinuous phrase and the longest word in the discontinuous phrase, in the case that the phrase for which a supplementary annotation is obtained is the discontinuous phrase; and

    an image generator including a memory and processor configured to generate an image representing a supplementary annotation added document by superimposing a supplementary annotation text layer on an original document image layer configured from an original document image, the supplementary annotation text layer including each supplementary annotation placed at a position corresponding to a position determined in the original document image by the position determining component.
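
The phrase judging and annotation arrangement steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Word` record, the function names, the `strategy` values and the `gap` parameter are all hypothetical, and it assumes that word bounding boxes are already available from character recognition, that the anchor words share one line, and that the annotation goes in the interline space below the anchor (the claim only says "near").

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Word:
    text: str
    index: int     # position of the word in reading order (assumed known)
    x: float       # left edge of the word's bounding box
    y: float       # top edge of the word's bounding box
    width: float
    height: float


def is_discontinuous(phrase: List[Word]) -> bool:
    """Return True if the phrase's words are not all adjacent in reading order."""
    ordered = sorted(phrase, key=lambda w: w.index)
    return any(b.index - a.index != 1 for a, b in zip(ordered, ordered[1:]))


def anchor_words(phrase: List[Word], strategy: str = "head") -> List[Word]:
    """Pick the word(s) the annotation is placed near.

    For a discontinuous phrase, `strategy` selects one of the three options
    named in the claim: the head word ("head"), a continuous word string
    ("run", here the longest one), or the longest word ("longest").
    """
    ordered = sorted(phrase, key=lambda w: w.index)
    if not is_discontinuous(ordered):
        return ordered  # continuous phrase: anchor on the whole phrase
    if strategy == "head":
        return [ordered[0]]
    if strategy == "longest":
        return [max(ordered, key=lambda w: len(w.text))]
    if strategy == "run":
        # Split the phrase into runs of adjacent words and keep the longest run.
        runs = [[ordered[0]]]
        for word in ordered[1:]:
            if word.index - runs[-1][-1].index == 1:
                runs[-1].append(word)
            else:
                runs.append([word])
        return max(runs, key=len)
    raise ValueError(f"unknown strategy: {strategy}")


def annotation_position(
    phrase: List[Word], strategy: str = "head", gap: float = 2.0
) -> Tuple[float, float]:
    """Return (x, y) of the annotation's top-left corner.

    Assumption: the annotation sits `gap` units below the anchor's bottom edge.
    """
    anchor = anchor_words(phrase, strategy)
    left = min(w.x for w in anchor)
    bottom = max(w.y + w.height for w in anchor)
    return (left, bottom + gap)


# Example: "take ... into account" in the sentence "take it into account".
take = Word("take", 0, 10.0, 100.0, 40.0, 12.0)
into = Word("into", 2, 80.0, 100.0, 35.0, 12.0)
account = Word("account", 3, 120.0, 100.0, 60.0, 12.0)
phrase = [take, into, account]
```

With these hypothetical coordinates, `annotation_position(phrase, "head")` anchors the annotation under "take", `"run"` anchors it under "into account", and `"longest"` anchors it under "account". The resulting coordinates would then be used to place the annotation in a text layer superimposed on the original image layer.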
