Method and apparatus for detecting an orientation of characters in a document image
Abstract
An apparatus and method for extracting a predetermined number of consecutive characters from a document image; converting the consecutive characters into a set of symbols using layout information obtained from the consecutive characters; and detecting an orientation of the consecutive characters using occurrence probabilities of the set of symbols.
34 Citations
34 Claims
1. An image processing method, comprising:
using a processor to perform the steps of:
  extracting a predetermined number of consecutive characters from a document image;
  converting the consecutive characters into a set of symbols using layout information obtained from the consecutive characters; and
  detecting an orientation of the consecutive characters based on occurrence probabilities of the set of symbols obtained using a plurality of reference images, each reference image corresponding to one of the predetermined orientations,
  wherein the plurality of reference images includes four types of reference images, each type corresponding to one of four orientations, the four types of reference images being prepared for at least one of a vertical character line and a horizontal character line.
Dependent claims: 2, 3, 4, 5.

6. An image processing method, comprising:
using a processor to perform the steps of:
  extracting a predetermined number of consecutive characters from a document image;
  converting the consecutive characters into a set of symbols using layout information obtained from the consecutive characters; and
  detecting an orientation of the consecutive characters based on occurrence probabilities of the set of symbols obtained using a plurality of reference images, each reference image corresponding to one of the predetermined orientations,
  wherein the converting step comprises the steps of:
    quantizing the layout information into a set of integers; and
    assigning each of the integers with the corresponding symbol to generate the set of symbols.
Dependent claims: 7.
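
The converting step of claim 6 (quantize the layout information into integers, then assign a symbol to each integer) can be illustrated with a minimal sketch. The claim does not say which layout measurements are used or how many quantization levels there are, so the normalized input range [0, 1], the default of 8 levels, and the names `quantize` and `to_symbols` are all assumptions made for illustration only.

```python
from typing import List


def quantize(value: float, levels: int = 8) -> int:
    """Map a normalized layout value in [0, 1] to an integer in [0, levels - 1].

    Assumption: the layout information has already been normalized,
    for example by dividing a character height by the line height.
    """
    clipped = min(max(value, 0.0), 1.0)
    # The upper bound 1.0 would land in bin `levels`, so clamp it to the last bin.
    return min(int(clipped * levels), levels - 1)


def to_symbols(layout_values: List[float], levels: int = 8) -> List[str]:
    """Assign one symbol per quantized integer (hypothetical naming: 's0', 's1', ...)."""
    return [f"s{quantize(v, levels)}" for v in layout_values]


if __name__ == "__main__":
    print(to_symbols([0.1, 0.5, 0.99]))
```

Under these assumptions, each character contributes one symbol, and a character line becomes a symbol sequence that a language-model style scorer can consume.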

8. An image processing method comprising:
using a processor to perform the steps of:
  extracting a predetermined number of consecutive characters from a document image;
  converting the consecutive characters into a set of symbols using layout information obtained from the consecutive characters; and
  detecting an orientation of the consecutive characters based on occurrence probabilities of the set of symbols obtained using a plurality of reference images, each reference image corresponding to one of the predetermined orientations,
  wherein the set of symbols are expressed as 8-bit data.
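
One way to read the 8-bit limitation of claim 8 is that each symbol is an integer identifier that fits in a single byte. The sketch below assumes exactly that; the helper names `pack_symbols` and `unpack_symbols` are hypothetical, and the claim itself does not prescribe any particular encoding.

```python
from typing import List


def pack_symbols(symbol_ids: List[int]) -> bytes:
    """Store each symbol identifier in one byte (assumed range 0..255)."""
    if any(not 0 <= s <= 255 for s in symbol_ids):
        raise ValueError("symbol identifiers must fit in 8 bits")
    return bytes(symbol_ids)


def unpack_symbols(data: bytes) -> List[int]:
    """Recover the list of symbol identifiers from the packed bytes."""
    return list(data)


if __name__ == "__main__":
    packed = pack_symbols([0, 4, 7])
    print(packed, unpack_symbols(packed))
```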

9. An image processing method comprising:
using a processor to perform the steps of:
  extracting a predetermined number of consecutive characters from a document image;
  converting the consecutive characters into a set of symbols using layout information obtained from the consecutive characters; and
  detecting an orientation of the consecutive characters based on occurrence probabilities of the set of symbols obtained using a plurality of reference images, each reference image corresponding to one of the predetermined orientations,
  wherein the detecting step comprises the steps of:
    obtaining a plurality of n-gram models each trained on one of the plurality of reference images;
    calculating the occurrence probabilities of the set of symbols for each of the predetermined orientations using the n-gram models; and
    selecting one of the predetermined orientations corresponding to the occurrence probability having a largest value.
Dependent claims: 10, 11, 12, 13.
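
The detecting step of claim 9 (one n-gram model per orientation, score the symbol sequence under each, keep the orientation with the largest probability) might be sketched as follows. This is an illustrative sketch rather than the patented implementation: the add-one smoothing, the use of log-probabilities to avoid underflow, and the names `NGramModel` and `detect_orientation` are assumptions not found in the claim.

```python
import math
from collections import Counter
from typing import Dict, Sequence


class NGramModel:
    """Count-based n-gram model with add-one smoothing (smoothing is an assumption)."""

    def __init__(self, n: int, vocab_size: int) -> None:
        self.n = n
        self.vocab_size = vocab_size
        self.ngrams: Counter = Counter()
        self.contexts: Counter = Counter()

    def train(self, symbols: Sequence[str]) -> None:
        """Accumulate counts from the symbol sequence of one reference image."""
        for i in range(len(symbols) - self.n + 1):
            gram = tuple(symbols[i:i + self.n])
            self.ngrams[gram] += 1
            self.contexts[gram[:-1]] += 1

    def log_prob(self, symbols: Sequence[str]) -> float:
        """Sum of smoothed log conditional probabilities over all n-grams."""
        total = 0.0
        for i in range(len(symbols) - self.n + 1):
            gram = tuple(symbols[i:i + self.n])
            numerator = self.ngrams[gram] + 1
            denominator = self.contexts[gram[:-1]] + self.vocab_size
            total += math.log(numerator / denominator)
        return total


def detect_orientation(symbols: Sequence[str], models: Dict[int, NGramModel]) -> int:
    """Return the orientation (in degrees) whose model scores the sequence highest."""
    return max(models, key=lambda orientation: models[orientation].log_prob(symbols))


if __name__ == "__main__":
    upright = NGramModel(n=3, vocab_size=3)
    upright.train(["a", "b", "c"] * 20)
    flipped = NGramModel(n=3, vocab_size=3)
    flipped.train(["c", "b", "a"] * 20)
    print(detect_orientation(["a", "b", "c", "a", "b", "c"], {0: upright, 180: flipped}))
```

In a four-orientation setup the `models` dictionary would simply hold four entries, one per reference image type.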

14. An image processing method, comprising:
using a processor to perform the steps of:
  forming one or more character lines based on a plurality of consecutive characters extracted from a document image, each of the character lines having a plurality of consecutive characters;
  converting the plurality of consecutive characters in the character lines into a set of symbols using layout information obtained from the plurality of consecutive characters in the character line;
  detecting an orientation of one or more of the character lines based on occurrence probabilities of the set of symbols obtained using a plurality of reference images, each reference image corresponding to one of the predetermined orientations; and
  determining an orientation of the document image based on the detected orientations of the character lines,
  wherein the determining step comprises the steps of:
    calculating, for each of the detected orientations, a number of character lines corresponding to the detected orientations; and
    selecting one of the detected orientations having a largest number of character lines.
Dependent claims: 15, 16.
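
The determining step of claim 14 amounts to a majority vote over the per-line orientations. A minimal sketch follows; the function name `document_orientation` is hypothetical, and the tie-breaking behavior (the orientation seen first wins) is an assumption, since the claim does not address ties.

```python
from collections import Counter
from typing import Sequence


def document_orientation(line_orientations: Sequence[int]) -> int:
    """Pick the orientation detected for the largest number of character lines."""
    if not line_orientations:
        raise ValueError("at least one character line is required")
    counts = Counter(line_orientations)
    # most_common(1) returns the (orientation, count) pair with the highest count.
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    print(document_orientation([0, 0, 90, 180, 0]))
```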

17. An image processing method, comprising:
using a processor to perform the steps of:
  forming one or more character lines based on a plurality of consecutive characters extracted from a document image, each of the character lines having a plurality of consecutive characters;
  converting the plurality of consecutive characters in the character lines into a set of symbols using layout information obtained from the plurality of consecutive characters in the character line;
  detecting an orientation of one or more of the character lines based on occurrence probabilities of the set of symbols obtained using a plurality of reference images, each reference image corresponding to one of the predetermined orientations; and
  determining an orientation of the document image based on the detected orientations of the character lines,
  wherein the determining step comprises the steps of:
    obtaining, for each of the one or more character lines, the number of symbols contained in the character line; and
    selecting one of the detected orientations based on the obtained number of symbols.
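
Claim 17 says only that the selection is "based on the obtained number of symbols". One plausible reading, shown here purely as an assumption, is to weight each line's vote by its symbol count so that long lines count for more than short ones. The name `weighted_orientation` and the `(orientation, symbol_count)` input format are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple


def weighted_orientation(lines: Sequence[Tuple[int, int]]) -> int:
    """Select the orientation with the largest total symbol count.

    Each entry is (detected_orientation, number_of_symbols_in_line).
    Weighting by symbol count is one interpretation, not the claim's wording.
    """
    if not lines:
        raise ValueError("at least one character line is required")
    totals: Dict[int, int] = defaultdict(int)
    for orientation, symbol_count in lines:
        totals[orientation] += symbol_count
    return max(totals, key=lambda orientation: totals[orientation])


if __name__ == "__main__":
    # Two short lines vote for 0 degrees, one long line votes for 90 degrees.
    print(weighted_orientation([(0, 5), (0, 4), (90, 30)]))
```

With this weighting, the single 30-symbol line outweighs the two short lines, which is where the result would differ from a plain count of character lines.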

18. An image processing apparatus, comprising:
  means for extracting a predetermined number of consecutive characters from a document image;
  means for converting the predetermined number of consecutive characters into a set of symbols using layout information obtained from the predetermined number of consecutive characters; and
  means for detecting an orientation of the predetermined number of consecutive characters based on occurrence probabilities of the set of symbols obtained using a plurality of reference images, each reference image corresponding to one of the predetermined orientations,
  wherein the plurality of reference images includes four types of reference images, each type corresponding to one of four orientations, the four types of reference images being prepared for at least one of a vertical character line and a horizontal character line.
Dependent claims: 19, 20, 21, 22.

23. An image processing apparatus comprising:
  means for extracting a predetermined number of consecutive characters from a document image;
  means for converting the consecutive characters into a set of symbols using layout information obtained from the consecutive characters;
  means for detecting an orientation of the consecutive characters based on occurrence probabilities of the set of symbols obtained using a plurality of reference images, each reference image corresponding to one of predetermined orientations; and
  means for storing a plurality of n-gram models each trained on one of a plurality of reference document images,
  wherein at least one of the plurality of n-gram models is used by the detecting means to detect the orientation of the consecutive characters.

24. An image processing system, comprising:
  a processor; and
  a storage device configured to store a plurality of instructions which, when activated by the processor, cause the processor to perform an image processing operation comprising:
    extracting a predetermined number of consecutive characters from a document image;
    converting the predetermined number of consecutive characters into a set of symbols using layout information obtained from the predetermined number of consecutive characters; and
    detecting an orientation of the predetermined number of consecutive characters based on occurrence probabilities of the set of symbols obtained using a plurality of reference images, each reference image corresponding to one of the predetermined orientations,
    wherein the plurality of reference images includes four types of reference images, each type corresponding to one of four orientations, the four types of reference images being prepared for at least one of a vertical character line and a horizontal character line.
Dependent claims: 25.

26. A computer readable medium storing computer instructions for performing an image processing operation comprising:
  extracting a predetermined number of consecutive characters from a document image;
  converting the predetermined number of consecutive characters into a set of symbols using layout information obtained from the predetermined number of consecutive characters; and
  detecting an orientation of the predetermined number of consecutive characters based on occurrence probabilities of the set of symbols obtained using a plurality of reference images, each reference image corresponding to one of the predetermined orientations,
  wherein the plurality of reference images includes four types of reference images, each type corresponding to one of four orientations, the four types of reference images being prepared for at least one of a vertical character line and a horizontal character line.
Dependent claims: 27.

28. An image processing method comprising:
using a processor to perform the steps of:
  extracting a predetermined number of consecutive characters from a document image;
  converting the consecutive characters into a set of symbols using layout information obtained from the consecutive characters; and
  detecting an orientation of the consecutive characters based on occurrence probabilities of the set of symbols obtained using a plurality of reference images, each reference image corresponding to one of the predetermined orientations,
  wherein the predetermined number of consecutive characters is three for trigrams.
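
For the trigram case of claim 28, extracting "a predetermined number of consecutive characters" with that number set to three can be pictured as a sliding window over the characters of a line. The sketch below assumes a stride of one character and uses a hypothetical helper name, `sliding_windows`; neither detail is stated in the claim.

```python
from typing import List, Sequence, Tuple


def sliding_windows(chars: Sequence[str], size: int = 3) -> List[Tuple[str, ...]]:
    """Return every run of `size` consecutive items, advancing one item at a time."""
    return [tuple(chars[i:i + size]) for i in range(len(chars) - size + 1)]


if __name__ == "__main__":
    # Placeholder labels stand in for extracted character regions.
    print(sliding_windows(["c1", "c2", "c3", "c4"]))
```

A line shorter than three characters yields no windows under this sketch, so it would contribute nothing to a trigram-based score.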

29. An image processing apparatus comprising:
  means for extracting a predetermined number of consecutive characters from a document image;
  means for converting the consecutive characters into a set of symbols using layout information obtained from the consecutive characters; and
  means for detecting an orientation of the consecutive characters based on occurrence probabilities of the set of symbols obtained using a plurality of reference images, each reference image corresponding to one of predetermined orientations,
  wherein the predetermined number of consecutive characters is three for trigrams.

30. An image processing apparatus comprising:
  means for extracting a predetermined number of consecutive characters from a document image;
  means for converting the consecutive characters into a set of symbols using layout information obtained from the consecutive characters; and
  means for detecting an orientation of the consecutive characters based on occurrence probabilities of the set of symbols obtained using a plurality of reference images, each reference image corresponding to one of predetermined orientations,
  wherein the plurality of reference images includes four types of reference images, each type corresponding to one of four orientations, the four types of reference images being prepared for at least one of a vertical character line and a horizontal character line.

31. An image processing system comprising:
  a processor; and
  a storage device configured to store a plurality of instructions, which when activated by the processor, cause the processor to perform an image processing operation comprising:
    extracting a predetermined number of consecutive characters from a document image;
    converting the predetermined number of consecutive characters into a set of symbols using layout information obtained from the predetermined number of consecutive characters; and
    detecting an orientation of the predetermined number of consecutive characters based on occurrence probabilities of the set of symbols obtained using a plurality of reference images, each reference image corresponding to one of predetermined orientations,
    wherein the predetermined number of consecutive characters is three for trigrams.

32. An image processing system comprising:
  a processor; and
  a storage device configured to store a plurality of instructions, which when activated by the processor, cause the processor to perform an image processing operation comprising:
    extracting a predetermined number of consecutive characters from a document image;
    converting the predetermined number of consecutive characters into a set of symbols using layout information obtained from the predetermined number of consecutive characters; and
    detecting an orientation of the predetermined number of consecutive characters based on occurrence probabilities of the set of symbols obtained using a plurality of reference images, each reference image corresponding to one of predetermined orientations,
    wherein the plurality of reference images includes four types of reference images, each type corresponding to one of four orientations, the four types of reference images being prepared for at least one of a vertical character line and a horizontal character line.

33. A computer readable medium storing computer instructions for performing an image processing operation comprising:
  extracting a predetermined number of consecutive characters from a document image;
  converting the consecutive characters into a set of symbols using layout information obtained from the consecutive characters; and
  detecting an orientation of the consecutive characters based on occurrence probabilities of the set of symbols obtained using a plurality of reference images, each reference image corresponding to one of predetermined orientations,
  wherein the predetermined number of consecutive characters is three for trigrams.

34. A computer readable medium storing computer instructions for performing an image processing operation comprising:
  extracting a predetermined number of consecutive characters from a document image;
  converting the consecutive characters into a set of symbols using layout information obtained from the consecutive characters; and
  detecting an orientation of the consecutive characters based on occurrence probabilities of the set of symbols obtained using a plurality of reference images, each reference image corresponding to one of predetermined orientations,
  wherein the plurality of reference images includes four types of reference images, each type corresponding to one of four orientations, the four types of reference images being prepared for at least one of a vertical character line and a horizontal character line.
Specification