Method of perspective correction for Devanagari text
First Claim
1. A method to improve automatic recognition of text, the method comprising:
receiving a plurality of regions in an image of a scene of real world captured by a camera;
rotating at least the plurality of regions through a common angle φ, to obtain a set of skew-corrected regions;
after the rotating, applying to the set of skew-corrected regions one or more tests that determine presence of text, to identify a subset of regions likely to be text;
after the applying, determining a slant angle θ of at least a portion of a region, by combining a plurality of angles of a plurality of lines relative to a common direction, each line in the plurality of lines representing multiple line segments in the region that are at least one pixel wide, located adjacent to one another, and formed by pixels of text;
using the slant angle θ to change first coordinates of at least pixels in the portion, whereby a first height at a first end of the portion and a second height at a second end of the portion remain unchanged after the using; and
storing in a memory, at least changed first coordinates generated by the using;
wherein the receiving, the rotating, the applying, the determining, the using and the storing are performed by one or more processors.
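The "using the slant angle θ" step can be pictured as a horizontal shear. The following is a minimal sketch, not the patented implementation: it assumes the "first coordinates" are x coordinates, that y grows downward, and that a slanted stroke satisfies x = x0 + (y - baseline_y) * tan(θ). The names `shear_correct`, `height`, and `baseline_y` are hypothetical and do not appear in the claims.

```python
import math

def shear_correct(pixels, theta_deg, baseline_y=0.0):
    """Undo a slant of theta_deg (measured from vertical) by shifting x only.

    Assumed convention (not taken from the claims): a slanted stroke
    satisfies x = x0 + (y - baseline_y) * tan(theta), with y growing downward.
    """
    t = math.tan(math.radians(theta_deg))
    # Only the x coordinate changes; y is copied through unchanged, so the
    # height at either end of the portion is the same before and after.
    return [(x - (y - baseline_y) * t, y) for (x, y) in pixels]

def height(pixels):
    """Vertical extent of a set of (x, y) pixels."""
    ys = [y for _, y in pixels]
    return max(ys) - min(ys)

# A stroke slanted by 15 degrees from vertical, starting at x = 10.
slanted = [(10.0 + y * math.tan(math.radians(15.0)), float(y)) for y in range(5)]
corrected = shear_correct(slanted, 15.0)
```

Because the transform touches x alone, the stroke becomes vertical while its height stays the same, which is the property the claim states for the two ends of the portion.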
Abstract
An electronic device and method identify regions that are likely to be text in a natural image or video frame, followed by processing as follows: lines that are nearly vertical are automatically identified in a selected text region, oriented relative to the vertical axis within a predetermined range −max_theta to +max_theta, followed by determination of an angle θ of the identified lines, followed by use of the angle θ to perform perspective correction by warping the selected text region. After perspective correction in this manner, each text region is processed further, to recognize text therein, by performing OCR on each block among a sequence of blocks obtained by slicing the potential text region. Thereafter, the result of text recognition is used to display to the user, either the recognized text or any other information obtained by use of the recognized text.
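The angle determination described in the abstract can be sketched as follows. This is an illustration under stated assumptions rather than the method of the specification: each line is given by hypothetical endpoints `(x_top, y_top, x_bottom, y_bottom)`, the default `max_theta` of 30 degrees is an arbitrary placeholder, and the accepted angles are combined by a plain mean, which is only one possible way of combining them.

```python
import math

def line_angle_from_vertical(x_top, y_top, x_bottom, y_bottom):
    """Signed angle in degrees between a line and the vertical axis."""
    return math.degrees(math.atan2(x_bottom - x_top, y_bottom - y_top))

def estimate_slant(lines, max_theta=30.0):
    """Combine the angles of nearly vertical lines into one slant angle.

    Lines whose angle falls outside [-max_theta, +max_theta] are ignored.
    Returns None when no line qualifies. The mean is an assumed combiner.
    """
    angles = [line_angle_from_vertical(*ln) for ln in lines]
    kept = [a for a in angles if -max_theta <= a <= max_theta]
    if not kept:
        return None
    return sum(kept) / len(kept)

# Two nearly vertical strokes and one nearly horizontal line (rejected).
lines = [(0, 0, 1, 10), (5, 0, 6, 10), (0, 0, 10, 1)]
theta = estimate_slant(lines)
```

A median or a histogram peak could replace the mean if outlier strokes are a concern; the abstract does not fix that choice.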
20 Claims
1. A method to improve automatic recognition of text, the method comprising:
receiving a plurality of regions in an image of a scene of real world captured by a camera;
rotating at least the plurality of regions through a common angle φ, to obtain a set of skew-corrected regions;
after the rotating, applying to the set of skew-corrected regions one or more tests that determine presence of text, to identify a subset of regions likely to be text;
after the applying, determining a slant angle θ of at least a portion of a region, by combining a plurality of angles of a plurality of lines relative to a common direction, each line in the plurality of lines representing multiple line segments in the region that are at least one pixel wide, located adjacent to one another, and formed by pixels of text;
using the slant angle θ to change first coordinates of at least pixels in the portion, whereby a first height at a first end of the portion and a second height at a second end of the portion remain unchanged after the using; and
storing in a memory, at least changed first coordinates generated by the using;
wherein the receiving, the rotating, the applying, the determining, the using and the storing are performed by one or more processors.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A non-transitory computer-readable storage medium comprising a plurality of instructions to at least one processor to improve automatic recognition of text, the plurality of instructions comprising:
first instructions to receive a plurality of regions in an image of a scene of real world captured by a camera;
second instructions to rotate at least the plurality of regions through a common angle φ, to obtain a set of skew-corrected regions;
to execute after execution of the second instructions to rotate, third instructions to apply to the set of skew-corrected regions one or more tests that determine presence of text, to identify a subset of regions likely to be text;
to execute after execution of the third instructions to apply, fourth instructions to determine a slant angle θ of at least a portion of a region in the subset, by combining a plurality of angles of a plurality of lines relative to a common direction, each line in the plurality of lines representing multiple line segments in the region that are at least one pixel wide, located adjacent to one another, and formed by pixels of text;
fifth instructions to use the slant angle θ to change first coordinates of at least pixels in the portion, whereby a first height at a first end of the portion and a second height at a second end of the portion remain unchanged after execution of the fifth instructions; and
sixth instructions to store in a memory, at least changed first coordinates generated by execution of the fifth instructions.
Dependent claims: 9, 10, 11, 12, 13, 14
15. A mobile device comprising:
a camera;
a memory operatively connected to the camera to receive at least an image therefrom;
at least one processor operatively connected to the memory to execute a plurality of instructions stored in the memory;
wherein the plurality of instructions cause the at least one processor to:
rotate at least the plurality of regions through a common angle φ, to obtain a set of skew-corrected regions;
after rotation through the common angle φ, apply to the set of skew-corrected regions one or more tests that determine presence of text, to identify a subset of regions likely to be text;
after application of the one or more tests, determine a slant angle θ of at least a portion of a region in the subset, by combining a plurality of angles of a plurality of lines relative to a common direction, each line in the plurality of lines representing multiple line segments in the region that are at least one pixel wide, located adjacent to one another, and formed by pixels of text;
use the slant angle θ to change first coordinates of at least pixels in the portion, whereby a first height at a first end of the portion and a second height at a second end of the portion remain unchanged after the use; and
store in the memory, at least changed first coordinates generated by the use.
Dependent claims: 16, 17, 18, 19
20. An apparatus to improve automatic recognition of text, the apparatus comprising:
means for receiving a plurality of regions in an image of a scene of real world captured by a camera;
means for rotating at least the plurality of regions through a common angle φ, to obtain a set of skew-corrected regions;
means, operable after rotation through the common angle φ, for applying to the set of skew-corrected regions one or more tests that determine presence of text, to identify a subset of regions likely to be text;
means, operable after application of the one or more tests, for determining a slant angle θ of at least a portion of a region in the subset, by combining a plurality of angles of a plurality of lines relative to a common direction, each line in the plurality of lines representing multiple line segments in the region that are at least one pixel wide, located adjacent to one another, and formed by pixels of text;
means for using the slant angle θ to change first coordinates of at least pixels in the portion, whereby a first height at a first end of the portion and a second height at a second end of the portion remain unchanged after operation of the means for the using; and
means for storing in a memory, at least changed first coordinates generated by the means for using.
Specification