OPTICAL CHARACTER RECOGNITION OF CONNECTED CHARACTERS
First Claim
1. A computer-implemented method for optical character recognition of a language script, the method being executed by one or more processors, the method comprising:
receiving, by the one or more processors, an image comprising a graphical representation of a word written in the language script;
segmenting, by the one or more processors, the word into two or more segments, each segment being determined based on one or more of a variation in a height of the word and a variation in a width of the word, and comprising at least one character;
providing, by the one or more processors, a boundary for each segment of the two or more segments, the boundary enclosing the at least one character of a respective segment, each boundary having an edge with respect to an axis of the image;
normalizing, by the one or more processors, boundaries of the two or more segments by aligning edges of the boundaries; and
labeling, by the one or more processors, each segment of the two or more segments with a respective label, the respective label indicating a language character within the respective boundary.
Abstract
Implementations for optical character recognition of a language script can include actions of receiving an image comprising a graphical representation of a word written in the language script, segmenting the word into two or more segments, each segment being determined based on one or more of a variation in a height of the word and a variation in a width of the word, and including at least one character, providing a boundary for each segment of the two or more segments, the boundary enclosing the at least one character of a respective segment, each boundary having an edge with respect to an axis of the image, normalizing boundaries of the two or more segments by aligning edges of the boundaries, and labeling each segment of the two or more segments with a respective label, the respective label indicating a language character within the respective boundary.
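The actions summarized above can be illustrated with a minimal sketch. It is not the implementation described in the specification: the pixel-count height profile, the `connector_max` threshold, and the `classify` callback are all assumptions introduced here for illustration. The image is taken to be a binary grid (1 = ink), the word is cut wherever the stroke height drops to the thickness of a connecting stroke, and the resulting boundaries are normalized by aligning their top and bottom edges.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # rows of 0/1 pixels, 1 = ink (assumed format)
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive


def column_heights(img: Image) -> List[int]:
    """Height profile of the word: number of ink pixels in each column."""
    return [sum(col) for col in zip(*img)]


def segment_columns(heights: List[int], connector_max: int = 1) -> List[Tuple[int, int]]:
    """Split where the height drops to connector thickness (assumed threshold).

    Columns with height <= connector_max count as gaps or thin connecting
    strokes; each run of taller columns becomes one segment (left, right).
    """
    spans: List[Tuple[int, int]] = []
    start = None
    for x, h in enumerate(heights):
        if h > connector_max and start is None:
            start = x
        elif h <= connector_max and start is not None:
            spans.append((start, x - 1))
            start = None
    if start is not None:
        spans.append((start, len(heights) - 1))
    return spans


def bounding_box(img: Image, left: int, right: int) -> Box:
    """Tightest box enclosing the ink found in columns left..right."""
    rows = [y for y, row in enumerate(img) if any(row[left:right + 1])]
    return (left, rows[0], right, rows[-1])


def normalize(boxes: List[Box]) -> List[Box]:
    """Align the top and bottom edges of all boxes to shared values."""
    if not boxes:
        return []
    top = min(b[1] for b in boxes)
    bottom = max(b[3] for b in boxes)
    return [(left, top, right, bottom) for left, _, right, _ in boxes]


def recognize(img: Image, classify: Callable[[Image, Box], str]) -> List[Tuple[Box, str]]:
    """Segment, bound, normalize, then label each segment via `classify`."""
    spans = segment_columns(column_heights(img))
    boxes = normalize([bounding_box(img, left, right) for left, right in spans])
    return [(box, classify(img, box)) for box in boxes]


# Toy word: a tall stroke and a shorter, wider stroke joined by a baseline.
img = [
    [0, 1, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
# Placeholder labeler (hypothetical): tags a segment by its width in pixels.
result = recognize(img, lambda _img, box: f"w{box[2] - box[0] + 1}")
```

In this toy input the two segments end up with boundaries of equal height, so a downstream classifier would see inputs with consistent vertical extent. A real labeler would replace the placeholder `classify` callback.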
30 Claims
1. A computer-implemented method for optical character recognition of a language script, the method being executed by one or more processors, the method comprising:
receiving, by the one or more processors, an image comprising a graphical representation of a word written in the language script;
segmenting, by the one or more processors, the word into two or more segments, each segment being determined based on one or more of a variation in a height of the word and a variation in a width of the word, and comprising at least one character;
providing, by the one or more processors, a boundary for each segment of the two or more segments, the boundary enclosing the at least one character of a respective segment, each boundary having an edge with respect to an axis of the image;
normalizing, by the one or more processors, boundaries of the two or more segments by aligning edges of the boundaries; and
labeling, by the one or more processors, each segment of the two or more segments with a respective label, the respective label indicating a language character within the respective boundary.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A non-transitory computer-readable storage medium coupled to one or more processors and having instructions stored thereon which, when executed by the one or more processors, cause the one or more processors to perform operations for optical character recognition of a language script, the operations comprising:
receiving an image comprising a graphical representation of a word written in the language script;
segmenting the word into two or more segments, each segment being determined based on one or more of a variation in a height of the word and a variation in a width of the word, and comprising at least one character;
providing a boundary for each segment of the two or more segments, the boundary enclosing the at least one character of a respective segment, each boundary having an edge with respect to an axis of the image;
normalizing boundaries of the two or more segments by aligning edges of the boundaries; and
labeling each segment of the two or more segments with a respective label, the respective label indicating a language character within the respective boundary.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
21. A system, comprising:
a computing device; and
a computer-readable storage device coupled to the computing device and having instructions stored thereon which, when executed by the computing device, cause the computing device to perform operations for optical character recognition of a language script, the operations comprising:
receiving an image comprising a graphical representation of a word written in the language script,
segmenting the word into two or more segments, each segment being determined based on one or more of a variation in a height of the word and a variation in a width of the word, and comprising at least one character,
providing a boundary for each segment of the two or more segments, the boundary enclosing the at least one character of a respective segment, each boundary having an edge with respect to an axis of the image,
normalizing boundaries of the two or more segments by aligning edges of the boundaries, and
labeling each segment of the two or more segments with a respective label, the respective label indicating a language character within the respective boundary.
Dependent claims: 22, 23, 24, 25, 26, 27, 28, 29, 30
Specification