Method and system for detecting text in raster images
First Claim
1. A method for detecting text in raster images, the method comprising the steps of:
converting a raster image into a vector representation of the image;
identifying pairs of shapes of similar size and within a predefined distance of one another;
forming shape graphs from the identified shape pairs;
identifying chains of shapes from the formed shape graphs;
determining characteristic chain lines associated with the identified chains of shapes;
straightening the identified chains of shapes into a straight line based on the corresponding chain lines associated with the respective identified chains of shapes; and
classifying the straightened identified chains as text or non-text using an automatic text classifier.
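The pairing, graph-forming, and chain-finding steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Shape` record, the size-ratio test, the fixed centroid-distance threshold, and the use of connected components as "chains" are all assumptions made for illustration, since the claim does not specify them.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Shape:
    """Hypothetical vector shape reduced to a centroid and a scalar size."""
    cx: float
    cy: float
    size: float  # assumed measure, e.g. bounding-box height


def find_pairs(shapes, size_ratio=1.5, max_dist=20.0):
    """Return index pairs of similar-size shapes within max_dist of each other."""
    pairs = []
    for i in range(len(shapes)):
        for j in range(i + 1, len(shapes)):
            a, b = shapes[i], shapes[j]
            big, small = max(a.size, b.size), min(a.size, b.size)
            # Assumed similarity test: larger/smaller size ratio under a limit.
            if small <= 0 or big / small > size_ratio:
                continue
            # Assumed distance test: Euclidean distance between centroids.
            if hypot(a.cx - b.cx, a.cy - b.cy) <= max_dist:
                pairs.append((i, j))
    return pairs


def build_graph(pairs):
    """Form an undirected shape graph (adjacency sets) from the pairs."""
    graph = defaultdict(set)
    for i, j in pairs:
        graph[i].add(j)
        graph[j].add(i)
    return graph


def find_chains(graph, min_length=3):
    """Treat each connected component with enough shapes as one chain."""
    seen, chains = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:  # breadth-first traversal of one component
            node = queue.popleft()
            component.append(node)
            for nxt in sorted(graph[node]):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        if len(component) >= min_length:
            chains.append(sorted(component))
    return chains
```

With four small shapes spaced 12 units apart, one much larger shape, and one distant shape, this sketch pairs only the neighbouring small shapes and returns a single chain of four indices. The quadratic pair search is kept for clarity; a spatial index would be a natural replacement for large images.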
Abstract
Systems, methods, and applications for detecting text in a raster image include converting a raster image into a vector representation of the image, identifying pairs of shapes of similar size and within a predefined distance of one another, forming shape graphs from the identified shape pairs, identifying chains of shapes from the formed shape graphs, determining characteristic chain lines associated with the identified chains of shapes, straightening the identified chains of shapes into a straight line based on the corresponding chain lines associated with the respective identified chains of shapes, and classifying the straightened identified chains as text or non-text using an automatic text classifier.
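The chain-line and straightening steps can be sketched as follows. The abstract does not say what form a characteristic chain line takes, so this example assumes the simplest case: a straight line through the shape centroids, found as the principal axis of the points, with straightening done by rotating the chain about its mean so that this axis becomes horizontal. A curved chain line would need a different model (for example a polyline), which is not shown here. The function names are hypothetical.

```python
from math import atan2, cos, sin


def chain_line_angle(points):
    """Return (angle, centroid) of the principal axis through 2-D points.

    The angle is measured from the x-axis, in radians.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    # Orientation of the major axis of the 2x2 scatter matrix.
    return 0.5 * atan2(2 * sxy, sxx - syy), (mx, my)


def straighten(points):
    """Rotate points about their centroid so the chain line is horizontal."""
    angle, (mx, my) = chain_line_angle(points)
    c, s = cos(-angle), sin(-angle)
    return [
        ((x - mx) * c - (y - my) * s + mx,
         (x - mx) * s + (y - my) * c + my)
        for x, y in points
    ]
```

For centroids lying on a 45-degree diagonal, `straighten` returns points that share one y-coordinate while keeping their original spacing, which is the property a downstream classifier would rely on.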
26 Claims
1. A method for detecting text in raster images, the method comprising the steps of:
converting a raster image into a vector representation of the image;
identifying pairs of shapes of similar size and within a predefined distance of one another;
forming shape graphs from the identified shape pairs;
identifying chains of shapes from the formed shape graphs;
determining characteristic chain lines associated with the identified chains of shapes;
straightening the identified chains of shapes into a straight line based on the corresponding chain lines associated with the respective identified chains of shapes; and
classifying the straightened identified chains as text or non-text using an automatic text classifier.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
15. A text classification system comprising:
one or more processors configured to receive a raster image;
a raster-to-vector converter which converts the raster image into a vector image comprising a vector representation of the received raster image;
a shape pair detection engine which identifies pairs of shapes of similar size and within a predefined distance of one another;
a chain detection system which forms shape graphs from the identified shape pairs and identifies chains of shapes from the formed shape graphs;
a chain line straightening system which determines characteristic chain lines associated with the identified chains of shapes and which straightens the identified chains into a straight line based on the corresponding chain lines associated with the respective identified chains of shapes; and
a text classifier which classifies the straightened identified chains as text or non-text.
Dependent claims: 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26
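The component structure of the system claim can be pictured as a pipeline of interchangeable stages. The sketch below is only an architectural illustration: the class name `TextClassificationSystem`, its field names, and the callable signatures are hypothetical, and each stage is supplied by the caller rather than implemented here.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class TextClassificationSystem:
    """Hypothetical wiring of the claimed components as pluggable callables."""
    vectorize: Callable[[Any], List[Any]]            # raster-to-vector converter
    detect_pairs: Callable[[List[Any]], List[Tuple[int, int]]]  # shape pair detection engine
    detect_chains: Callable[[List[Any], List[Tuple[int, int]]], List[List[Any]]]  # chain detection system
    straighten: Callable[[List[Any]], List[Any]]     # chain line straightening system
    classify: Callable[[List[Any]], bool]            # text classifier (True = text)

    def run(self, raster_image: Any) -> List[Tuple[List[Any], bool]]:
        """Pass one raster image through every stage in claim order."""
        shapes = self.vectorize(raster_image)
        pairs = self.detect_pairs(shapes)
        chains = self.detect_chains(shapes, pairs)
        straightened = [self.straighten(chain) for chain in chains]
        # Pair each straightened chain with its text / non-text label.
        return [(chain, self.classify(chain)) for chain in straightened]
```

Keeping each stage behind a plain callable makes it possible to swap, for example, the classifier without touching the chain detection logic, which mirrors the separation of components in the claim.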
Specification