Using shape suppression to identify areas of images that include particular shapes
Abstract
Shape suppression is used to identify areas of images that include particular shapes. According to one embodiment, a Vector Quantization (VQ)-based shape classifier is designed to identify the vertical edges of a set of shapes (e.g., English letters and numbers). A shape suppression filter is applied to the candidate areas, which are identified from a vertical edge map according to the edge density, to remove the vertical edges which are not classified as characteristic of shapes. Areas with high enough edge density after the filtering are identified as potential areas of the image that include one or more of the set of shapes.
61 Claims
1. A system to identify areas of text in a video image, the system comprising:
a segmentation module to, receive a vertical edge map of the video image, determine the edge density of a plurality of portions of the vertical edge map, identify, based on the edge densities, one or more candidate text areas of the video image that may contain text;
a shape classifier, coupled to the segmentation module, to recognize a plurality of shapes that may occur in the text; and
a shape suppression module, coupled to the shape classifier, to, receive the one or more candidate text areas, remove, from the one or more candidate text areas, vertical edges that are not recognized as text shapes by the shape classifier, and remove an area from the one or more candidate text areas if the area has a low edge density.
identifying one of the plurality of portions having an edge density that exceeds a threshold value; and
repeatedly increasing the size of the portion by an additional amount as long as the edge density exceeds the threshold value.
4. A system as recited in claim 3, wherein the segmentation module is further to identify the candidate text area by repeatedly decreasing the size of the portion by an amount less than the additional amount as long as the edge density of the amount does not exceed the threshold value.
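The grow-and-trim behaviour recited above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes growth happens only to the right along a single horizontal band, and the names `grow_then_trim`, `col_counts`, `band_height`, `grow_step`, and `trim_step` are hypothetical. The portion is enlarged by a coarse step while its density exceeds the threshold, then trailing slices whose own density does not exceed the threshold are removed with a finer step.

```python
def grow_then_trim(col_counts, band_height, left, width, threshold,
                   grow_step=8, trim_step=2):
    """Grow a dense portion to the right, then trim its sparse tail.

    col_counts[c] is the (assumed) number of edge pixels in column c of a
    horizontal band that is band_height pixels tall. Returns (left, right)
    with right exclusive, or None if the seed portion is not dense enough.
    """
    n = len(col_counts)

    def density(a, b):
        # Edge pixels divided by total pixels in columns a..b-1 of the band.
        return sum(col_counts[a:b]) / (band_height * (b - a)) if b > a else 0.0

    right = min(left + width, n)
    if density(left, right) <= threshold:
        return None

    # Coarse growth: keep enlarging while the whole portion stays dense.
    while right < n and density(left, right) > threshold:
        right = min(right + grow_step, n)

    # Fine trimming: drop trailing slices that are not dense on their own.
    while right - left > trim_step and density(right - trim_step, right) <= threshold:
        right -= trim_step

    return left, right
```

For example, with twenty dense columns followed by twenty empty ones, `grow_then_trim([2] * 20 + [0] * 20, 4, 0, 8, 0.2)` overshoots during growth and then trims back to the dense columns only.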
5. A system as recited in claim 1, wherein the segmentation module is further to output a selected edge map which includes only edges in the identified one or more candidate text areas.
6. A system as recited in claim 1, wherein the system further includes a differential module to receive the video image and to use a differential filter to identify the vertical edges in the video image.
7. A system as recited in claim 6, wherein the differential filter comprises one of:
- a Sobel filter, a Prewitt filter, or a Kirsch filter.
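As an illustration of the differential filter option named above, the following sketch applies the standard 3x3 horizontal-gradient Sobel kernel, which responds to vertical edges. It is a plain-Python sketch for clarity only; the function name `vertical_edge_strength` and the list-of-lists grayscale input are assumptions, and border pixels are simply left at zero.

```python
# Standard Sobel kernel for the horizontal gradient (responds to vertical edges).
SOBEL_X = ((-1, 0, 1),
           (-2, 0, 2),
           (-1, 0, 1))


def vertical_edge_strength(gray):
    """Return the absolute Sobel response for each interior pixel of `gray`.

    `gray` is assumed to be a rectangular list of rows of intensities.
    """
    rows, cols = len(gray), len(gray[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            g = sum(SOBEL_X[i][j] * gray[r - 1 + i][c - 1 + j]
                    for i in range(3) for j in range(3))
            out[r][c] = abs(g)
    return out
```

Transposing the kernel would give the corresponding response for horizontal edges.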
8. A system as recited in claim 6, wherein the system further includes a thinning module coupled to receive the identified vertical edges from the differential module and apply a non-maxima suppression filter to the identified vertical edges to generate a vertical edge map and output the vertical edge map to the segmentation module.
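One way a thinning step of this kind might look is sketched below. It assumes that, for vertical edges, suppression compares each pixel only with its left and right neighbours; the claim does not specify this, and the name `thin_vertical_edges`, the `min_strength` parameter, and the tie-breaking rule are illustrative choices.

```python
def thin_vertical_edges(strength, min_strength=1):
    """Keep a pixel only if it is a local maximum along its row.

    `strength` is an (assumed) map of edge responses. The output is a binary
    edge map: 1 where the response is at least min_strength, strictly greater
    than the left neighbour, and not smaller than the right neighbour.
    """
    rows, cols = len(strength), len(strength[0])
    thinned = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            s = strength[r][c]
            left = strength[r][c - 1] if c > 0 else 0
            right = strength[r][c + 1] if c < cols - 1 else 0
            if s >= min_strength and s > left and s >= right:
                thinned[r][c] = 1
    return thinned
```

The asymmetric comparison keeps exactly one pixel when two neighbouring responses are equal, so a two-pixel-wide response collapses to a one-pixel edge.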
9. A system as recited in claim 1, wherein the segmentation module is further to count a number of edges in each of the plurality of portions of the vertical edge map to determine the edge density of that portion.
10. A system as recited in claim 1, wherein the segmentation module is further to adjust the size of an identified area by analyzing portions of the identified area and leaving the portion as part of the identified area if the number of vertical edges in the portion exceeds a first threshold, and removing the portion from the identified area if the number of vertical edges in the portion does not exceed the first threshold.
11. A system as recited in claim 1, wherein the system further includes a differential module to receive the video image and to use a differential filter to identify horizontal edges in the video image.
12. A system as recited in claim 11, wherein the differential filter comprises one of:
- a Sobel filter, a Prewitt filter, or a Kirsch filter.
13. A system as recited in claim 11, wherein the system further includes a thinning module coupled to receive the identified horizontal edges from the differential module and apply a non-maxima suppression filter to the identified horizontal edges to generate a horizontal edge map.
14. A system as recited in claim 13, wherein the system further includes:
a horizontal alignment module to receive the candidate text areas from the segmentation module, and to determine, for each of the areas, whether the area is to remain a candidate text area based at least in part on the horizontal edge map.
15. A system as recited in claim 14, wherein the horizontal alignment module is further to output each of the candidate text areas to the shape suppression module.
16. A system as recited in claim 14, wherein the horizontal alignment module is further to analyze, for each of the candidate text areas, an upper portion and a lower portion of the area, and to keep the area as a candidate text area if a number of horizontal edges in the upper portion exceeds a threshold and if the number of horizontal edges in the lower portion exceeds the threshold, otherwise to not include the area as a candidate text area.
17. A system as recited in claim 16, wherein the width of each of the upper and lower portions for an identified area is the width of the candidate text area, and wherein the height of each of the upper and lower portions is three pixels.
18. A system as recited in claim 17, wherein the threshold is equal to one-tenth of the total number of pixels in the upper portion.
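The horizontal alignment test of claims 16 to 18 can be sketched as follows. The sketch assumes the upper and lower portions are the top three and bottom three pixel rows of the candidate area, which the claims do not state explicitly; the function name `passes_horizontal_alignment` and the `(top, left, height, width)` area layout are hypothetical.

```python
def passes_horizontal_alignment(h_edge_map, top, left, height, width,
                                strip_height=3):
    """Return True if both the upper and lower strips hold enough horizontal edges.

    h_edge_map is an (assumed) binary horizontal edge map. Each strip spans the
    full width of the area and is strip_height pixels tall; the threshold is
    one-tenth of the number of pixels in a strip.
    """
    if height < strip_height:
        return False  # assumption: areas shorter than one strip are rejected

    def count(first_row):
        return sum(h_edge_map[r][c]
                   for r in range(first_row, first_row + strip_height)
                   for c in range(left, left + width))

    threshold = (strip_height * width) / 10
    upper = count(top)
    lower = count(top + height - strip_height)
    return upper > threshold and lower > threshold
```

For a 10-pixel-wide area each strip contains 30 pixels, so more than 3 horizontal-edge pixels are needed in each strip for the area to remain a candidate.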
19. A system as recited in claim 1, wherein the shape classifier comprises a Vector Quantization (VQ)-based Bayesian classifier.
20. A system as recited in claim 1, wherein the shape classifier is to recognize shapes as text shapes by comparing characteristics of edges in the candidate text areas to characteristics of the plurality of shapes.
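To make the idea of comparing edge characteristics against known shape characteristics concrete, here is a deliberately simplified vector-quantization sketch. It is not the VQ-based Bayesian classifier of claim 19: it swaps the Bayesian decision for a plain nearest-codeword rule, and the feature vectors, the two codebooks, and the names `nearest_distance` and `is_text_shape` are all hypothetical.

```python
def nearest_distance(vec, codebook):
    """Squared Euclidean distance from vec to its closest codeword."""
    return min(sum((a - b) ** 2 for a, b in zip(vec, codeword))
               for codeword in codebook)


def is_text_shape(vec, text_codebook, other_codebook):
    """Label an edge feature vector as text if a text codeword is closest.

    Both codebooks are assumed to have been trained beforehand, for example
    on edge features extracted from letters and digits versus background.
    """
    return nearest_distance(vec, text_codebook) < nearest_distance(vec, other_codebook)
```

A Bayesian variant would additionally weigh each class by its prior probability and by how likely each codeword is under that class, rather than relying on distance alone.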
21. A method comprising:
receiving an image;
identifying edges in the image;
identifying, based on the identified edges, candidate text areas which may contain shapes; and
removing, from the candidate text areas, edges which are not recognized as characteristic of one or more shapes.
determining an edge density of portions of a vertical edge map including at least some of the identified edges;
identifying, based on the edge densities of the portions, one or more areas of the image which may contain shapes; and
using a horizontal edge map including other of the identified edges to both verify and adjust the identified areas.
24. A method as recited in claim 21, further comprising removing a candidate text area with a low edge density from being a candidate text area.
25. A method as recited in claim 21, wherein the shapes comprise one or more of:
- letters, numbers, symbols, punctuation marks, and ideograms.
26. A method as recited in claim 21, wherein the one or more shapes include English language alphanumerics and punctuation.
27. A method as recited in claim 21, wherein the image comprises a video frame.
28. A method as recited in claim 21, wherein the removing is performed based on whether edges are recognized as being characteristic of a shape by a Vector Quantization (VQ)-based Bayesian classifier.
29. A method as recited in claim 21, wherein the identifying edges comprises:
using a differential filter to generate a vertical edge map identifying vertical edge locations in the image; and
using a non-maxima suppression filter to generate a thinned edge map identifying the vertical edges in the image.
30. A method as recited in claim 21, wherein the identifying edges comprises using a Sobel filter to generate a vertical edge map identifying vertical edge locations in the image.
31. A method as recited in claim 23, wherein the determining an edge density for a portion comprises:
dividing a number of pixels in the portion which represent edges by the total number of pixels in the portion.
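The density computation recited above is a simple ratio, sketched here for a rectangular portion of a binary edge map. The function name `edge_density` and the `(top, left, height, width)` parameters are illustrative assumptions.

```python
def edge_density(edge_map, top, left, height, width):
    """Edge pixels in the portion divided by the total pixels in the portion."""
    total = height * width
    if total == 0:
        return 0.0
    edges = sum(1
                for r in range(top, top + height)
                for c in range(left, left + width)
                if edge_map[r][c])
    return edges / total
```

For instance, a 2x4 portion containing three edge pixels has a density of 3/8.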
32. A method as recited in claim 21, wherein the identifying edges in the image comprises:
using a differential filter to generate a horizontal edge map identifying horizontal edge locations in the image; and
using a non-maxima suppression filter to generate a thinned edge map identifying the vertical edges in the image.
33. A method as recited in claim 21, wherein the identifying edges comprises using a Sobel filter to generate a horizontal edge map identifying horizontal edge locations in the image.
34. One or more computer-readable memories containing a computer program that is executable by a processor to perform the method recited in claim 21.
35. A method of identifying areas in a video image that include characters, the method comprising:
receiving an edge map identifying edges in the video image;
identifying, based on the edge map, candidate text areas which may include characters; and
removing, from the candidate text areas, edges which are not recognized as characteristic of one or more characters.
determining an edge density of portions of a vertical edge map including at least some of the identified edges;
identifying, based on the edge densities of the portions, one or more areas of the image which may contain characters; and
using a horizontal edge map including other of the identified edges to both verify and adjust the identified areas.
37. A method as recited in claim 35, further comprising removing a candidate text area with a low edge density from being a candidate text area.
38. A method as recited in claim 35, wherein the removing is performed based on whether edges are recognized as being characteristic of a shape by a Vector Quantization (VQ)-based Bayesian classifier.
39. One or more computer-readable media having stored thereon a computer program to identify areas in a video image that include symbols, wherein the computer program, when executed by one or more processors, causes the one or more processors to perform acts including:
identifying a plurality of vertical edges in the video image;
generating a vertical thinned edge map based on the plurality of vertical edges;
identifying a plurality of horizontal edges in the video image;
generating a horizontal thinned edge map based on the plurality of horizontal edges;
determining an edge density of portions of the vertical thinned edge map;
identifying, as candidate text areas, portions of the vertical thinned edge map having an edge density exceeding a first threshold value;
determining a horizontal edge density of portions of the horizontal thinned edge map;
removing, from the identified candidate text areas, each area having a horizontal edge density in an upper portion and a lower portion that is lower than a second threshold value;
adjusting the upper and lower boundaries of each remaining candidate text area based on the horizontal edges in the upper portion and lower portion of the area;
removing, from each of the remaining candidate text areas, vertical edges which are not recognized as characteristic of shapes;
removing areas from the remaining candidate text areas having an edge density less than a third threshold value; and
outputting any remaining candidate text areas as a set of one or more text areas.
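The last three acts of this claim (shape suppression, the final density check, and output) can be sketched together as shown below. This is a sketch under stated assumptions rather than the claimed implementation: `is_text_edge` stands in for whatever shape classifier is used, areas are hypothetical `(top, left, height, width)` tuples with positive size, and `min_density` plays the role of the third threshold value.

```python
def suppress_and_filter(edge_map, areas, is_text_edge, min_density):
    """Remove unrecognized vertical edges, then drop areas that became sparse.

    Returns (filtered_map, kept_areas). The input edge map is not modified.
    """
    filtered = [row[:] for row in edge_map]
    kept = []
    for top, left, height, width in areas:
        remaining = 0
        for r in range(top, top + height):
            for c in range(left, left + width):
                if filtered[r][c]:
                    if is_text_edge(r, c):
                        remaining += 1
                    else:
                        filtered[r][c] = 0  # shape suppression
        # Keep the area only if its density is not below the threshold.
        if remaining / (height * width) >= min_density:
            kept.append((top, left, height, width))
    return filtered, kept
```

Passing the classifier in as a callable keeps the sketch independent of how edges are actually matched against known shapes.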
48. A method comprising:
receiving an identification of edges in a video image;
comparing the edges to a plurality of known shapes; and
outputting an identification of edges which are recognized as matching one of the plurality of known shapes, further comprising identifying, as areas of text, areas of the video image including the edges which are recognized as matching one of the plurality of known shapes.
52. One or more computer-readable memories containing a computer program that is executable by a processor to perform acts of:
identifying edges in a video image;
identifying, as candidate areas of text, areas of a video image including the edges which are recognized as matching one of a plurality of known shapes; and
removing, from the candidate text areas, edges which are not recognized as characteristic of one or more of the plurality of known shapes.
identifying, based on the identified edges, candidate text areas which may contain one or more of the plurality of known shapes.
54. One or more computer-readable memories as recited in claim 52, wherein the computer program is executable by a processor to further perform acts of:
identifying, based on the identified edges, candidate text areas which may contain one or more of the plurality of known shapes, wherein identifying edges comprises identifying both horizontal edges and vertical edges in the image.
55. One or more computer-readable memories as recited in claim 52, wherein the computer program is executable by a processor to further perform acts of:
determining an edge density of portions of a vertical edge map including at least some of the identified edges;
identifying, based on the edge densities of the portions, one or more areas of the image which may contain one or more of the plurality of known shapes; and
using a horizontal edge map including other of the identified edges to both verify and adjust the identified areas.
56. One or more computer-readable memories as recited in claim 52, wherein the computer program is executable by a processor to further perform acts of removing a candidate text area with a low edge density from being a candidate text area.
57. One or more computer-readable memories as recited in claim 52, wherein the computer program is executable by a processor to further perform acts of identifying, based on the identified edges, candidate text areas which may contain shapes comprising one or more of:
- letters, numbers, symbols, punctuation marks, and ideograms.
58. One or more computer-readable memories as recited in claim 52, wherein the computer program is executable by a processor to further perform acts of identifying, based on the identified edges, candidate text areas which may contain shapes comprising English language alphanumerics and punctuation.
59. One or more computer-readable memories as recited in claim 52, wherein the removing is performed based on whether edges are recognized as being characteristic of a shape by a Vector Quantization (VQ)-based Bayesian classifier.
60. One or more computer-readable memories as recited in claim 52, wherein the computer program is executable by a processor to further perform acts of:
identifying, based on the identified edges, candidate text areas using a differential filter to generate a vertical edge map identifying vertical edge locations in the image; and
using a non-maxima suppression filter to generate a thinned edge map identifying the vertical edges in the image.
61. One or more computer-readable memories as recited in claim 52, wherein the computer program is executable by a processor to further perform acts of:
identifying, based on the identified edges, candidate text areas using a Sobel filter to generate a vertical edge map identifying vertical edge locations in the image.
Specification