Scheme for extraction and recognition of telop characters from video data
Abstract
A scheme for detecting telop character displaying frames in video image which is capable of suppressing erroneous detection of frames without telop characters due to instability of image features is disclosed. In this scheme, each input frame constituting the video data is entered, and whether each input frame is a telop character displaying frame in which telop characters are displayed or not is judged, according to edge pairs detected from each input frame by detecting each two adjacent edge pixels for which intensity gradient directions are opposite on some scanning line used in judging an intensity gradient direction at each edge pixel and for which an intensity difference between said two adjacent edge pixels is within a prescribed range as one edge pair, edge pixels being pixels at which an intensity value locally changes by at least a prescribed amount with respect to a neighboring pixel among a plurality of pixels constituting each input frame.
1. A method for processing video data, comprising the steps of:
(a) entering each input frame constituting the video data; and
(b) judging whether each input frame entered at the step (a) is a telop character displaying frame in which telop characters are displayed or not, according to edge pairs detected from each input frame by detecting each two adjacent edge pixels for which intensity gradient directions are opposite on a scanning line used in judging an intensity gradient direction at each pixel and for which an intensity difference between said two adjacent edge pixels is within a prescribed range as one edge pair wherein an edge pair is defined as each two neighboring edges in which the gradient directions of the edges are opposite and the intensity change between the edges is small, edge pixels being pixels at which an intensity value locally changes by at least a prescribed amount with respect to a neighboring pixel among a plurality of pixels constituting each input frame.
2. The method of claim 1, wherein the step (b) further comprises the sub-steps of:
(b1) dividing each input frame into a prescribed number of sub-images without mutual overlap;
(b2) detecting edge pixels at which an intensity value locally changes by at least a prescribed amount with respect to a neighboring pixel among a plurality of pixels constituting each sub-image by applying a prescribed edge detection scheme to each sub-image, so as to obtain an edge image formed by detected edge pixels of each sub-image;
(b3) scanning the edge image and an original gray-scale image of each sub-image along each scanning line in each one of a plurality of prescribed scanning directions, and judging an intensity gradient direction of each edge pixel existing on each scanning line in each scanning direction;
(b4) counting a number of edge pairs in each sub-image along each scanning line in each scanning direction, by counting each two adjacent edge pixels on each scanning line in each scanning direction for which the intensity gradient directions are opposite and for which an intensity difference between said two adjacent edge pixels is within a prescribed range as one edge pair;
(b5) calculating a total number of edge pairs obtained for all the scanning directions in each sub-image;
(b6) detecting each input frame in which a prescribed number for each scanning direction or more of sub-images with the total number of edge pairs greater than a prescribed number are adjacently existing, as a candidate of a telop character displaying frame; and
(b7) repeating the sub-steps (b1) to (b6) for all input frames.
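The edge-pair test of sub-steps (b2)-(b4) can be sketched as follows. This is a minimal Python illustration, not the patented implementation: the thresholds `edge_thresh` and `pair_range` stand in for the claim's "prescribed" values, and only horizontal scanning lines are shown.

```python
def count_edge_pairs(gray, edge_thresh=30, pair_range=40):
    """Count edge pairs on each horizontal scanning line of a gray-scale
    sub-image (a list of pixel-intensity rows).  An edge is a step of at
    least `edge_thresh` between neighboring pixels; an edge pair is two
    consecutive edges with opposite gradient directions whose enclosed
    intensity is nearly constant (within `pair_range`)."""
    pairs = 0
    for row in gray:
        # gradient between each pixel and its right neighbour
        grads = [row[i + 1] - row[i] for i in range(len(row) - 1)]
        # edge positions and their gradient directions (+1 rising, -1 falling)
        edges = [(i, 1 if g > 0 else -1) for i, g in enumerate(grads)
                 if abs(g) >= edge_thresh]
        for (i1, s1), (i2, s2) in zip(edges, edges[1:]):
            # opposite directions and near-constant intensity between them
            if s1 != s2 and abs(row[i1 + 1] - row[i2]) <= pair_range:
                pairs += 1
    return pairs
```

A bright or dark character stroke crossed by a scanning line yields one edge pair, while flat background yields none; per the claim, frames where many adjacent sub-images exceed a pair-count threshold become telop-frame candidates.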
3. The method of claim 1, further comprising the steps of:
(c) deleting all but one of a plurality of frames in which identical telop characters are displayed, that exist among the telop character displaying frames detected at the step (b); and
(d) storing all those telop character displaying frames that are left undeleted by the step (c).
4. The method of claim 3, wherein the step (c) comprises the sub-steps of:
(c1) entering two telop character displaying frames that are adjacent in time, and dividing each edge image formed by edge pixels in each input telop character displaying frame into a prescribed number of sub-images without mutual overlap, the edge pixels being pixels at which an intensity value locally changes by at least a prescribed amount with respect to a neighboring pixel among a plurality of pixels constituting each input telop character displaying frame;
(c2) counting a total number of AND-edge pixels in each sub-image which are the edge pixels at corresponding positions in both input telop character displaying frames;
(c3) counting a total number of edge pixels in each sub-image of one input telop character displaying frame which is the earlier one of the two telop character displaying frames entered at the step (c1); and
(c4) deleting said one input telop character displaying frame when a prescribed number for each scanning direction or more of sub-images with a ratio of the total number of AND-edge pixels with respect to the total number of edge pixels not within a prescribed range are adjacently existing, as an over-detected frame.
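The over-detection test of sub-steps (c2)-(c4) reduces to comparing AND-edge pixels against the earlier frame's edge pixels, per sub-image. A sketch, assuming binary edge images given as lists of 0/1 rows; the function name is illustrative:

```python
def and_edge_ratio(edges_earlier, edges_later):
    """Ratio used in sub-step (c4): the number of AND-edge pixels
    (edge pixels at corresponding positions in both frames) divided by
    the number of edge pixels in the earlier frame."""
    and_count = sum(a & b
                    for row_a, row_b in zip(edges_earlier, edges_later)
                    for a, b in zip(row_a, row_b))
    total = sum(sum(row) for row in edges_earlier)
    return and_count / total if total else 0.0
```

Per the claim, when enough adjacent sub-images have this ratio outside a prescribed range, the earlier frame is deleted as over-detected.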
5. The method of claim 1, wherein the step (b) detects the edge pairs as image feature points that appear characteristically at a character portion and judges the telop character displaying frame from a spatial distribution of the image feature points, and the method further comprises the steps of:
(e) calculating a moving distance of telop characters as a whole by matching the image feature points in one frame image in which the appearance of the telop characters is detected at the step (b) and another frame image which is acquired at the step (a) consecutively after said one frame image, superposing said one frame image and said another frame image according to a calculated moving distance, evaluating a rate of overlapping of the image feature points in an overlapping region, comparing the rate of overlapping with a prescribed threshold, and judging that a rolling telop does not exist when the rate of overlapping is lower than the prescribed threshold or that a rolling telop exists when the rate of overlapping is higher than the prescribed threshold;
(f) calculating a local displacement of each character portion constituting the telop characters by locally matching the image feature points detected at the step (b) in said one frame image and said another frame image, after converting coordinates of either one image such that telop characters commonly displayed in both images spatially overlap by using the moving distance of the telop calculated at the step (e); and
(g) superposing telop characters commonly displayed in both images accurately by correcting the local displacement calculated at the step (f) by applying appropriate geometric conversion to each character portion from which the local displacement is calculated.
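Step (e)'s rolling-telop decision boils down to shifting one frame's feature points by the estimated motion and measuring how many land on the other frame's points. A sketch under the assumption that feature points are (x, y) tuples; the threshold value is illustrative, not the patent's prescribed one:

```python
def overlap_rate(points_a, points_b, shift):
    """Rate at which feature points of the first frame, translated by
    the estimated telop motion `shift` (dx, dy), coincide with feature
    points of the second frame."""
    dx, dy = shift
    targets = set(points_b)
    moved = [(x + dx, y + dy) for x, y in points_a]
    hits = sum(p in targets for p in moved)
    return hits / len(moved) if moved else 0.0

def has_rolling_telop(points_a, points_b, shift, threshold=0.6):
    # high overlap after the shift means the telop moved rigidly
    return overlap_rate(points_a, points_b, shift) > threshold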
6. The method of claim 5, wherein the step (e) further comprises the sub-steps of:
(e1) producing reference tables registering position information of all the image feature points on each one of lines in a horizontal direction or a vertical direction in said one frame image, using line coordinate values in the vertical direction as indices with respect to lines in the horizontal direction or line coordinate values in the horizontal direction as indices with respect to lines in the vertical direction;
(e2) calculating differences between positions of all the image feature points on each line in said another frame image and positions of all the image feature points registered for an identical line coordinate value in the reference tables; and
(e3) calculating a histogram of all the differences calculated at the step (e2) for all lines in the horizontal direction or the vertical direction, and obtaining a peak difference of the histogram as the moving distance of the telop characters as a whole in the horizontal direction or the vertical direction.
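Sub-steps (e1)-(e3) are essentially a one-dimensional vote: tabulate feature-point positions per line, histogram the positional differences between frames, and take the histogram peak. A sketch for horizontal motion (the vertical case is symmetric); function and variable names are illustrative:

```python
from collections import Counter, defaultdict

def estimate_telop_shift(points_a, points_b):
    """Estimate the horizontal moving distance of a telop between two
    frames.  Feature points are (x, y) pairs; a reference table maps
    each line y to the x positions of its feature points in the first
    frame, differences against the second frame's points on identical
    lines are histogrammed, and the peak difference is returned."""
    table = defaultdict(list)          # reference table: y -> [x, ...]
    for x, y in points_a:
        table[y].append(x)
    hist = Counter()
    for x2, y in points_b:
        for x1 in table.get(y, ()):
            hist[x2 - x1] += 1         # vote for this candidate displacement
    return max(hist, key=hist.get) if hist else 0
```

Because a rigidly moving telop contributes many votes to the true displacement while background points scatter, the peak is robust to clutter.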
7. The method of claim 5, wherein the step (f) further comprises the sub-steps of:
(f1) producing reference tables registering position information of all the image feature points on each one of lines in a horizontal direction or a vertical direction in each small block subdividing said one frame image, using line coordinate values in the vertical direction as indices with respect to lines in the horizontal direction or line coordinate values in the horizontal direction as indices with respect to lines in the vertical direction;
(f2) calculating differences between positions of all the image feature points on each line in said another frame image and positions of all the image feature points registered for an identical line coordinate value in the reference tables produced at the step (f1), for each small block; and
(f3) calculating a histogram of all the differences calculated at the step (f2) for all lines in the horizontal direction or the vertical direction within each small block, and obtaining a peak difference of the histogram as the local displacement of each character portion in the horizontal direction or the vertical direction within each small block.
8. The method of claim 5, further comprising the step of:
(h) estimating an image portion with a high probability for having characters appearing therein, before the step (e), such that the steps (e) and (f) are carried out only with respect to said image portion with a high probability of having characters appearing therein within each frame image.
9. The method of claim 1, further comprising the steps of:
(aa) entering and storing each telop character displaying frame image obtained from a plurality of frame images in color video data, and a plurality of color images in which identical characters as said each telop character displaying frame image are displayed among those frame images which are immediately successive to said each telop character displaying frame image in the color video data;
(bb) generating an average color image in which each pixel has intensity, saturation and hue values that are averages of the intensity, saturation and hue values of pixels at corresponding positions in said plurality of color images entered at the step (aa);
(cc) forming first connected regions each comprising a plurality of pixels which are adjacent in an image space and which have similar intensity values in the average color image produced at the step (bb);
(dd) forming second connected regions each comprising a plurality of pixels which are adjacent in the image space and which have similar saturation values in each first connected region;
(ee) forming third connected regions each comprising a plurality of pixels which are adjacent in the image space and which have similar hue values in each second connected region;
(ff) removing those third connected regions which do not satisfy character region criteria; and
(gg) storing those third connected regions which are remaining after the step (ff) as extracted character region images.
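Steps (cc)-(ee) form connected regions hierarchically: first by intensity, then by saturation within each intensity region, then by hue within each saturation region. A plain flood-fill sketch, assuming pixels are given as a dict from (x, y) to an (intensity, saturation, hue) triple; the tolerances are illustrative stand-ins for the claim's notion of "similar" values:

```python
def connected_regions(pixels, region, channel, tol):
    """Split `region` (an iterable of (x, y) coords) into 4-connected
    subregions whose `channel` values stay within `tol` of a seed."""
    remaining, out = set(region), []
    while remaining:
        seed = next(iter(remaining))
        base = pixels[seed][channel]
        comp, stack = set(), [seed]
        while stack:
            p = stack.pop()
            if p not in remaining or abs(pixels[p][channel] - base) > tol:
                continue
            remaining.discard(p)
            comp.add(p)
            x, y = p
            stack.extend([(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)])
        out.append(comp)
    return out

def segment(pixels, tol=(20, 20, 10)):
    """Hierarchy of steps (cc)-(ee): intensity regions, then saturation
    regions inside each, then hue regions inside those."""
    first = connected_regions(pixels, pixels.keys(), 0, tol[0])
    third = []
    for r1 in first:
        for r2 in connected_regions(pixels, r1, 1, tol[1]):
            third.extend(connected_regions(pixels, r2, 2, tol[2]))
    return third
```

The resulting third-level regions would then be filtered by character-region criteria (step (ff)) before storage.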
10. The method of claim 9, wherein the step (cc) further comprises the sub-steps of:
(cc1) binarizing intensity values on each scanning horizontal line in the average color image, and extracting provisional character regions by synthesizing results of binarization on all scanning horizontal lines;
(cc2) carrying out a labelling processing for assigning serial numbers to the provisional character regions obtained at the step (cc1) as labels so as to obtain a label image;
(cc3) selecting character region pixels from the provisional character regions by binarizing intensity distribution in a vertical direction within those provisional character regions which are labelled by identical label in the label image obtained by the step (cc2).
11. The method of claim 10, wherein the step (cc1) further comprises the sub-steps of:
(cc11) referring to intensity distribution on one scanning horizontal line in the average color image, and extracting a first connected pixel region as a region in which an intensity value is locally higher than its surrounding portion by at least a first prescribed value on said one scanning horizontal line;
(cc12) detecting the first connected pixel region extracted at the step (cc11) as a provisionally high intensity character region when absolute values of intensity gradients in a horizontal direction at left and right ends of the first connected pixel region on said one scanning horizontal line are both greater than a second prescribed value;
(cc13) checking intensity distribution on said one scanning horizontal line, and extracting a second connected pixel region as a region in which an intensity value is locally lower than its surrounding portion by at least a third prescribed value on said one scanning horizontal line; and
(cc14) detecting the second connected pixel region extracted at the step (cc13) as a provisionally low intensity character region when absolute values of intensity gradients in the horizontal direction at left and right ends of the second connected pixel region on said one scanning horizontal line are both greater than a fourth prescribed value.
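Sub-steps (cc11)-(cc14) scan one horizontal line for runs bounded by a steep step up followed by a steep step down (a provisionally high intensity character region) or the reverse (a provisionally low intensity one). A sketch; `level` and `grad` are illustrative stand-ins for the claim's first/second (third/fourth) prescribed values:

```python
def character_runs(row, level=40, grad=30):
    """Return (start, end, kind) runs on one scanning line.  A 'high'
    run is entered by an intensity step of at least +level and left by a
    step of at least -level, accepted only when both boundary gradients
    exceed `grad` in absolute value; 'low' runs are the mirror case."""
    runs = []
    for sign, kind in ((1, 'high'), (-1, 'low')):
        start = None
        for i in range(1, len(row)):
            step = row[i] - row[i - 1]
            if start is None and sign * step >= level:
                start = i                       # steep step into the run
            elif start is not None and -sign * step >= level:
                # steep step out; check the gradients at both ends
                if abs(row[start] - row[start - 1]) > grad and abs(step) > grad:
                    runs.append((start, i - 1, kind))
                start = None
    return runs
```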
12. The method of claim 10, wherein the step (cc3) further comprises the sub-steps of:
(cc31) calculating an average intensity value on scanning horizontal line in a range in which pixels within a prescribed pixel width from left and right ends of each identically labelled region are excluded, for each scanning horizontal line in each identically labelled region in the label image;
(cc32) checking distribution in the vertical direction of the average intensity values on scanning horizontal line obtained for all scanning horizontal lines at the step (cc31) and extracting a first connected region of a plurality of scanning horizontal lines in which the average intensity value on scanning horizontal line is locally higher than its surrounding portion by at least a first prescribed value within each identically labelled region in the label image;
(cc33) determining the first connected region extracted at the step (cc32) as a high intensity character region when intensity gradients in the vertical direction of the average intensity values on scanning horizontal line obtained at the step (cc31) at top and bottom ends of the first connected region are both greater than a second prescribed value within each identically labelled region in the label image;
(cc34) checking distribution in the vertical direction of the average intensity values on scanning horizontal line obtained for all scanning horizontal lines at the step (cc31) and extracting a second connected region of a plurality of scanning horizontal lines in which the average intensity value on scanning horizontal line is locally lower than its surrounding portion by at least a third prescribed value within each identically labelled region in the label image;
(cc35) determining the second connected region extracted at the step (cc34) as a low intensity character region when intensity gradients in the vertical direction of the average intensity values on scanning horizontal line obtained at the step (cc31) at top and bottom ends of the second connected region are both greater than a fourth prescribed value within each identically labelled region in the label image.
13. The method of claim 1, further comprising the steps of:
(aaa) dividing character patterns that are binarized into black pixels and white pixels into divided regions each containing one character;
(bbb) normalizing position and size of a character in each divided region;
(ccc) dividing a character pattern of each normalized character into mesh regions;
(ddd) counting a run-length of white pixels which are adjacent in each direction starting from a white pixel existing in each divided mesh region, for a plurality of prescribed directions;
(eee) calculating a direction contributivity of each direction as a value obtained by averaging the run-length in each direction by an accumulated value of all the run-lengths for all the prescribed directions as counted by the step (ddd), for each divided mesh region;
(fff) calculating a feature value of each divided mesh region by accumulating the direction contributivity of each direction for all white pixels in each divided mesh region and averaging an accumulated value of the direction contributivity of each direction by a number of white pixels within each mesh region; and
(ggg) carrying out a processing for recognizing the character pattern of each normalized character using the feature values obtained for all the mesh regions at the step (fff).
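The direction contributivity of steps (ddd)-(fff) can be sketched as below, using the four axis directions for concreteness (the claim only requires a plurality of prescribed directions); the mesh region is a 0/1 grid with 1 for white pixels:

```python
def direction_contributivity(mesh, directions=((1, 0), (0, 1), (-1, 0), (0, -1))):
    """Per-mesh feature vector: for every white pixel, count the run of
    white pixels in each direction, normalize each run by the total over
    all directions, and average the contributions over the white pixels
    of the mesh."""
    h, w = len(mesh), len(mesh[0])

    def run(x, y, dx, dy):
        # length of the white-pixel run starting next to (x, y)
        n, x, y = 0, x + dx, y + dy
        while 0 <= x < w and 0 <= y < h and mesh[y][x] == 1:
            n, x, y = n + 1, x + dx, y + dy
        return n

    feats, whites = [0.0] * len(directions), 0
    for y in range(h):
        for x in range(w):
            if mesh[y][x] != 1:
                continue
            whites += 1
            runs = [run(x, y, dx, dy) for dx, dy in directions]
            total = sum(runs)
            if total:                 # isolated white pixels contribute 0
                for k, r in enumerate(runs):
                    feats[k] += r / total
    return [f / whites for f in feats] if whites else feats
```

Claim 14's variant computes the same statistic twice, once over black pixels and once over white pixels, and concatenates the two feature sets.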
14. The method of claim 1, further comprising the steps of:
(aaa) dividing character patterns that are binarized into black pixels and white pixels into divided regions each containing one character;
(bbb) normalizing position and size of a character in each divided region;
(ccc) dividing a character pattern of each normalized character into mesh regions;
(ddd) counting a run-length of black pixels which are adjacent in each direction starting from a black pixel existing in each divided mesh region, for a plurality of prescribed directions;
(eee) calculating a black pixel direction contributivity of each direction as a value obtained by averaging the run-length of black pixels in each direction by an accumulated value of all the run-lengths of black pixels for all the prescribed directions as counted by the step (ddd), for each divided mesh region;
(fff) counting a run-length of white pixels which are adjacent in each direction starting from a white pixel existing in each divided mesh region, for a plurality of prescribed directions;
(ggg) calculating a white pixel direction contributivity of each direction as a value obtained by averaging the run-length of white pixels in each direction by an accumulated value of all the run-lengths of white pixels for all the prescribed directions as counted by the step (fff), for each divided mesh region;
(hhh) calculating a black pixel feature value of each divided mesh region by accumulating the black pixel direction contributivity of each direction for all black pixels in each divided mesh region and averaging an accumulated value of the black pixel direction contributivity of each direction by a number of black pixels within each mesh region; and
(iii) calculating a white pixel feature value of each divided mesh region by accumulating the white pixel direction contributivity of each direction for all white pixels in each divided mesh region and averaging an accumulated value of the white pixel direction contributivity of each direction by a number of white pixels within each mesh region; and
(jjj) carrying out a processing for recognizing the character pattern of each normalized character using the black pixel feature values obtained for all the mesh regions at the step (hhh) and the white pixel feature values obtained for all the mesh regions at the step (iii).
15. An apparatus for processing video data, comprising:
(a) a unit for entering each input frame constituting the video data; and
(b) a unit for judging whether each input frame entered at the unit (a) is a telop character displaying frame in which telop characters are displayed or not, according to edge pairs detected from each input frame by detecting each two adjacent edge pixels for which intensity gradient directions are opposite on a scanning line used in judging an intensity gradient direction at each edge pixel and for which an intensity difference between said two adjacent edge pixels is within a prescribed range as one edge pair wherein an edge pair is defined as each two neighboring edges in which the gradient directions of the edges are opposite and the intensity change between the edges is small, edge pixels being pixels at which an intensity value locally changes by at least a prescribed amount with respect to a neighboring pixel among a plurality of pixels constituting each input frame.
16. The apparatus of claim 15, wherein the unit (b) further comprises:
(b1) a unit for dividing each input frame into a prescribed number of sub-images without mutual overlap;
(b2) a unit for detecting edge pixels at which an intensity value locally changes by at least a prescribed amount with respect to a neighboring pixel among a plurality of pixels constituting each sub-image by applying a prescribed edge detection scheme to each sub-image, so as to obtain an edge image formed by detected edge pixels of each sub-image;
(b3) a unit for scanning the edge image and an original gray-scale image of each sub-image along each scanning line in each one of a plurality of prescribed scanning directions, and judging an intensity gradient direction of each edge pixel existing on each scanning line in each scanning direction;
(b4) a unit for counting a number of edge pairs in each sub-image along each scanning line in each scanning direction, by counting each two adjacent edge pixels on each scanning line in each scanning direction for which the intensity gradient directions are opposite and for which an intensity difference between said two adjacent edge pixels is within a prescribed range as one edge pair;
(b5) a unit for calculating a total number of edge pairs obtained for all the scanning directions in each sub-image;
(b6) a unit for detecting each input frame in which a prescribed number for each scanning direction or more of sub-images with the total number of edge pairs greater than a prescribed number are adjacently existing, as a candidate of a telop character displaying frame; and
(b7) a unit for repeating operations of the units (b1) to (b6) for all input frames.
17. The apparatus of claim 15, further comprising:
(c) a unit for deleting all but one of a plurality of frames in which identical telop characters are displayed, that exist among the telop character displaying frames detected at the unit (b); and
(d) a unit for storing all those telop character displaying frames that are left undeleted by the unit (c).
18. The apparatus of claim 17, wherein the unit (c) further comprises:
(c1) a unit for entering two telop character displaying frames that are adjacent in time, and dividing each edge image formed by edge pixels in each input telop character displaying frame into a prescribed number of sub-images without mutual overlap, the edge pixels being pixels at which an intensity value locally changes by at least a prescribed amount with respect to a neighboring pixel among a plurality of pixels constituting each input telop character displaying frame;
(c2) a unit for counting a total number of AND-edge pixels in each sub-image which are the edge pixels at corresponding positions in both input telop character displaying frames;
(c3) a unit for counting a total number of edge pixels in each sub-image of one input telop character displaying frame which is the earlier one of the two telop character displaying frames entered at the unit (c1); and
(c4) a unit for deleting said one input telop character displaying frame when a prescribed number for each scanning direction or more of sub-images with a ratio of the total number of AND-edge pixels with respect to the total number of edge pixels not within a prescribed range are adjacently existing, as an over-detected frame.
19. The apparatus of claim 15, wherein the unit (b) detects the edge pairs as image feature points that appear characteristically at a character portion and judges the telop character displaying frame from a spatial distribution of the image feature points, and the apparatus further comprises:
(e) a unit for calculating a moving distance of telop characters as a whole by matching the image feature points in one frame image in which the appearance of the telop characters is detected at the unit (b) and another frame image which is entered at the unit (a) consecutively after said one frame image, superposing said one frame image and said another frame image according to a calculated moving distance, evaluating a rate of overlapping of the image feature points in an overlapping region, comparing the rate of overlapping with a prescribed threshold, and judging that a rolling telop does not exist when the rate of overlapping is lower than the prescribed threshold or that a rolling telop exists when the rate of overlapping is higher than the prescribed threshold;
(f) a unit for calculating a local displacement of each character portion constituting the telop characters by locally matching the image feature points detected at the unit (b) in said one frame image and said another frame image, after converting coordinates of either one image such that telop characters commonly displayed in both images spatially overlap by using the moving distance of the telop calculated at the unit (e); and
(g) a unit for superposing telop characters commonly displayed in both images accurately by correcting the local displacement calculated at the unit (f) by applying appropriate geometric conversion to each character portion from which the local displacement is calculated.
20. The apparatus of claim 19, wherein the unit (e) further comprises:
(e1) a unit for producing reference tables registering position information of all the image feature points on each one of lines in a horizontal direction or a vertical direction in said one frame image, using line coordinate values in the vertical direction as indices with respect to lines in the horizontal direction or line coordinate values in the horizontal direction as indices with respect to lines in the vertical direction;
(e2) a unit for calculating differences between positions of all the image feature points on each line in said another frame image and positions of all the image feature points registered for an identical line coordinate value in the reference tables; and
(e3) a unit for calculating a histogram of all the differences calculated at the unit (e2) for all lines in the horizontal direction or the vertical direction, and obtaining a peak difference of the histogram as the moving distance of the telop characters as a whole in the horizontal direction or the vertical direction.
21. The apparatus of claim 19, wherein the unit (f) further comprises:
(f1) a unit for producing reference tables registering position information of all the image feature points on each one of lines in a horizontal direction or a vertical direction in each small block subdividing said one frame image, using line coordinate values in the vertical direction as indices with respect to lines in the horizontal direction or line coordinate values in the horizontal direction as indices with respect to lines in the vertical direction;
(f2) a unit for calculating differences between positions of all the image feature points on each line in said another frame image and positions of all the image feature points registered for an identical line coordinate value in the reference tables produced at the unit (f1), for each small block; and
(f3) a unit for calculating a histogram of all the differences calculated at the unit (f2) for all lines in the horizontal direction or the vertical direction within each small block, and obtaining a peak difference of the histogram as the local displacement of each character portion in the horizontal direction or the vertical direction within each small block.
22. The apparatus of claim 19, further comprising:
(h) a unit for estimating an image portion with a high probability for having characters appearing therein, before an operation of the unit (e), such that operations of the units (e) and (f) are carried out only with respect to said image portion with a high probability of having characters appearing therein within each frame image.
23. The apparatus of claim 15, further comprising:
(aa) a unit for entering and storing each telop character displaying frame image obtained from a plurality of frame images in color video data, and a plurality of color images in which identical characters as said each telop character displaying frame image are displayed among those frame images which are immediately successive to said each telop character displaying frame image in the color video data;
(bb) a unit for generating an average color image in which each pixel has intensity, saturation and hue values that are averages of the intensity, saturation and hue values of pixels at corresponding positions in said plurality of color images entered at the unit (aa);
(cc) a unit for forming first connected regions each comprising a plurality of pixels which are adjacent in an image space and which have similar intensity values in the average color image produced at the unit (bb);
(dd) a unit for forming second connected regions each comprising a plurality of pixels which are adjacent in the image space and which have similar saturation values in each first connected region;
(ee) a unit for forming third connected regions each comprising a plurality of pixels which are adjacent in the image space and which have similar hue values in each second connected region;
(ff) a unit for removing those third connected regions which do not satisfy character region criteria; and
(gg) a unit for storing those third connected regions which are remaining after the unit (ff) as extracted character region images.
24. The apparatus of claim 23, wherein the unit (cc) further comprises:
(cc1) a unit for binarizing intensity values on each scanning horizontal line in the average color image, and extracting provisional character regions by synthesizing results of binarization on all scanning horizontal lines;
(cc2) a unit for carrying out a labelling processing for assigning serial numbers to the provisional character regions obtained at the unit (cc1) as labels so as to obtain a label image;
(cc3) a unit for selecting character region pixels from the provisional character regions by binarizing intensity distribution in a vertical direction within those provisional character regions which are labelled by identical label in the label image obtained by the unit (cc2).
25. The apparatus of claim 24, wherein the unit (cc1) further comprises:
(cc11) a unit for referring to intensity distribution on one scanning horizontal line in the average color image, and extracting a first connected pixel region as a region in which an intensity value is locally higher than its surrounding portion by at least a first prescribed value on said one scanning horizontal line;
(cc12) a unit for detecting the first connected pixel region extracted at the unit (cc11) as a provisionally high intensity character region when absolute values of intensity gradients in a horizontal direction at left and right ends of the first connected pixel region on said one scanning horizontal line are both greater than a second prescribed value;
(cc13) a unit for checking intensity distribution on said one scanning horizontal line, and extracting a second connected pixel region as a region in which an intensity value is locally lower than its surrounding portion by at least a third prescribed value on said one scanning horizontal line; and
(cc14) a unit for detecting the second connected pixel region extracted at the unit (cc13) as a provisionally low intensity character region when absolute values of intensity gradients in the horizontal direction at left and right ends of the second connected pixel region on said one scanning horizontal line are both greater than a fourth prescribed value.
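The scanline test of units (cc11) and (cc12) can be sketched as follows in Python. This is my own illustration: the thresholds `height` (the "first prescribed value") and `grad_min` (the "second prescribed value") are placeholders, and only the high intensity case is shown; the low intensity case of units (cc13)/(cc14) is the mirror image with the sign of the intensity step reversed.

```python
def provisional_regions(line, height=40, grad_min=30):
    """Find runs of pixels on one scanning horizontal line whose intensity
    is locally higher than the surrounding level by at least `height`, and
    keep a run as a provisionally high intensity character region only if
    the absolute horizontal gradients at its left and right ends both
    exceed `grad_min`. Thresholds are illustrative."""
    regions = []
    i, n = 0, len(line)
    while i < n:
        if i > 0 and line[i] - line[i - 1] >= height:  # candidate left end
            j = i
            # extend while the interior stays roughly flat
            while j + 1 < n and abs(line[j + 1] - line[j]) < height:
                j += 1
            if j + 1 < n and line[j] - line[j + 1] >= height:  # drops back down
                lgrad = abs(line[i] - line[i - 1])
                rgrad = abs(line[j + 1] - line[j])
                if lgrad > grad_min and rgrad > grad_min:
                    regions.append((i, j))
                i = j + 1
                continue
        i += 1
    return regions
```

Synthesizing the regions found on all scanning horizontal lines yields the provisional character regions of unit (cc1).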
-
-
26. The apparatus of claim 24, wherein the unit (cc3) further comprises:
-
(cc31) a unit for calculating an average intensity value on scanning horizontal line in a range in which pixels within a prescribed pixel width from left and right ends of each identically labelled region are excluded, for each scanning horizontal line in each identically labelled region in the label image;
(cc32) a unit for checking distribution in the vertical direction of the average intensity values on scanning horizontal line obtained for all scanning horizontal lines at the unit (cc31) and extracting a first connected region of a plurality of scanning horizontal lines in which the average intensity value on scanning horizontal line is locally higher than its surrounding portion by at least a first prescribed value within each identically labelled region in the label image;
(cc33) a unit for determining the first connected region extracted at the unit (cc32) as a high intensity character region when intensity gradients in the vertical direction of the average intensity values on scanning horizontal line obtained at the unit (cc31) at top and bottom ends of the first connected region are both greater than a second prescribed value within each identically labelled region in the label image;
(cc34) a unit for checking distribution in the vertical direction of the average intensity values on scanning horizontal line obtained for all scanning horizontal lines at the unit (cc31) and extracting a second connected region of a plurality of scanning horizontal lines in which the average intensity value on scanning horizontal line is locally lower than its surrounding portion by at least a third prescribed value within each identically labelled region in the label image;
(cc35) a unit for determining the second connected region extracted at the unit (cc34) as a low intensity character region when intensity gradients in the vertical direction of the average intensity values on scanning horizontal line obtained at the unit (cc31) at top and bottom ends of the second connected region are both greater than a fourth prescribed value within each identically labelled region in the label image.
-
-
27. The apparatus of claim 15, further comprising:
-
(aaa) a unit for dividing character patterns that are binarized into black pixels and white pixels into divided regions each containing one character;
(bbb) a unit for normalizing position and size of a character in each divided region;
(ccc) a unit for dividing a character pattern of each normalized character into mesh regions;
(ddd) a unit for counting a run-length of white pixels which are adjacent in each direction starting from a white pixel existing in each divided mesh region, for a plurality of prescribed directions;
(eee) a unit for calculating a direction contributivity of each direction as a value obtained by averaging the run-length in each direction by an accumulated value of all the run-lengths for all the prescribed directions as counted by the unit (ddd), for each divided mesh region;
(fff) a unit for calculating a feature value of each divided mesh region by accumulating the direction contributivity of each direction for all white pixels in each divided mesh region and averaging an accumulated value of the direction contributivity of each direction by a number of white pixels within each mesh region; and
(ggg) a unit for carrying out a processing for recognizing the character pattern of each normalized character using the feature values obtained for all the mesh regions at the unit (fff).
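The run-length feature of units (ddd) through (fff) can be sketched in Python as below. This is my own illustration of the claimed computation, not code from the specification: the four axis directions in `dirs` stand in for the claim's unspecified "plurality of prescribed directions", and the image is a binary grid with truthy values as white pixels.

```python
def direction_contributivity(img, x, y, dirs=((1, 0), (0, 1), (-1, 0), (0, -1))):
    """Direction contributivity at one white pixel: count the run of
    adjacent white pixels in each prescribed direction (unit (ddd)) and
    normalize each run-length by the accumulated run-length over all
    directions (unit (eee))."""
    h, w = len(img), len(img[0])
    runs = []
    for dx, dy in dirs:
        n, cx, cy = 0, x + dx, y + dy
        while 0 <= cx < w and 0 <= cy < h and img[cy][cx]:
            n += 1
            cx += dx
            cy += dy
        runs.append(n)
    total = sum(runs)
    return [r / total for r in runs] if total else [0.0] * len(dirs)

def mesh_feature(img):
    """Feature value of one mesh region (unit (fff)): contributivities
    accumulated over all white pixels and averaged by the white-pixel
    count."""
    white = [(x, y) for y, row in enumerate(img) for x, v in enumerate(row) if v]
    if not white:
        return [0.0] * 4
    acc = [0.0] * 4
    for x, y in white:
        for k, c in enumerate(direction_contributivity(img, x, y)):
            acc[k] += c
    return [a / len(white) for a in acc]
```

The black-and-white variant of claim 28 repeats the same computation once over black runs and once over white runs, and feeds both feature vectors to the recognizer.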
-
-
28. The apparatus of claim 15, further comprising:
-
(aaa) a unit for dividing character patterns that are binarized into black pixels and white pixels into divided regions each containing one character;
(bbb) a unit for normalizing position and size of a character in each divided region;
(ccc) a unit for dividing a character pattern of each normalized character into mesh regions;
(ddd) a unit for counting a run-length of black pixels which are adjacent in each direction starting from a black pixel existing in each divided mesh region, for a plurality of prescribed directions;
(eee) a unit for calculating a black pixel direction contributivity of each direction as a value obtained by averaging the run-length of black pixels in each direction by an accumulated value of all the run-lengths of black pixels for all the prescribed directions as counted by the unit (ddd), for each divided mesh region;
(fff) a unit for counting a run-length of white pixels which are adjacent in each direction starting from a white pixel existing in each divided mesh region, for a plurality of prescribed directions;
(ggg) a unit for calculating a white pixel direction contributivity of each direction as a value obtained by averaging the run-length of white pixels in each direction by an accumulated value of all the run-lengths of white pixels for all the prescribed directions as counted by the unit (fff), for each divided mesh region;
(hhh) a unit for calculating a black pixel feature value of each divided mesh region by accumulating the black pixel direction contributivity of each direction for all black pixels in each divided mesh region and averaging an accumulated value of the black pixel direction contributivity of each direction by a number of black pixels within each mesh region;
(iii) a unit for calculating a white pixel feature value of each divided mesh region by accumulating the white pixel direction contributivity of each direction for all white pixels in each divided mesh region and averaging an accumulated value of the white pixel direction contributivity of each direction by a number of white pixels within each mesh region; and
(jjj) a unit for carrying out a processing for recognizing the character pattern of each normalized character using the black pixel feature values obtained for all the mesh regions at the unit (hhh) and the white pixel feature values obtained for all the mesh regions at the unit (iii).
-
-
29. A computer readable recording medium recording a program for causing a computer to execute processing including:
-
(a) a process for entering each input frame constituting the video data; and
(b) a process for judging whether each input frame entered at the process (a) is a telop character displaying frame in which telop characters are displayed or not, according to edge pairs detected from each input frame by detecting each two adjacent edge pixels for which intensity gradient directions are opposite on a scanning line used in judging an intensity gradient direction at each edge pixel and for which an intensity difference between said two adjacent edge pixels is within a prescribed range as one edge pair wherein an edge pair is defined as each two neighboring edges in which the gradient directions of the edges are opposite and the intensity change between the edges is small, edge pixels being pixels at which an intensity value locally changes by at least a prescribed amount with respect to a neighboring pixel among a plurality of pixels constituting each input frame.
-
-
30. The recording medium of claim 29, wherein the process (b) further includes:
-
(b1) a process for dividing each input frame into a prescribed number of sub-images without mutual overlap;
(b2) a process for detecting edge pixels at which an intensity value locally changes by at least a prescribed amount with respect to a neighboring pixel among a plurality of pixels constituting each sub-image by applying a prescribed edge detection scheme to each sub-image, so as to obtain an edge image formed by detected edge pixels of each sub-image;
(b3) a process for scanning the edge image and an original gray-scale image of each sub-image along each scanning line in each one of a plurality of prescribed scanning directions, and judging an intensity gradient direction of each edge pixel existing on each scanning line in each scanning direction;
(b4) a process for counting a number of edge pairs in each sub-image along each scanning line in each scanning direction, by counting each two adjacent edge pixels on each scanning line in each scanning direction for which the intensity gradient directions are opposite and for which an intensity difference between said two adjacent edge pixels is within a prescribed range as one edge pair;
(b5) a process for calculating a total number of edge pairs obtained for all the scanning directions in each sub-image;
(b6) a process for detecting each input frame in which a prescribed number for each scanning direction or more of sub-images with the total number of edge pairs greater than a prescribed number are adjacently existing, as a candidate of a telop character displaying frame; and
(b7) a process for repeating operations of the processes (b1) to (b6) for all input frames.
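The edge pair counting of processes (b3) and (b4) can be sketched along a single scanning line as follows. This Python sketch is my own: `edge_thresh` and `pair_diff_max` stand in for the claim's prescribed amounts, and reading the "intensity difference between said two adjacent edge pixels" as the difference between the levels just inside each edge (i.e., across the flat interior of a character stroke) is an interpretive assumption.

```python
def count_edge_pairs(line, edge_thresh=30, pair_diff_max=20):
    """Count edge pairs on one scanning line of intensity values: two
    neighboring edges with opposite gradient directions whose inner
    intensity levels differ by at most pair_diff_max form one edge pair.
    Thresholds are illustrative, not from the claim."""
    # forward-difference gradient at each position
    grads = [line[i + 1] - line[i] for i in range(len(line) - 1)]
    # edge positions where intensity changes by at least edge_thresh
    edges = [(i, g) for i, g in enumerate(grads) if abs(g) >= edge_thresh]
    pairs = 0
    for (i, gi), (j, gj) in zip(edges, edges[1:]):
        if gi * gj < 0:  # opposite gradient directions
            # intensity just inside each edge (assumption: the claim's
            # "intensity difference" compares these interior levels)
            if abs(line[i + 1] - line[j]) <= pair_diff_max:
                pairs += 1
    return pairs
```

Processes (b1), (b5), and (b6) then tile each frame into sub-images, sum these counts over all scanning directions per sub-image, and detect a candidate frame when enough high-count sub-images are adjacent.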
-
-
31. The recording medium of claim 29, wherein the processing further includes:
-
(c) a process for deleting all but one of a plurality of frames in which identical telop characters are displayed, existing among telop character displaying frames detected at the process (b); and
(d) a process for storing all those telop character displaying frames that are left undeleted by the process (c).
-
-
32. The recording medium of claim 31, wherein the process (c) further includes:
-
(c1) a process for entering two telop character displaying frames that are adjacent in time, and dividing each edge image formed by edge pixels in each input telop character displaying frame into a prescribed number of sub-images without mutual overlap, the edge pixels being pixels at which an intensity value locally changes by at least a prescribed amount with respect to a neighboring pixel among a plurality of pixels constituting each input telop character displaying frame;
(c2) a process for counting a total number of AND-edge pixels in each sub-image which are the edge pixels at corresponding portions in both input telop character displaying frames;
(c3) a process for counting a total number of edge pixels in each sub-image of one input telop character displaying frame which is earlier one of the two telop character displaying frames entered at the process (c1); and
(c4) a process for deleting said one input telop character displaying frame as an over-detected frame when a prescribed number for each scanning direction or more of sub-images with a ratio of the total number of AND-edge pixels with respect to the total number of edge pixels not within a prescribed range are adjacently existing.
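The duplicate-frame test of processes (c1) through (c4) compares two edge images block by block. The Python sketch below is my own simplification: block count, similarity threshold, and the required number of qualifying sub-images are placeholder values, and the claim's additional requirement that the qualifying sub-images be adjacent is omitted for brevity.

```python
def is_duplicate_telop(prev_edges, curr_edges, blocks=2, sim_thresh=0.8, min_similar=3):
    """Split two same-sized binary edge images into blocks x blocks
    sub-images, compute in each the ratio of AND-edge pixels (edges present
    at corresponding positions in both frames) to the edge pixels of the
    earlier frame, and flag the earlier frame as over-detected when enough
    sub-images are similar. Thresholds are illustrative."""
    h, w = len(prev_edges), len(prev_edges[0])
    bh, bw = h // blocks, w // blocks
    similar = 0
    for by in range(blocks):
        for bx in range(blocks):
            and_cnt = edge_cnt = 0
            for y in range(by * bh, (by + 1) * bh):
                for x in range(bx * bw, (bx + 1) * bw):
                    if prev_edges[y][x]:
                        edge_cnt += 1
                        if curr_edges[y][x]:
                            and_cnt += 1  # edge pixel shared by both frames
            if edge_cnt and and_cnt / edge_cnt >= sim_thresh:
                similar += 1
    return similar >= min_similar
```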
-
-
33. The recording medium of claim 29, wherein the process (b) detects the edge pairs as image feature points that appear characteristically at a character portion and judges the telop character displaying frame from a spatial distribution of the image feature points, and the processing further includes:
-
(e) a process for calculating a moving distance of telop characters as a whole by matching the image feature points in one frame image in which the appearance of the telop characters is detected at the process (b) and another frame image which is entered at the process (a) consecutively after said one frame image, superposing said one frame image and said another frame image according to a calculated moving distance, evaluating a rate of overlapping of the image feature points in an overlapping region, comparing the rate of overlapping with a prescribed threshold, and judging that a rolling telop does not exist when the rate of overlapping is lower than the prescribed threshold or that a rolling telop exists when the rate of overlapping is higher than the prescribed threshold;
(f) a process for calculating a local displacement of each character portion constituting the telop characters by locally matching the image feature points detected at the process (b) in said one frame image and said another frame image, after converting coordinates of either one image such that telop characters commonly displayed in both images spatially overlap by using the moving distance of the telop calculated at the process (e); and
(g) a process for superposing telop characters commonly displayed in both images accurately by correcting the local displacement calculated at the process (f) by applying appropriate geometric conversion to each character portion from which the local displacement is calculated.
-
-
34. The recording medium of claim 33, wherein the process (e) further includes:
-
(e1) a process for producing reference tables registering position information of all the image feature points on each one of lines in a horizontal direction or a vertical direction in said one frame image, using line coordinate values in the vertical direction as indices with respect to lines in the horizontal direction or line coordinate values in the horizontal direction as indices with respect to lines in the vertical direction;
(e2) a process for calculating differences between positions of all the image feature points on each line in said another frame image and positions of all the image feature points registered for an identical line coordinate value in the reference tables; and
(e3) a process for calculating a histogram of all the differences calculated at the process (e2) for all lines in the horizontal direction or the vertical direction, and obtaining a peak difference of the histogram as the moving distance of the telop characters as a whole in the horizontal direction or the vertical direction.
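The table-and-histogram scheme of processes (e1) through (e3) can be sketched for the horizontal case as follows. This Python illustration is my own: feature points are `(x, y)` tuples, the reference table indexes frame A's points by line coordinate `y`, and the histogram peak of the x-differences is taken as the whole-telop moving distance.

```python
from collections import Counter, defaultdict

def telop_shift(points_a, points_b):
    """Estimate the horizontal shift of a telop between two frames from
    image feature points: build a reference table y -> [x positions] for
    frame A (process (e1)), accumulate x-differences against frame B's
    points on the same line (process (e2)), and return the histogram peak
    (process (e3))."""
    table = defaultdict(list)          # reference table for frame A
    for x, y in points_a:
        table[y].append(x)
    diffs = Counter()                  # histogram of candidate displacements
    for x, y in points_b:
        for xa in table.get(y, ()):
            diffs[x - xa] += 1
    return diffs.most_common(1)[0][0] if diffs else 0
```

Claim 35's local displacement repeats the same computation per small block, so the per-character correction of process (g) can reuse this routine on subdivided point sets.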
-
-
35. The recording medium of claim 33, wherein the process (f) further includes:
-
(f1) a process for producing reference tables registering position information of all the image feature points on each one of lines in a horizontal direction or a vertical direction in each small block subdividing said one frame image, using line coordinate values in the vertical direction as indices with respect to lines in the horizontal direction or line coordinate values in the horizontal direction as indices with respect to lines in the vertical direction;
(f2) a process for calculating differences between positions of all the image feature points on each line in said another frame image and positions of all the image feature points registered for an identical line coordinate value in the reference tables produced at the process (f1), for each small block; and
(f3) a process for calculating a histogram of all the differences calculated at the process (f2) for all lines in the horizontal direction or the vertical direction within each small block, and obtaining a peak difference of the histogram as the local displacement of each character portion in the horizontal direction or the vertical direction within each small block.
-
-
36. The recording medium of claim 33, wherein the processing further includes:
(h) a process for estimating an image portion with a high probability for having characters appearing therein, before an operation of the process (d), such that operations of the processes (e) and (f) are carried out only with respect to said image portion with a high probability of having characters appearing therein within each frame image.
-
37. The recording medium of claim 29, wherein the processing further includes:
-
(aa) a process for entering and storing each telop character displaying frame image obtained from a plurality of frame images in color video data, and a plurality of color images in which the same characters as in said each telop character displaying frame image are displayed among those frame images which are immediately successive to said each telop character displaying frame image in the color video data;
(bb) a process for generating an average color image in which each pixel has intensity, saturation and hue values that are averages of the intensity, saturation and hue values of pixels at corresponding positions in said plurality of color images entered at the process (aa);
(cc) a process for forming first connected regions each comprising a plurality of pixels which are adjacent in an image space and which have similar intensity values in the average color image produced at the process (bb);
(dd) a process for forming second connected regions each comprising a plurality of pixels which are adjacent in the image space and which have similar saturation values in each first connected region;
(ee) a process for forming third connected regions each comprising a plurality of pixels which are adjacent in the image space and which have similar hue values in each second connected region;
(ff) a process for removing those third connected regions which do not satisfy character region criteria; and
(gg) a process for storing those third connected regions which are remaining after the process (ff) as extracted character region images.
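The cascade of processes (cc) through (ee) forms connected regions by intensity, then splits each by saturation, then by hue. The Python sketch below is my own: pixels are a dict mapping `(x, y)` to an `(intensity, saturation, hue)` tuple, similarity is a per-channel tolerance (the tolerances are placeholders), and the character region criteria filter of process (ff) is omitted.

```python
def connected_regions(pixels, region, channel, tol):
    """Split one region (a set of (x, y) coordinates) into 4-connected
    sub-regions whose `channel` values differ by at most `tol` between
    adjacent pixels."""
    remaining, out = set(region), []
    while remaining:
        seed = remaining.pop()
        comp, stack = {seed}, [seed]
        while stack:
            x, y = stack.pop()
            for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if nb in remaining and abs(pixels[nb][channel] - pixels[(x, y)][channel]) <= tol:
                    remaining.remove(nb)
                    comp.add(nb)
                    stack.append(nb)
        out.append(comp)
    return out

def extract_character_regions(pixels, tol=(10, 10, 10)):
    """Cascade of processes (cc)-(ee): first connected regions by
    intensity, each split by saturation, each split again by hue.
    Tolerances are illustrative; the criteria filter of process (ff)
    is omitted."""
    first = connected_regions(pixels, pixels.keys(), 0, tol[0])
    third = []
    for r1 in first:
        for r2 in connected_regions(pixels, r1, 1, tol[1]):
            third.extend(connected_regions(pixels, r2, 2, tol[2]))
    return third
```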
-
-
38. The recording medium of claim 37, wherein the process (cc) further includes:
-
(cc1) a process for binarizing intensity values on each scanning horizontal line in the average color image, and extracting provisional character regions by synthesizing results of binarization on all scanning horizontal lines;
(cc2) a process for carrying out a labelling processing for assigning serial numbers to the provisional character regions obtained at the process (cc1) as labels so as to obtain a label image;
(cc3) a process for selecting character region pixels from the provisional character regions by binarizing intensity distribution in a vertical direction within those provisional character regions which are labelled by identical label in the label image obtained by the process (cc2).
-
-
39. The recording medium of claim 38, wherein the process (cc1) further includes:
-
(cc11) a process for referring to intensity distribution on one scanning horizontal line in the average color image, and extracting a first connected pixel region as a region in which an intensity value is locally higher than its surrounding portion by at least a first prescribed value on said one scanning horizontal line;
(cc12) a process for detecting the first connected pixel region extracted at the process (cc11) as a provisionally high intensity character region when absolute values of intensity gradients in a horizontal direction at left and right ends of the first connected pixel region on said one scanning horizontal line are both greater than a second prescribed value;
(cc13) a process for checking intensity distribution on said one scanning horizontal line, and extracting a second connected pixel region as a region in which an intensity value is locally lower than its surrounding portion by at least a third prescribed value on said one scanning horizontal line; and
(cc14) a process for detecting the second connected pixel region extracted at the process (cc13) as a provisionally low intensity character region when absolute values of intensity gradients in the horizontal direction at left and right ends of the second connected pixel region on said one scanning horizontal line are both greater than a fourth prescribed value.
-
-
40. The recording medium of claim 38, wherein the process (cc3) further includes:
-
(cc31) a process for calculating an average intensity value on scanning horizontal line in a range in which pixels within a prescribed pixel width from left and right ends of each identically labelled region are excluded, for each scanning horizontal line in each identically labelled region in the label image;
(cc32) a process for checking distribution in the vertical direction of the average intensity values on scanning horizontal line obtained for all scanning horizontal lines at the process (cc31) and extracting a first connected region of a plurality of scanning horizontal lines in which the average intensity value on scanning horizontal line is locally higher than its surrounding portion by at least a first prescribed value within each identically labelled region in the label image;
(cc33) a process for determining the first connected region extracted at the process (cc32) as a high intensity character region when intensity gradients in the vertical direction of the average intensity values on scanning horizontal line obtained at the process (cc31) at top and bottom ends of the first connected region are both greater than a second prescribed value within each identically labelled region in the label image;
(cc34) a process for checking distribution in the vertical direction of the average intensity values on scanning horizontal line obtained for all scanning horizontal lines at the process (cc31) and extracting a second connected region of a plurality of scanning horizontal lines in which the average intensity value on scanning horizontal line is locally lower than its surrounding portion by at least a third prescribed value within each identically labelled region in the label image;
(cc35) a process for determining the second connected region extracted at the process (cc34) as a low intensity character region when intensity gradients in the vertical direction of the average intensity values on scanning horizontal line obtained at the process (cc31) at top and bottom ends of the second connected region are both greater than a fourth prescribed value within each identically labelled region in the label image.
-
-
41. The recording medium of claim 29, wherein the processing further includes:
-
(aaa) a process for dividing character patterns that are binarized into black pixels and white pixels into divided regions each containing one character;
(bbb) a process for normalizing position and size of a character in each divided region;
(ccc) a process for dividing a character pattern of each normalized character into mesh regions;
(ddd) a process for counting a run-length of white pixels which are adjacent in each direction starting from a white pixel existing in each divided mesh region, for a plurality of prescribed directions;
(eee) a process for calculating a direction contributivity of each direction as a value obtained by averaging the run-length in each direction by an accumulated value of all the run-lengths for all the prescribed directions as counted by the process (ddd), for each divided mesh region;
(fff) a process for calculating a feature value of each divided mesh region by accumulating the direction contributivity of each direction for all white pixels in each divided mesh region and averaging an accumulated value of the direction contributivity of each direction by a number of white pixels within each mesh region; and
(ggg) a process for carrying out a processing for recognizing the character pattern of each normalized character using the feature values obtained for all the mesh regions at the process (fff).
-
-
42. The recording medium of claim 29, wherein the processing further includes:
-
(aaa) a process for dividing character patterns that are binarized into black pixels and white pixels into divided regions each containing one character;
(bbb) a process for normalizing position and size of a character in each divided region;
(ccc) a process for dividing a character pattern of each normalized character into mesh regions;
(ddd) a process for counting a run-length of black pixels which are adjacent in each direction starting from a black pixel existing in each divided mesh region, for a plurality of prescribed directions;
(eee) a process for calculating a black pixel direction contributivity of each direction as a value obtained by averaging the run-length of black pixels in each direction by an accumulated value of all the run-lengths of black pixels for all the prescribed directions as counted by the process (ddd), for each divided mesh region;
(fff) a process for counting a run-length of white pixels which are adjacent in each direction starting from a white pixel existing in each divided mesh region, for a plurality of prescribed directions;
(ggg) a process for calculating a white pixel direction contributivity of each direction as a value obtained by averaging the run-length of white pixels in each direction by an accumulated value of all the run-lengths of white pixels for all the prescribed directions as counted by the process (fff), for each divided mesh region;
(hhh) a process for calculating a black pixel feature value of each divided mesh region by accumulating the black pixel direction contributivity of each direction for all black pixels in each divided mesh region and averaging an accumulated value of the black pixel direction contributivity of each direction by a number of black pixels within each mesh region;
(iii) a process for calculating a white pixel feature value of each divided mesh region by accumulating the white pixel direction contributivity of each direction for all white pixels in each divided mesh region and averaging an accumulated value of the white pixel direction contributivity of each direction by a number of white pixels within each mesh region; and
(jjj) a process for carrying out a processing for recognizing the character pattern of each normalized character using the black pixel feature values obtained for all the mesh regions at the process (hhh) and the white pixel feature values obtained for all the mesh regions at the process (iii).
-
Specification