Method and apparatus for detecting and interpreting textual captions in digital video signals
First Claim
1. A computer-implemented method for the identification and interpretation of text captions in an encoded video stream of digital video signals, said method comprising:
sampling by selecting frames for video analysis;
decoding by converting each of said frames selected into a digitized color image;
performing edge detection for generating a gray scale image;
binarizing by converting said gray scale image into a bi-level image by means of a thresholding operation;
compressing groups of consecutive pixel values in said binary image;
mapping said consecutive pixel values into a binary value; and
separating groups of connected pixels and determining whether they are likely to be part of a text region in the image or not.
Abstract
A computer-implemented method for the identification and interpretation of text captions in an encoded video stream of digital video signals comprises sampling by selecting frames for video analysis, decoding by converting each of the frames selected into a digitized color image, performing edge detection for generating a gray scale image, binarizing by converting the gray scale image into a bi-level image by means of a thresholding operation, compressing groups of consecutive pixel values in the binary image, mapping the consecutive pixel values into a binary value, and separating groups of connected pixels and determining whether they are likely to be part of a text region in the image or not.
24 Claims
1. A computer-implemented method for the identification and interpretation of text captions in an encoded video stream of digital video signals, said method comprising:
sampling by selecting frames for video analysis;
decoding by converting each of said frames selected into a digitized color image;
performing edge detection for generating a gray scale image;
binarizing by converting said gray scale image into a bi-level image by means of a thresholding operation;
compressing groups of consecutive pixel values in said binary image;
mapping said consecutive pixel values into a binary value; and
separating groups of connected pixels and determining whether they are likely to be part of a text region in the image or not.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
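The edge detection, binarizing, compressing and mapping steps of claim 1 can be illustrated with a minimal Python sketch. The claim does not name an edge operator, a threshold value, a group size, or a mapping rule, so everything specific here is an assumption made for illustration: the input is taken to be a single-channel intensity image, edge strength is a simple sum of absolute neighbor differences, and a group of consecutive pixels maps to 1 if any pixel in it is set. The function names `edge_magnitude`, `binarize` and `compress_rows` are hypothetical.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values


def edge_magnitude(gray: Image) -> Image:
    """Gray scale edge image: sum of absolute horizontal and vertical differences.

    This simple gradient is an assumed stand-in; the claim names no operator.
    """
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp at the borders so the last row/column compares with itself.
            dx = gray[y][min(x + 1, w - 1)] - gray[y][x]
            dy = gray[min(y + 1, h - 1)][x] - gray[y][x]
            out[y][x] = min(255, abs(dx) + abs(dy))
    return out


def binarize(edges: Image, threshold: int) -> Image:
    """Bi-level image: 1 where edge strength reaches the threshold, else 0."""
    return [[1 if v >= threshold else 0 for v in row] for row in edges]


def compress_rows(binary: Image, group: int) -> Image:
    """Map each group of `group` consecutive pixels in a row to one binary value.

    Assumed rule: the group maps to 1 if any pixel in it is 1.
    """
    return [
        [1 if any(row[i:i + group]) else 0 for i in range(0, len(row), group)]
        for row in binary
    ]
```

With this rule, an eight-pixel row that contains one vertical edge in its left half compresses, at a group size of 4, to two values: 1 for the left group and 0 for the right. A compressed image of this kind is smaller and tends to merge the strokes of neighboring characters, which is one plausible reason to compress before grouping connected pixels.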
14. A computer-implemented method for the identification and interpretation of text captions in a video stream wherein the frame sequence is compressed, comprising the steps of:
determining whether the frame number divided by a predetermined number N is an integer, discarding non-integers;
decoding compressed frames so as to result in uncompressed frames;
detecting edges so as to derive a corresponding gray scale image;
binarizing said gray scale image so as to derive a binary image;
compressing said binary image so as to derive a compressed binary image; and
performing a connected component analysis.
Dependent claims: 15, 16, 17, 18, 19, 20
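The connected component analysis that closes claim 14 can be sketched as a breadth-first flood fill over the binary image. The claim does not specify the connectivity or what is recorded per component, so this sketch assumes 4-connectivity and reports a bounding box and pixel count for each component; the name `connected_components` and the `Box` layout are hypothetical.

```python
from collections import deque
from typing import List, Tuple

# (min_x, min_y, max_x, max_y, pixel_count) for one component
Box = Tuple[int, int, int, int, int]


def connected_components(binary: List[List[int]]) -> List[Box]:
    """Group set pixels into 4-connected components (connectivity is assumed)."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes: List[Box] = []
    for y0 in range(h):
        for x0 in range(w):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            # Start a new component and flood-fill outward from this pixel.
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            min_x = max_x = x0
            min_y = max_y = y0
            count = 0
            while queue:
                x, y = queue.popleft()
                count += 1
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((min_x, min_y, max_x, max_y, count))
    return boxes
```

Bounding boxes and pixel counts are the kind of per-component measurements a later text or non-text decision could use, although the criteria for that decision are not given in this claim.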
21. A computer-implemented method for the identification and interpretation of text captions in an encoded video stream of digital video signals, said method comprising:
sampling by selecting frames for video analysis;
decoding by converting each of said frames selected into a digitized color image;
separating each said digitized color image into three color images corresponding to three color planes;
performing edge detection on each of said color planes for generating a respective gray scale image for each of said color planes;
applying a thresholding image to each of said gray scale images so as to produce three respective binary edge images;
combining said three binary edge images to obtain a single combined binary edge image;
compressing groups of consecutive pixel values in said combined binary image;
mapping said consecutive pixel values into a binary value; and
separating groups of connected pixels and determining whether they are likely to be part of a text region in the image or not.
Dependent claims: 22, 23
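The per-plane processing in claim 21 can be sketched as follows. The claim says the three binary edge images are combined but does not say how, so this sketch assumes a pixel-wise logical OR; it also assumes RGB planes, a simple neighbor-difference edge measure, and one shared threshold. The names `split_planes`, `binary_edges` and `combined_edge_image` are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # assumed (R, G, B) ordering
Plane = List[List[int]]


def split_planes(image: List[List[Pixel]]) -> List[Plane]:
    """Separate a color image into three single-channel planes."""
    return [[[px[c] for px in row] for row in image] for c in range(3)]


def binary_edges(plane: Plane, threshold: int) -> Plane:
    """Edge-detect one plane and threshold the result into a binary edge image."""
    h, w = len(plane), len(plane[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Neighbor differences, clamped at the image border.
            dx = plane[y][min(x + 1, w - 1)] - plane[y][x]
            dy = plane[min(y + 1, h - 1)][x] - plane[y][x]
            out[y][x] = 1 if abs(dx) + abs(dy) >= threshold else 0
    return out


def combined_edge_image(image: List[List[Pixel]], threshold: int) -> Plane:
    """Combine the three binary edge images with a pixel-wise OR (assumed rule)."""
    edge_maps = [binary_edges(p, threshold) for p in split_planes(image)]
    return [
        [r | g | b for r, g, b in zip(*rows)]
        for rows in zip(*edge_maps)
    ]
```

Processing the planes separately means an edge that is present in only one color plane, for example blue text on a background that differs only in its blue component, still appears in the combined image under the OR rule.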
24. A computer-implemented method for the identification and interpretation of text captions which are embedded in one or more of a plurality of contiguous frames of encoded digital video signals, said method comprising:
sampling by selecting one or more of said frames for video analysis, said sampling being at a sampling rate fixed at 1 frame per N, where N is the number of consecutive frames in which a given text caption is expected to appear;
decoding by converting each of said frames selected into a digitized color image;
performing edge detection for generating a gray scale image;
binarizing by converting said gray scale image into a bi-level image by means of a thresholding operation;
compressing groups of consecutive pixel values in said binary image;
mapping said consecutive pixel values into a binary value; and
separating groups of connected pixels and determining whether they are likely to be part of a text region embedded in the image or not.
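The fixed sampling rate of 1 frame per N can be sketched with a short Python example. The claim defines N as the number of consecutive frames in which a caption is expected to appear but gives no value, so the frame rate and caption duration below are purely illustrative assumptions, as are the zero-based frame numbering and the function names `frames_per_sample` and `select_frames`. Selection uses the divisibility test described in claim 14: a frame is kept when its frame number divided by N is an integer.

```python
import math
from typing import List


def frames_per_sample(frame_rate: float, min_caption_seconds: float) -> int:
    """Estimate N as the number of frames a caption is expected to stay on screen.

    Both arguments are illustrative assumptions, not values from the claims.
    """
    return max(1, math.floor(frame_rate * min_caption_seconds))


def select_frames(frame_count: int, n: int) -> List[int]:
    """Keep frame numbers that are exact multiples of n; discard the rest."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [i for i in range(frame_count) if i % n == 0]


# Example: assume 30 frames per second and captions lasting at least 2 seconds.
n = frames_per_sample(30, 2)       # 60
selected = select_frames(200, n)   # [0, 60, 120, 180]
```

Sampling once per N frames means a caption that persists for at least N consecutive frames is seen in at least one sampled frame, while only a small fraction of the stream has to be decoded and analyzed.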
Specification