Automatic caption text detection and processing for digital images
First Claim
1. A text detection system for DCT compressed images comprising:
a calculation mechanism for calculating variations of a first energy in said DCT compressed images whereby predetermined values of said first energy indicate possible text areas in said DCT compressed images;
a threshold mechanism for screening for said predetermined values and outputting potential text areas;
a detecting mechanism for detecting variations of a second energy in said potential text areas indicative of the possibility of text in said potential text areas and outputting detected text areas; and
a reconstruction mechanism for decompressing said DCT compressed images detected text areas.
Abstract
A texture-based text localization system proceeds directly in the compressed domain for DCT compressed JPEG images or MPEG videos. The DCT coefficient values in JPEG images and MPEG videos, which capture the directionality and periodicity of local image blocks, are used as texture feature measures to classify text areas. Each unit block in the compressed images is classified as either text or nontext. In addition, post-processing in both the compressed domain and the reconstructed candidate text areas can be used to refine the results. For video frames that contain text, the displacement of text between two consecutive frames is estimated, which gives the velocity of the moving text. This temporal displacement information is also used to further refine the localization results. The text is then processed to provide content or speech output.
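The block-level classification described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes each 8x8 block's DCT coefficients are available as a nested list indexed `[v][u]` (vertical, then horizontal frequency), and it takes the "horizontal text energy" to be the sum of absolute coefficients in row `v = 0` over an assumed band of horizontal harmonics. The function names, the band limits `u_lo` and `u_hi`, and the threshold value are all hypothetical. A plain DCT-II stands in for coefficients that a real system would read from the JPEG or MPEG bitstream.

```python
import math

N = 8  # block size used by JPEG and MPEG


def dct2(block):
    """Orthonormal 2-D DCT-II of an 8x8 pixel block.

    Stand-in for coefficients that would normally be read
    straight from the compressed bitstream.
    """
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[v][u] = scale(u) * scale(v) * s
    return out


def horizontal_energy(coeffs, u_lo=2, u_hi=6):
    """Sum |coefficient| over horizontal harmonics u_lo..u_hi in row v = 0.

    The band limits are an assumption made for this sketch.
    """
    return sum(abs(coeffs[0][u]) for u in range(u_lo, u_hi + 1))


def classify_blocks(coeff_blocks, threshold):
    """Mark each block True (potential text) or False (nontext)."""
    return [[horizontal_energy(b) >= threshold for b in row]
            for row in coeff_blocks]


# Vertical strokes (strong horizontal intensity variation) versus a flat block.
stripes = [[255 if x % 2 == 0 else 0 for x in range(N)] for _ in range(N)]
flat = [[128] * N for _ in range(N)]
labels = classify_blocks([[dct2(stripes), dct2(flat)]], threshold=100.0)
print(labels)  # [[True, False]]
```

In this toy input the striped block concentrates its energy in the horizontal harmonics and passes the threshold, while the flat block has only a DC term and is rejected. A practical threshold would have to be tuned on real data.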
20 Claims
1. A text detection system for DCT compressed images comprising:
a calculation mechanism for calculating variations of a first energy in said DCT compressed images whereby predetermined values of said first energy indicate possible text areas in said DCT compressed images;
a threshold mechanism for screening for said predetermined values and outputting potential text areas;
a detecting mechanism for detecting variations of a second energy in said potential text areas indicative of the possibility of text in said potential text areas and outputting detected text areas; and
a reconstruction mechanism for decompressing said DCT compressed images detected text areas.
second threshold means for screening probable text areas where the calculated vertical edge components provide a density of horizontally aligned edges greater than a predetermined threshold value.
8. The text detection system as claimed in claim 1 including:
a temporal adjustment mechanism to adjust consecutive frame images containing said detected text areas;
an estimation mechanism for estimating the displacement of text in two consecutive frame images of said detected text areas; and
an elimination mechanism for eliminating said two consecutive frames where there are no corresponding detected text areas.
9. The text detection system as claimed in claim 8 including:
a third threshold mechanism for screening for frame images having corresponding text areas before and after a predetermined frame image.
10. The text detection system as claimed in claim 1 including:
an output for outputting said text from said detected text areas; and
processing means for providing a text content from said output text.
11. The text detection system as claimed in claim 1 including:
an output for outputting said text from said detected text areas; and
synthesizing means for providing speech output from said output text.
12. A method of text detection for DCT compressed images comprising the steps of:
calculating variations of a first energy in said DCT compressed images whereby predetermined values of said first energy indicate possible text areas in said DCT compressed images;
screening for said predetermined values and outputting potential text areas related thereto;
detecting variations of a second energy in said potential text areas indicative of the possibility of text in said potential text areas and outputting detected text areas; and
decompressing said DCT compressed image detected text areas.
13. The method of text detection as claimed in claim 12 wherein said step of:
calculating variations of said first energy includes the step of calculating a horizontal text energy for each DCT compressed images area.
14. The method of text detection as claimed in claim 13 wherein said step of:
screening screens DCT compressed images areas below a predetermined horizontal text energy.
15. The method of text detection as claimed in claim 12 wherein said potential text areas contain noise and are disconnected, and including a morphological mechanism for removing noisy areas and connecting disconnected text areas.
16. The method of text detection as claimed in claim 12 wherein said step of:
screening includes screening for variations of said second energy as a vertical energy and outputting detected text areas having the vertical energy over a predetermined value.
17. The method of text detection as claimed in claim 12 wherein said step of:
decompressing said DCT compressed images detected text areas includes the step of calculating vertical edge components in said detected text areas.
18. The method of text detection as claimed in claim 17 including the step of:
screening probable text areas where the calculated vertical edge components provide a density of horizontally aligned edges greater than a predetermined threshold value.
19. The method of text detection as claimed in claim 12 including the steps of:
temporally adjusting consecutive frame images containing said detected text areas;
estimating a displacement of text in two consecutive frame images of said detected text areas; and
eliminating said two consecutive frames where there are no corresponding detected text areas.
20. The method of text detection as claimed in claim 19 including the step of:
screening for frame images having corresponding text areas before and after a predetermined frame image.
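The temporal steps recited in claims 19 and 20 (estimating text displacement between consecutive frames and discarding detections with no counterpart) can be illustrated with a short sketch. The claims do not say how corresponding text areas are matched, so intersection-over-union (IoU) matching of bounding boxes is substituted here purely for illustration. The box format `(x0, y0, x1, y1)`, the function names, and the `min_iou` value are assumptions of this sketch.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x0, y0, x1, y1)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1])
             - inter)
    return inter / union if union else 0.0


def match_and_displace(prev_boxes, cur_boxes, min_iou=0.3):
    """Keep current-frame boxes that have a counterpart in the previous frame.

    Returns a list of (box, (dx, dy)) pairs, where (dx, dy) is the shift of
    the box's top-left corner relative to its best match. Boxes with no
    sufficiently overlapping counterpart are dropped.
    """
    kept = []
    for cur in cur_boxes:
        best = max(prev_boxes, key=lambda p: iou(p, cur), default=None)
        if best is not None and iou(best, cur) >= min_iou:
            kept.append((cur, (cur[0] - best[0], cur[1] - best[1])))
    return kept


# A caption scrolling upward by 4 pixels, plus one spurious detection.
prev_frame = [(10, 100, 110, 120)]
cur_frame = [(10, 96, 110, 116), (200, 10, 230, 40)]
result = match_and_displace(prev_frame, cur_frame)
print(result)  # [((10, 96, 110, 116), (0, -4))]
```

Dividing the displacement by the frame interval gives a velocity estimate for the moving text. Applying the same check against both the preceding and the following frame would approximate the before-and-after screening of claim 20.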
Specification