System and method for extracting text captions from video and generating video summaries
First Claim
1. A method of decoding a caption box in video content comprising:
determining at least one expected location of a caption box in a frame of the video content;
determining at least one caption box mask within the expected location;
identifying frames in the video content as caption frames if the current frame exhibits substantial correlation to the at least one caption box mask within the expected caption box location;
for at least a portion of the caption frames, identifying word regions within the confines of the expected location;
for each word region, identifying text characters within the region; and
processing the identified text characters.
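The caption-frame test in the claim (substantial correlation between the current frame and a caption box mask, inside the expected location) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the `(top, left, height, width)` box layout, the use of Pearson correlation as the correlation measure, the flattened mask representation, and the `0.8` threshold standing in for "substantial".

```python
from typing import List, Tuple

Frame = List[List[int]]          # grayscale pixels (0-255), row-major
Box = Tuple[int, int, int, int]  # hypothetical layout: (top, left, height, width)


def crop(frame: Frame, box: Box) -> List[int]:
    """Flatten the pixels that fall inside the expected caption box location."""
    top, left, height, width = box
    return [p for row in frame[top:top + height] for p in row[left:left + width]]


def correlation(a: List[int], b: List[int]) -> float:
    """Pearson correlation of two equal-length pixel lists (0.0 if either is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5


def is_caption_frame(frame: Frame, box: Box, mask: List[int],
                     threshold: float = 0.8) -> bool:
    """Assumed decision rule: the frame is a caption frame if the cropped
    region correlates with the mask at or above the threshold."""
    return correlation(crop(frame, box), mask) >= threshold
```

Because only the pixels inside the expected location are compared, the per-frame cost depends on the size of the box rather than the size of the frame.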
Abstract
Caption boxes embedded in video content can be located and the text within them decoded. Real-time processing is enhanced by locating caption box regions in the compressed video domain and performing pixel-based processing operations only within the region of the video frame in which a caption box is located. The caption boxes are further refined by identifying word regions within them and then applying character and word recognition processing to the identified word regions. Domain-based models are used to improve text recognition results. The extracted caption box text can be used to detect events of interest in the video content, and a semantic model can be applied to extract the segment of video containing the event of interest.
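One way a domain model might improve recognition results is by snapping each recognized word to the closest entry in a domain vocabulary. The sketch below is an illustration under stated assumptions, not the patent's method: the baseball lexicon, the `correct_word` helper, and the `0.6` similarity cutoff are all hypothetical, and the matching uses Python's standard `difflib.get_close_matches`.

```python
import difflib
from typing import List

# Hypothetical domain lexicon for a baseball score caption; not from the source.
BASEBALL_LEXICON = ["BALL", "STRIKE", "OUT", "TOP", "BOTTOM", "INNING"]


def correct_word(raw: str, lexicon: List[str], cutoff: float = 0.6) -> str:
    """Return the closest lexicon entry, or the raw recognized word
    unchanged if no entry is similar enough."""
    matches = difflib.get_close_matches(raw.upper(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else raw
```

For example, `correct_word("STR1KE", BASEBALL_LEXICON)` maps the misread digit back to `"STRIKE"`, while a word with no close lexicon entry is passed through untouched.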
15 Claims
1. A method of decoding a caption box in video content comprising:
determining at least one expected location of a caption box in a frame of the video content;
determining at least one caption box mask within the expected location;
identifying frames in the video content as caption frames if the current frame exhibits substantial correlation to the at least one caption box mask within the expected caption box location;
for at least a portion of the caption frames, identifying word regions within the confines of the expected location;
for each word region, identifying text characters within the region; and
processing the identified text characters.
Dependent claims: 2, 3, 4, 5, 6
7. A non-transitory computer-readable storage medium storing a program for causing a computer to implement a method of decoding a caption box in video content comprising:
determining at least one expected location of a caption box in a frame of the video content;
determining at least one caption box mask within the expected location;
identifying frames in the video content as caption frames if the current frame exhibits substantial correlation to the at least one caption box mask within the expected caption box location;
for at least a portion of the caption frames, identifying word regions within the confines of the expected location;
for each word region, identifying text characters within the region; and
processing the identified text characters.
Dependent claims: 8
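The word-region step recited in the claims could be sketched with a column projection over a binarized caption box. This is a swapped-in, commonly used segmentation technique, not one the claims specify; the `word_regions` helper, the 0/1 pixel encoding, and the `min_gap` parameter are assumptions made for illustration.

```python
from typing import List, Tuple


def word_regions(binary: List[List[int]], min_gap: int = 2) -> List[Tuple[int, int]]:
    """Split a binarized caption box (1 = text pixel) into word regions.

    Returns (start_col, end_col) spans with end_col exclusive. Two spans are
    separated when at least min_gap consecutive columns contain no text pixel.
    """
    if not binary:
        return []
    width = len(binary[0])
    # A column is "inked" if any row has a text pixel in it.
    inked = [any(row[c] for row in binary) for c in range(width)]
    regions: List[Tuple[int, int]] = []
    start = None
    gap = 0
    for c, on in enumerate(inked):
        if on:
            if start is None:
                start = c
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # Close the region just after its last inked column.
                regions.append((start, c - gap + 1))
                start = None
                gap = 0
    if start is not None:
        # Region runs to the end of the box, minus any trailing empty columns.
        regions.append((start, width - gap))
    return regions
```

Gaps narrower than `min_gap` are treated as spacing between characters of the same word, so the parameter would need tuning to the caption font in practice.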
9. A system for decoding a caption box in video content comprising:
location means for determining at least one expected location of a caption box in a frame of the video content;
determining means, coupled to the location means and receiving the at least one expected location therefrom, for determining at least one caption box mask within the expected location;
frame identifying means, coupled to the determining means and location means and receiving the at least one caption box mask and the at least one expected location therefrom, for identifying frames in the video content as caption frames if the current frame exhibits substantial correlation to the at least one caption box mask within the expected caption box location;
word region identifying means, coupled to the frame identifying means and receiving the identified caption frames therefrom, for at least a portion of the caption frames, identifying word regions within the confines of the expected location;
text character means, coupled to the word region identifying means and receiving the word regions therefrom, for each word region, identifying text characters within the region; and
processing the identified text characters.
Dependent claims: 10, 11, 12, 13, 14, 15
Specification