Method and apparatus for caption detection
Abstract
Machine-readable media, methods, apparatus and system for caption detection are described. In some embodiments, a plurality of text boxes may be detected from a plurality of frames. A first percentage of the plurality of text boxes whose locations on the plurality of frames fall into a location range may be obtained. A second percentage of the plurality of text boxes whose sizes fall into a size range may be obtained. Then, it may be determined if the first percentage and the location range are acceptable and if the second percentage and the size range are acceptable.
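The two percentages described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the `TextBox` type, the use of the vertical coordinate as the "location", the use of box height as the "size", and every numeric value are assumptions introduced here, since the text does not say how location or size are measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextBox:
    """Hypothetical text box record; field names are assumptions."""
    frame: int     # index of the video frame the box was detected on
    x: float       # left edge, in pixels
    y: float       # top edge, in pixels (used here as the "location")
    width: float
    height: float  # used here as the "size"


def fraction_in_range(values, low, high):
    """Return the fraction of values v with low <= v <= high (0.0 if empty)."""
    values = list(values)
    if not values:
        return 0.0
    return sum(low <= v <= high for v in values) / len(values)


def caption_statistics(boxes, location_range, size_range):
    """Return (first percentage, second percentage) as fractions in [0, 1]."""
    first = fraction_in_range((b.y for b in boxes), *location_range)
    second = fraction_in_range((b.height for b in boxes), *size_range)
    return first, second


def is_acceptable(percentage, value_range, min_percentage, max_range_width):
    """A percentage/range pair is acceptable when enough boxes fall inside
    the range and the range itself is narrow enough."""
    low, high = value_range
    return percentage >= min_percentage and (high - low) <= max_range_width


# Three boxes near the bottom of the frame plus one outlier (made-up data).
boxes = [
    TextBox(frame=0, x=40, y=400, width=300, height=30),
    TextBox(frame=1, x=42, y=402, width=310, height=31),
    TextBox(frame=2, x=41, y=398, width=305, height=29),
    TextBox(frame=3, x=200, y=50, width=80, height=60),
]
first, second = caption_statistics(boxes, (395, 405), (28, 32))
```

With this made-up data, three of the four boxes fall inside each range, so both fractions are 0.75. Whether that counts as acceptable depends entirely on the thresholds chosen, which the text leaves as predetermined values.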
20 Claims
1. A method comprising:
- detecting a plurality of text boxes from a plurality of video frames, wherein detecting includes
  - obtaining a first percentage of the plurality of text boxes whose locations associated with the plurality of the video frames fall within a location range, wherein the first percentage and the location range are regarded as acceptable if the first percentage is equal to or greater than a first determined value and if the location range is equal to or less than a second predetermined value, and
  - obtaining a second percentage of the plurality of text boxes whose sizes fall within a size range, wherein the second percentage and the size range are regarded as acceptable if the second percentage is equal to or greater than a third predetermined value and if the size range is equal to or less than a fourth predetermined value;
- identifying a text box of the plurality of text boxes as a caption candidate if the first percentage, the location range, the second percentage, and the size range relating to the text box are acceptable; and
- selecting the identified text box as the caption candidate.
Dependent claims: 2, 3, 4, 5, 6, 7
8. An apparatus comprising:
a computing device having a storage medium to store instructions, and a processing device to execute the instructions, the computing device further having a mechanism to, when the instructions are executed, perform one or more operations comprising:
- detecting a plurality of text boxes from a plurality of video frames, wherein detecting includes
  - obtaining a first percentage of the plurality of text boxes whose locations associated with the plurality of the video frames fall within a location range, wherein the first percentage and the location range are regarded as acceptable if the first percentage is equal to or greater than a first determined value and if the location range is equal to or less than a second predetermined value, and
  - obtaining a second percentage of the plurality of text boxes whose sizes fall within a size range, wherein the second percentage and the size range are regarded as acceptable if the second percentage is equal to or greater than a third predetermined value and if the size range is equal to or less than a fourth predetermined value;
- identifying a text box of the plurality of text boxes as a caption candidate if the first percentage, the location range, the second percentage, and the size range relating to the text box are acceptable; and
- selecting the identified text box as the caption candidate.
Dependent claims: 9, 10, 11, 12, 13
14. A machine-readable medium having stored thereon instructions, which when executed, cause a processing device to perform one or more operations comprising:
- detecting a plurality of text boxes from a plurality of video frames, wherein detecting includes
  - obtaining a first percentage of the plurality of text boxes whose locations associated with the plurality of the video frames fall within a location range, wherein the first percentage and the location range are regarded as acceptable if the first percentage is equal to or greater than a first determined value and if the location range is equal to or less than a second predetermined value, and
  - obtaining a second percentage of the plurality of text boxes whose sizes fall within a size range, wherein the second percentage and the size range are regarded as acceptable if the second percentage is equal to or greater than a third predetermined value and if the size range is equal to or less than a fourth predetermined value;
- identifying a text box of the plurality of text boxes as a caption candidate if the first percentage, the location range, the second percentage, and the size range relating to the text box are acceptable; and
- selecting the identified text box as the caption candidate.
Dependent claims: 15, 16, 17, 18, 19, 20
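The acceptance condition shared by independent claims 1, 8 and 14 can be written compactly. The symbols below are notation assumed here for illustration and do not appear in the claims: $p_{\mathrm{loc}}$ and $p_{\mathrm{size}}$ stand for the first and second percentages, $w_{\mathrm{loc}}$ and $w_{\mathrm{size}}$ for the extents of the location range and the size range, and $T_1,\dots,T_4$ for the four threshold values recited in the claims.

```latex
\[
\bigl(p_{\mathrm{loc}} \ge T_1\bigr) \land \bigl(w_{\mathrm{loc}} \le T_2\bigr)
\land
\bigl(p_{\mathrm{size}} \ge T_3\bigr) \land \bigl(w_{\mathrm{size}} \le T_4\bigr)
\;\Rightarrow\;
\text{the text box is identified as a caption candidate}
\]
```

The first two conjuncts correspond to the location test and the last two to the size test. Read this way, a high percentage alone is not sufficient, because the range it was measured over must also be narrow enough.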
Specification