Annotating media content for automatic content understanding
Abstract
A system for annotating frames in a media stream 114 includes a pattern recognition system (PRS) 108 to generate PRS output metadata for a frame; an archive 106 for storing ground truth metadata (GTM); a device to merge the GTM and PRS output metadata and thereby generate proposed annotation data (PAD) 110; and a user interface 104 for use by a human annotator (HA) 118. The user interface 104 includes an editor 111 and an input device 107 used by the HA 118 to approve GTM for the frame. An optimization system 105 receives the approved GTM and metadata output by the PRS 108, and adjusts input parameters for the PRS to minimize a distance metric corresponding to a difference between the GTM and PRS output metadata.
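The optimization step described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the toy recognizer `run_prs`, its single `threshold` input parameter, the per-frame scores, the mismatch count used as the distance metric, and the grid search are assumptions chosen for illustration. The abstract does not specify the metric or the search strategy.

```python
from typing import Dict, List

# Hypothetical per-frame confidence scores that a recognizer might produce.
FRAME_SCORES: Dict[int, float] = {0: 0.15, 1: 0.40, 2: 0.55, 3: 0.80, 4: 0.95}

# Hypothetical human-approved ground-truth metadata (GTM): True means the
# event of interest is present in the frame.
APPROVED_GTM: Dict[int, bool] = {0: False, 1: False, 2: True, 3: True, 4: True}


def run_prs(threshold: float) -> Dict[int, bool]:
    """Toy stand-in for the PRS: label a frame True when its score meets the threshold."""
    return {frame: score >= threshold for frame, score in FRAME_SCORES.items()}


def distance(gtm: Dict[int, bool], prs_output: Dict[int, bool]) -> int:
    """Assumed distance metric: number of frames where PRS output disagrees with GTM."""
    return sum(1 for frame, label in gtm.items() if prs_output.get(frame) != label)


def tune_threshold(candidates: List[float]) -> float:
    """Pick the candidate input parameter that minimizes the distance (simple grid search)."""
    return min(candidates, key=lambda t: distance(APPROVED_GTM, run_prs(t)))


best = tune_threshold([0.2, 0.5, 0.9])
```

In this toy setup the search selects the candidate threshold whose output disagrees with the approved GTM on the fewest frames. A real system would likely tune many parameters with a more capable optimizer.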
20 Claims
1. A non-transitory machine-readable storage medium, comprising executable instructions that, when executed by a processing system including a processor, facilitate performance of operations, comprising:
performing pattern recognition on video frames of a media stream to generate pattern recognition metadata associated with the video frames of the media stream;
comparing the pattern recognition metadata associated with the video frames of the media stream and ground-truth metadata associated with the video frames of the media stream to generate a single distance metric, wherein the ground-truth metadata and the pattern recognition metadata comprise a first type of metadata;
adjusting a set of input parameters associated with the pattern recognition according to the single distance metric, wherein the single distance metric is weighted according to the first type; and
merging the pattern recognition metadata associated with the video frames of the media stream with ground-truth metadata associated with the video frames of the media stream to generate proposed annotation data associated with the video frames of the media stream.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A system, comprising:
a pattern recognition system, including a processing system comprising a processor, that facilitates generating pattern recognition metadata associated with video frames of a media stream according to a set of input parameters associated with the pattern recognition system;
an optimization system, including the processing system, that facilitates comparing the pattern recognition metadata associated with the video frames of the media stream and ground-truth metadata associated with the video frames of the media stream to generate a single distance metric, wherein the ground-truth metadata and the pattern recognition metadata comprise a first type of metadata, and wherein the set of input parameters associated with the pattern recognition system are adjusted according to the single distance metric; and
an encoder system, including the processing system, that facilitates generating proposed annotation data associated with the video frames of the media stream by merging the pattern recognition metadata with ground-truth metadata associated with the video frames of the media stream.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
17. A method, comprising:
performing, by a processing system including a processor, pattern recognition on video frames of a media stream to generate pattern recognition metadata associated with the video frames of the media stream;
comparing, by the processing system, the pattern recognition metadata associated with the video frames of the media stream and ground-truth metadata associated with the video frames of the media stream to generate a single distance metric, wherein the ground-truth metadata and the pattern recognition metadata comprise a first type of metadata; and
adjusting, by the processing system, a set of input parameters associated with the pattern recognition according to the single distance metric, wherein the single distance metric is weighted according to the first type.
Dependent claims: 18, 19, 20
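As a rough illustration of two operations recited in the independent claims, the Python sketch below computes a single, type-weighted distance metric and merges recognizer output with ground truth into proposed annotation data. The data layout (`Metadata`), the weight table `TYPE_WEIGHTS`, the metadata types `score` and `clock`, and the merge policy in which ground truth overrides recognizer output are all assumptions made for this example. The claims state only that the metric is weighted according to the metadata type and that the two sources are merged.

```python
from typing import Dict

Metadata = Dict[int, Dict[str, str]]  # frame index -> {metadata type: value}

# Hypothetical weights per metadata type; the claims only say the metric is
# weighted according to the type, not what the weights are.
TYPE_WEIGHTS: Dict[str, float] = {"score": 2.0, "clock": 1.0}


def single_distance(prs: Metadata, gtm: Metadata) -> float:
    """Collapse all per-frame, per-type disagreements into one weighted number."""
    total = 0.0
    for frame, truth in gtm.items():
        recognized = prs.get(frame, {})
        for meta_type, value in truth.items():
            if recognized.get(meta_type) != value:
                total += TYPE_WEIGHTS.get(meta_type, 1.0)
    return total


def merge(prs: Metadata, gtm: Metadata) -> Metadata:
    """Build proposed annotation data; assumed policy: ground truth overrides PRS output."""
    proposed: Metadata = {}
    for frame in sorted(set(prs) | set(gtm)):
        proposed[frame] = {**prs.get(frame, {}), **gtm.get(frame, {})}
    return proposed


# Made-up example data: the recognizer misreads the score in frame 1.
prs_out: Metadata = {
    0: {"score": "7-3", "clock": "12:00"},
    1: {"score": "7-3", "clock": "11:58"},
}
gtm: Metadata = {
    0: {"score": "7-3", "clock": "12:00"},
    1: {"score": "7-6"},
}

d = single_distance(prs_out, gtm)
pad = merge(prs_out, gtm)
```

Here the one disagreement is of the hypothetical `score` type, so it contributes that type's weight to the metric, and the merged record for frame 1 keeps the recognizer's clock value while taking the score from ground truth.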
Specification