Annotating media content for automatic content understanding

US 10,056,112 B2
Filed: 04/19/2017
Issued: 08/21/2018
Est. Priority Date: 04/24/2012
Status: Active Grant

Abstract

A system for annotating frames in a media stream 114 includes a pattern recognition system (PRS) 108 to generate PRS output metadata for a frame; an archive 106 for storing ground-truth metadata (GTM); a device to merge the GTM and PRS output metadata and thereby generate proposed annotation data (PAD) 110; and a user interface 104 for use by a human annotator (HA) 118. The user interface 104 includes an editor 111 and an input device 107 used by the HA 118 to approve GTM for the frame. An optimization system 105 receives the approved GTM and the metadata output by the PRS 108, and adjusts input parameters for the PRS to minimize a distance metric corresponding to a difference between the GTM and the PRS output metadata.
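
The merge step described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the abstract does not specify a data layout or a conflict policy, so the per-frame dictionaries, the `FrameAnnotation` type, the `merge_metadata` function, and the rule that ground-truth values override PRS values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FrameAnnotation:
    """Proposed annotation data (PAD) for one frame (hypothetical layout)."""
    frame: int
    fields: dict
    sources: dict = field(default_factory=dict)  # key -> "prs" or "gtm"


def merge_metadata(prs_by_frame, gtm_by_frame):
    """Merge per-frame PRS output with ground-truth metadata (GTM).

    Assumed policy (not stated in the source): when both sides carry
    the same key for a frame, the GTM value wins.
    """
    proposed = {}
    for frame in sorted(set(prs_by_frame) | set(gtm_by_frame)):
        fields, sources = {}, {}
        for key, value in prs_by_frame.get(frame, {}).items():
            fields[key] = value
            sources[key] = "prs"
        for key, value in gtm_by_frame.get(frame, {}).items():
            fields[key] = value  # GTM overrides PRS under the assumed policy
            sources[key] = "gtm"
        proposed[frame] = FrameAnnotation(frame, fields, sources)
    return proposed


# Made-up example values: the PRS misreads a score that the GTM corrects.
prs = {0: {"score": "7-3", "ball_xy": (410, 220)}, 1: {"ball_xy": (415, 223)}}
gtm = {0: {"score": "7-0"}}
pad = merge_metadata(prs, gtm)
```

Recording the source of each field is one way a reviewing editor could show the annotator which values still need approval; the source does not say whether the described system does this.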

75 Citations

19 Claims

1. A system, comprising:
- a pattern recognition system to generate, according to a set of input parameters, pattern recognition metadata associated with video frames of a media stream;
  
  an encoder system to generate proposed annotation data associated with the video frames of the media stream by merging the pattern recognition metadata with ground-truth metadata associated with the video frames of the media stream; and
  
  an optimization system to adjust the set of input parameters of the pattern recognition system to minimize a single distance metric including a combination of a plurality of distance metrics by type and to generate the plurality of distance metrics by type by comparing each type of a plurality of ground-truth metadata types to a corresponding type of a plurality of pattern recognition metadata types, wherein one type of the plurality of pattern recognition metadata types is spatial position.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9):
  - 2. The system of claim 1, further comprising an archive to store the ground-truth metadata associated with the video frames of the media stream.
  - 3. The system of claim 1, further comprising a user interface to review the proposed annotation data for the video frames, wherein the user interface includes an editor and an input device.
  - 4. The system of claim 3, wherein the ground-truth metadata associated with the video frames of the media stream is obtained from third party metadata, an archived media stream, the user interface, or any combination thereof.
  - 5. The system of claim 3, wherein the proposed annotation data is approved via the user interface to generate the ground-truth metadata associated with the video frames of the media stream.
  - 6. The system of claim 3, wherein the user interface overlays the proposed annotation data onto the media stream.
  - 7. The system of claim 1, wherein a time delay between third party metadata and the media stream is corrected by an alignment process.
  - 8. The system of claim 1, wherein the optimization system adjusts the set of input parameters of the pattern recognition system to minimize differences between the ground-truth metadata associated with the video frames of the media stream and the pattern recognition metadata.
  - 9. The system of claim 1, wherein the plurality of distance metrics are weighted by type while generating the single distance metric.
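
Claims 1 and 9 describe combining per-type distance metrics, optionally weighted by type, into a single metric that is minimized by adjusting the PRS input parameters. The following Python sketch shows one way such a loop could look. It is not taken from the patent: the two metadata types (`position`, `label`), the distance functions, the stand-in recognizer `toy_prs` with its `x_offset` and `threshold` parameters, and the exhaustive grid search are all hypothetical choices made to keep the example small.

```python
import math


def spatial_distance(gt, pr):
    """Euclidean distance between two (x, y) positions."""
    return math.dist(gt, pr)


def label_distance(gt, pr):
    """0 when the labels agree, 1 otherwise."""
    return 0.0 if gt == pr else 1.0


# One distance function per metadata type (types are assumptions).
DISTANCES = {"position": spatial_distance, "label": label_distance}


def combined_distance(gt_frames, pr_frames, weights):
    """Weighted sum of per-type distances, averaged over frames."""
    total = 0.0
    for gt, pr in zip(gt_frames, pr_frames):
        for kind, dist in DISTANCES.items():
            total += weights[kind] * dist(gt[kind], pr[kind])
    return total / len(gt_frames)


def toy_prs(observations, params):
    """Hypothetical stand-in for a PRS driven by two input parameters."""
    return [
        {
            "position": (x + params["x_offset"], y),
            "label": "ball" if brightness >= params["threshold"] else "none",
        }
        for x, y, brightness in observations
    ]


def optimize(observations, gt_frames, weights, grid):
    """Grid search for the parameters with the smallest combined distance."""
    best = None
    for x_offset in grid["x_offset"]:
        for threshold in grid["threshold"]:
            params = {"x_offset": x_offset, "threshold": threshold}
            d = combined_distance(gt_frames, toy_prs(observations, params), weights)
            if best is None or d < best[0]:
                best = (d, params)
    return best


# Made-up data: (x, y, brightness) per frame and the matching ground truth.
observations = [(10, 5, 0.8), (20, 6, 0.7), (30, 7, 0.2)]
ground_truth = [
    {"position": (12, 5), "label": "ball"},
    {"position": (22, 6), "label": "ball"},
    {"position": (32, 7), "label": "none"},
]
weights = {"position": 1.0, "label": 2.0}
grid = {"x_offset": [0, 1, 2, 3], "threshold": [0.1, 0.5, 0.9]}
best_distance, best_params = optimize(observations, ground_truth, weights, grid)
```

A grid search is used only because it is easy to follow; the claims do not name a particular optimization procedure, and any minimizer over the combined metric would fit the same structure.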

10. A method, comprising:
- generating, by a pattern recognition system, according to a set of input parameters, pattern recognition metadata associated with video frames of a media stream;
  
  generating, by an encoder system, proposed annotation data associated with the video frames of the media stream by merging the pattern recognition metadata with ground-truth metadata that is associated with the video frames of the media stream; and
  
  adjusting, by an optimization system, the set of input parameters of the pattern recognition system to minimize a single distance metric including a combination of a plurality of distance metrics by type and to generate the plurality of distance metrics by type by comparing each type of a plurality of ground-truth metadata types to a corresponding type of a plurality of pattern recognition metadata types.
- Dependent claims (11, 12, 13, 14, 15, 16):
  - 11. The method of claim 10, wherein one type of the plurality of pattern recognition metadata types is spatial position.
  - 12. The method of claim 10, further comprising reviewing, by a user interface, the proposed annotation data from the video frames, wherein the user interface includes an editor and an input device.
  - 13. The method of claim 12, further comprising overlaying, by the user interface, the proposed annotation data onto the media stream.
  - 14. The method of claim 10, wherein a time delay between third party metadata and the media stream is corrected by an alignment process.
  - 15. The method of claim 10, further comprising adjusting, by the optimization system, the set of input parameters of the pattern recognition system to minimize differences between the ground-truth metadata associated with the video frames of the media stream and the pattern recognition metadata.
  - 16. The method of claim 10, wherein the plurality of distance metrics are weighted by type while generating the single distance metric.
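
Claims 7 and 14 refer to an alignment process that corrects a time delay between third-party metadata and the media stream, without saying how the delay is found. As a rough Python sketch under stated assumptions, one could test a set of candidate delays and keep the one under which the most third-party events fall close to events detected in the stream. The function names, the candidate list, and the 0.5-second tolerance are all hypothetical.

```python
def estimate_delay(stream_times, feed_times, candidates, tolerance=0.5):
    """Pick the candidate delay (seconds) that matches the most events.

    A feed event counts as matched when, after subtracting the delay,
    it lies within `tolerance` seconds of some stream event.
    Ties resolve to the earliest candidate in the list.
    """
    def matches(delay):
        return sum(
            any(abs((t - delay) - s) <= tolerance for s in stream_times)
            for t in feed_times
        )

    return max(candidates, key=matches)


def align(feed_events, delay):
    """Shift (timestamp, label) feed events onto the stream timeline."""
    return [(t - delay, label) for t, label in feed_events]


# Made-up timestamps: the third-party feed lags the stream by about 3 s.
stream_times = [12.0, 47.5, 90.0]
feed_events = [(15.1, "pitch"), (50.4, "hit"), (93.0, "out")]
delay = estimate_delay(
    stream_times,
    [t for t, _ in feed_events],
    candidates=[0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
)
aligned = align(feed_events, delay)
```

This assumes a single constant delay for the whole stream; a feed whose lag drifts over time would need the estimate repeated over shorter windows.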

17. A non-transitory machine-readable storage medium, comprising executable instructions that, when executed by a processing system including a processor, facilitate performance of operations, comprising:
- generating, via a pattern recognition system and according to a set of input parameters, pattern recognition metadata associated with video frames of a media stream;
  
  generating proposed annotation data associated with the video frames of the media stream by merging the pattern recognition metadata with ground-truth metadata associated with the video frames of the media stream; and
  
  adjusting the set of input parameters to minimize a single distance metric including a combination of a plurality of distance metrics by type and to generate the plurality of distance metrics by type by comparing each type of a plurality of ground-truth metadata types to a corresponding type of a plurality of pattern recognition metadata types.
- Dependent claims (18, 19):
  - 18. The non-transitory machine-readable storage medium of claim 17, wherein the operations further comprise overlaying the proposed annotation data onto the media stream via a user interface.
  - 19. The non-transitory machine-readable storage medium of claim 17, wherein the operations further comprise adjusting the set of input parameters of the pattern recognition system to minimize differences between the ground-truth metadata associated with the video frames of the media stream and the pattern recognition metadata.

Current Assignee: LiveClips, LLC
Original Assignee: LiveClips, LLC
Inventors: Petajan, Eric David; Weite, David Eugene; Vunic, Douglas W.
Primary Examiner(s): Hunter, Mishawn
Application Number: US15/491,031
Publication Number: US 20170221523A1
Time in Patent Office: 489 Days

CPC Class Codes

G06F 16/48   Retrieval characterised by ...

G06F 40/169   Annotation, e.g. comment da...

G11B 27/036   Insert-editing

G11B 27/19   by using information detect...

G11B 27/28   by using information signal...

H04N 21/23418   involving operations for an...

H04N 21/23424   involving splicing one cont...

H04N 21/84   Generation or processing of...

H04N 21/854   Content authoring
