SYSTEM AND METHOD FOR DEEP ANNOTATION AND SEMANTIC INDEXING OF VIDEOS

US 20110087703A1
Filed: 10/09/2009
Published: 04/14/2011
Est. Priority Date: 10/09/2009
Status: Abandoned Application

Abstract

Video on demand services rely on frequent viewing and downloading of content to enhance the return on investment on such services. Videos in general and movies in particular hosted by video portals need to have extensive annotations to help in greater monetization of content. Such deep annotations help in creating content packages based on bits and pieces extracted from specific videos suited to individuals' queries thereby providing multiple opportunities for piece-wise monetization. Considering the complexity involved in extracting deep semantics for deep annotation based on video and audio analyses, a system and method for deep annotation uses video/movie scripts associated with content for supporting video-audio analysis in deep annotation.

9 Claims

1. A method of a deep annotation and a semantic indexing of a multimedia content based on a script, wherein said script is associated with said multimedia content, said method comprising:
- determining of a plurality of multimedia scenes of said multimedia content;
  
  determining of a plurality of script segments of said script;
  
  obtaining of a script segment structure associated with a script segment of said plurality of script segments, wherein said script segment structure comprises:
  
  a plurality of objects, a plurality of object descriptions of said plurality of objects, a plurality of persons, a plurality of person descriptions of said plurality of persons, a plurality of locations, a plurality of location descriptions of said plurality of locations, a plurality of scene descriptions, a plurality of dialog descriptions, a plurality of action descriptions, and a plurality of directives;
  
  determining of a plurality of closed-world key phrases based on said script;
  
  determining of a coarse-grained annotation associated with a script segment of said plurality of script segments based on the analysis of a plurality of objects, a plurality of object descriptions of said plurality of objects, a plurality of persons, a plurality of person descriptions of said plurality of persons, a plurality of locations, a plurality of location descriptions of said plurality of locations, a plurality of scene descriptions, a plurality of dialog descriptions, a plurality of action descriptions, and a plurality of directives associated with said script segment;
  
  determining of a plurality of coarse-grained annotations associated with a plurality of multimedia key frames of a multimedia scene of said plurality of multimedia scenes based on said plurality of closed-world key phrases;
  
  determining of a plurality of plurality of matched script segments associated with said plurality of multimedia key frames based on said plurality of script segments and said plurality of coarse-grained annotations;
  
  determining of a best matched script segment associated with said multimedia scene based on said plurality of plurality of matched script segments;
  
  analyzing of said best matched script segment to result in a fine-grained annotation of said multimedia scene;
  
  making of said fine-grained annotation a part of said deep annotation of said multimedia content;
  
  performing of said semantic indexing of said multimedia content based on a fine-grained annotation associated with each of said plurality of multimedia scenes of said multimedia content; and
  
  determining of a plurality of homogeneous scenes of said plurality of multimedia scenes based on said semantic indexing.
- - 2. The method of claim 1, wherein said method of determining of said plurality of closed-world key phrases further comprising:
    - analyzing of a plurality of object descriptions associated with each of said plurality of script segments resulting in a plurality of object key phrases;
      
      analyzing of a plurality of person descriptions associated with each of said plurality of script segments resulting in a plurality of person key phrases;
      
      analyzing of a plurality of location descriptions associated with each of said plurality of script segments resulting in a plurality of location key phrases;
      
      analyzing of a plurality of scene descriptions associated with each of said plurality of script segments resulting in a plurality of scene key phrases;
      
      analyzing of a plurality of dialog descriptions associated with each of said plurality of script segments resulting in a plurality of dialog key phrases;
      
      analyzing of a plurality of action descriptions associated with each of said plurality of script segments resulting in a plurality of action key phrases;
      
      performing of consistency analysis based on said plurality of object key phrases, said plurality of person key phrases, said plurality of location key phrases, said plurality of scene key phrases, said plurality of dialog key phrases, and said plurality of action key phrases to result in said plurality of closed-world key phrases.
  - 3. The method of claim 1, wherein said method of determining of said plurality of coarse-grained annotations further comprising:
    - analyzing of said multimedia scene of said plurality of multimedia scenes to result in a plurality of multimedia shots;
      
      analyzing of each of said plurality of multimedia shots to result in a plurality of multimedia key frames;
      
      analyzing of each of said plurality of multimedia key frames based on said plurality of closed-world key phrases to result in a plurality of annotations, wherein said plurality of annotations is a part of said plurality of coarse-grained annotations;
      
      analyzing of said plurality of annotations of said plurality of multimedia key frames to result in a multimedia shot annotation of said multimedia shot of said plurality of multimedia shots; and
      
      analyzing of said multimedia shot annotation associated with each of said plurality of multimedia shots to result in a multimedia scene annotation associated with said multimedia scene.
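
Claim 3 rolls key frame annotations up into shot annotations and then into a scene annotation, but does not say how the roll-up is done. The following is a minimal sketch under stated assumptions: each annotation is a set of key phrases, a shot keeps the phrases seen in at least a given share of its key frames, and a scene is the union of its shots. The function names and the `min_share` parameter are hypothetical and do not come from the application.

```python
from collections import Counter
from typing import List, Set


def shot_annotation(key_frame_annotations: List[Set[str]], min_share: float = 0.5) -> Set[str]:
    """Keep phrases seen in at least `min_share` of a shot's key frames.

    The voting rule is an assumption; the claim only says the key frame
    annotations are analyzed to produce a shot annotation.
    """
    counts = Counter(p for ann in key_frame_annotations for p in ann)
    needed = min_share * len(key_frame_annotations)
    return {phrase for phrase, count in counts.items() if count >= needed}


def scene_annotation(shot_annotations: List[Set[str]]) -> Set[str]:
    """Union of all shot annotations of a scene (an assumed combination rule)."""
    return set().union(*shot_annotations)


# Two shots, described by the annotations of their key frames.
shot_1 = shot_annotation([{"car", "street"}, {"car"}, {"car", "rain"}])
shot_2 = shot_annotation([{"door", "hero"}, {"door", "hero"}])
scene = scene_annotation([shot_1, shot_2])
```

Here `"street"` and `"rain"` each appear in only one of three key frames, so they fall below the assumed share and are dropped from the first shot.
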
  - 4. The method of claim 1, wherein said method of determining of said plurality of plurality of matched script segments further comprising:
    - obtaining of a script segment of said plurality of script segments;
      
      obtaining of a multimedia key frame of said plurality of multimedia key frames;
      
      obtaining of a segment coarse-grained annotation associated with said script segment, wherein said segment coarse-grained annotation is a coarse-grained annotation associated with said script segment;
      
      obtaining of a key frame coarse-grained annotation associated with said multimedia key frame based on said plurality of coarse-grained annotations;
      
      determining of a matching factor of said script segment based on said segment coarse-grained annotation and said key frame coarse-grained annotation;
      
      determining of a plurality of matching factors based on said plurality of script segments;
      
      arranging of said plurality of script segments based on said plurality of matching factors in the non-increasing order to result in a plurality of arranged script segments; and
      
      making of a pre-defined number of script segments from the top of said plurality of arranged script segments a part of said plurality of plurality of matched script segments.
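
The ranking step of claim 4 can be sketched as follows. This is an illustration under stated assumptions, not the applicants' implementation: coarse-grained annotations are modeled as sets of key phrases, and the matching factor is taken to be the Jaccard overlap of two such sets. The claim itself does not define the matching factor, and the names `matching_factor` and `top_matched_segments` are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def matching_factor(segment_annotation: Set[str], key_frame_annotation: Set[str]) -> float:
    """Jaccard overlap of two key-phrase sets (an assumed measure)."""
    union = segment_annotation | key_frame_annotation
    if not union:
        return 0.0
    return len(segment_annotation & key_frame_annotation) / len(union)


def top_matched_segments(
    segment_annotations: Dict[str, Set[str]],
    key_frame_annotation: Set[str],
    k: int,
) -> List[Tuple[str, float]]:
    """Score every script segment against one key frame and keep the top k."""
    scored = [
        (segment_id, matching_factor(annotation, key_frame_annotation))
        for segment_id, annotation in segment_annotations.items()
    ]
    # Non-increasing order of matching factor; the sort is stable,
    # so tied segments keep their original script order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


segments = {
    "s1": {"car", "street"},
    "s2": {"car", "night", "rain"},
    "s3": {"kitchen"},
}
matched = top_matched_segments(segments, {"car", "night"}, k=2)
```

Running this once per key frame of a scene yields one ranked list per key frame, which corresponds to the "plurality of plurality of matched script segments" that the later claims consume.
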
  - 5. The method of claim 1, wherein said method of determining said best matched script segment further comprising:
    - obtaining of said plurality of plurality of matched script segments;
      
      determining of a plurality of isosegmental lines based on said plurality of plurality of matched script segments;
      
      computing of a plurality of errors, wherein each of said plurality of errors is associated with an isosegmental line of said plurality of isosegmental lines;
      
      selecting of a best isosegmental line based on said plurality of errors;
      
      obtaining of a best script segment associated with said best isosegmental line; and
      
      determining of said best matched script segment based on said best script segment.
  - 6. The method of claim 5, wherein said method of determining said plurality of isosegmental lines further comprising:
    - determining of a plurality of plurality of positional weights based on said plurality of plurality of matched script segments, wherein a positional weight of said plurality of plurality of positional weights is associated with a script segment of a plurality of matched script segments of said plurality of plurality of matched script segments based on the position of said script segment within said plurality of matched script segments; and
      
      determining of an isosegmental line of said plurality of isosegmental lines, wherein said isosegmental line is associated with a plurality of segment positional weights based on said plurality of plurality of positional weights and each of said plurality of segment positional weights is associated with the same segment of said plurality of plurality of matched segments.
  - 7. The method of claim 5, wherein said method of computing further comprising:
    - obtaining of an isosegmental line of said plurality of isosegmental lines;
      
      obtaining of a plurality of segment positional weights associated with said isosegmental line;
      
      computing of an error based on said plurality of segment positional weights and a distance measure; and
      
      making of said error a part of said plurality of errors.
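
Claims 5 to 7 can be read as follows: each key frame has its own ranked list of matched script segments, an isosegmental line collects the positional weights of one segment across all of those lists, and the line with the smallest error identifies the best segment. The sketch below illustrates that reading under assumptions the claims leave open: the positional weight decreases linearly with rank, a segment missing from a list gets weight 0, and the error is the Euclidean distance from an ideal line on which the segment is ranked first for every key frame. All function names are hypothetical.

```python
import math
from typing import Dict, List


def positional_weights(ranked: List[str]) -> Dict[str, float]:
    """Weight 1.0 for the top-ranked segment, decreasing linearly (assumed)."""
    n = len(ranked)
    return {segment: (n - position) / n for position, segment in enumerate(ranked)}


def isosegmental_lines(matched: List[List[str]]) -> Dict[str, List[float]]:
    """For each segment, its positional weight in every key frame's list."""
    per_frame = [positional_weights(ranked) for ranked in matched]
    segments = sorted({segment for ranked in matched for segment in ranked})
    # A segment absent from a key frame's list is given weight 0.0 (assumed).
    return {seg: [weights.get(seg, 0.0) for weights in per_frame] for seg in segments}


def line_error(weights: List[float]) -> float:
    """Euclidean distance from the ideal all-ones line (an assumed distance measure)."""
    return math.sqrt(sum((1.0 - w) ** 2 for w in weights))


def best_segment(matched: List[List[str]]) -> str:
    """Segment whose isosegmental line has the smallest error."""
    lines = isosegmental_lines(matched)
    if not lines:
        raise ValueError("no matched script segments")
    return min(lines, key=lambda seg: line_error(lines[seg]))


# One ranked list of matched segments per key frame of a scene.
matched_lists = [["s2", "s1", "s3"], ["s2", "s3", "s1"], ["s1", "s2", "s4"]]
lines = isosegmental_lines(matched_lists)
winner = best_segment(matched_lists)
```

In this example `s2` is ranked first for two key frames and second for the third, so its line lies closest to the ideal and it is selected.
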
  - 8. The method of claim 1, wherein said method of analyzing further comprising:
    - obtaining of a plurality of objects associated with said best matched script segment;
      
      obtaining of a plurality of persons associated with said best matched script segment;
      
      obtaining of a plurality of locations associated with said best matched script segment;
      
      obtaining of a plurality of scenes associated with said best matched script segment;
      
      obtaining of a plurality of dialogs associated with said best matched script segment;
      
      obtaining of a plurality of actions associated with said best matched script segment;
      
      obtaining of a plurality of key frames associated with said multimedia scene;
      
      obtaining of a description associated with an object of said plurality of objects;
      
      obtaining of a key frame of said plurality of key frames;
      
      obtaining of a match factor based on said description and a coarse-grained annotation associated with said key frame;
      
      computing of a plurality of match factors associated with said plurality of key frames based on said description;
      
      selecting of said object based on said plurality of match factors and a pre-defined threshold;
      
      analyzing of said description to result in a plurality of subject-verb-object terms, wherein each of said subject-verb-object terms describes a subject, an object, and a verb based on a sentence of said description; and
      
      making of said plurality of subject-verb-object terms a part of said fine-grained annotation.
  - 9. The method of claim 1, wherein said method of determining of said plurality of homogeneous scenes further comprising:
    - obtaining of said plurality of multimedia scenes;
      
      obtaining of a homogeneity factor, wherein said homogeneity factor forms the basis of said plurality of homogeneous scenes;
      
      computing of a plurality of plurality of subject-verb-object terms based on said plurality of multimedia scenes, a plurality of fine-grained annotations, and said homogeneity factor, wherein each of said plurality of fine-grained annotations is associated with a multimedia scene of said plurality of multimedia scenes;
      
      clustering of said plurality of plurality of subject-verb-object terms into a plurality of clusters based on a similarity measure associated with said homogeneity factor; and
      
      making of a plurality of multimedia scenes associated with a cluster of said plurality of clusters a part of said plurality of homogeneous scenes.
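
The grouping in claim 9 can be sketched as a small clustering routine. The assumptions here are not taken from the application: the homogeneity factor is modeled as the choice of which subject-verb-object component to compare, the similarity measure is the Jaccard overlap on that component, and clusters are formed in one greedy pass with a threshold. The names `project`, `jaccard`, and `homogeneous_scenes` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

SVO = Tuple[str, str, str]  # (subject, verb, object)
FACTOR_INDEX = {"subject": 0, "verb": 1, "object": 2}


def project(terms: List[SVO], factor: str) -> Set[str]:
    """Values of the component named by the homogeneity factor."""
    index = FACTOR_INDEX[factor]
    return {term[index] for term in terms}


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def homogeneous_scenes(
    scene_terms: Dict[str, List[SVO]], factor: str, threshold: float
) -> List[List[str]]:
    """Greedy one-pass clustering of scenes (an assumed clustering scheme)."""
    clusters: List[Tuple[Set[str], List[str]]] = []
    for scene_id, terms in scene_terms.items():
        values = project(terms, factor)
        for representative, members in clusters:
            if jaccard(values, representative) >= threshold:
                members.append(scene_id)
                representative.update(values)  # grow the cluster's value set
                break
        else:
            # No existing cluster was similar enough; start a new one.
            clusters.append((set(values), [scene_id]))
    return [members for _, members in clusters]


scenes = {
    "sc1": [("hero", "drives", "car")],
    "sc2": [("hero", "opens", "door")],
    "sc3": [("villain", "drives", "truck")],
}
by_subject = homogeneous_scenes(scenes, "subject", threshold=0.5)
by_verb = homogeneous_scenes(scenes, "verb", threshold=0.5)
```

Changing the factor regroups the same scenes: comparing subjects puts the two "hero" scenes together, while comparing verbs puts the two "drives" scenes together.
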

Current Assignee
Satyam Computer Services Limited of Mayfair Centre
Original Assignee
Satyam Computer Services Limited of Mayfair Centre
Inventors
Kalyan, Kiran; Varadarajan, Sridhar; Gangadharpalli, Sridhar

Application Number

US12/576,668
Publication Number

US 20110087703A1
US Class Current

707/794
CPC Class Codes

G06F 16/71   Indexing; Data structures t...

G06F 16/7867   using information manually ...

H04N 21/23418   involving operations for an...

H04N 21/47202   for requesting content on d...

H04N 21/84   Generation or processing of...

H04N 21/8456   by decomposing the content ...
