Using automated content analysis for audio/video content consumption

  • US 7,640,272 B2
  • Filed: 12/07/2006
  • Issued: 12/29/2009
  • Est. Priority Date: 12/07/2006
  • Status: Active Grant
First Claim

1. An audio/video (A/V) processing system, comprising:

  • an automatic content analyzer receiving A/V content and analyzing the A/V content using speech recognition and natural language processing to generate speech metadata and natural language metadata corresponding to the A/V content, the speech metadata including speaker identification metadata identifying speakers in the A/V content and a location in the A/V content that the identified speakers are speaking and the natural language metadata including subject matter metadata describing subject matter of segments of the A/V content and where in the A/V content the subject matter is mentioned, wherein the automatic content analyzer comprises an audio analyzer generating speech metadata by recognizing words in the A/V content and aligning the words with the A/V content, wherein the words comprise a transcription of the A/V content;

  • a player displaying a plurality of different metadata displays, the metadata displays displaying information based on the speech and the natural language metadata, the metadata displays including the speaker identification metadata, the subject matter metadata and the transcription, the metadata displays indicating where in the A/V content a speaker is speaking and where a subject matter is mentioned, wherein the player generates a user interface providing a user actuable input, actuable to select a speaker in the speaker identification metadata and either a word in the transcription, or a subject matter in the subject matter metadata, and to cause the player to begin playing the A/V content at a point in the A/V content that is aligned with the selected speaker and either the selected word or the selected subject matter, the user interface including a thumbnail section, a speaker indicator section below the thumbnail section, and a legend, the legend identifying each of the speakers in the A/V content, the thumbnail section including a plurality of different thumbnail photographs, each of the plurality of different thumbnail photographs representing a predominant speaker in one of the A/V content segments, the speaker indicator section identifying all the speakers that speak during the A/V content and approximately where in the A/V content each speaker speaks; and

  • a computer processor, being a functional component of the A/V processing system, activated by the automatic content analyzer to facilitate analyzing of the A/V content.
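
The seek behavior recited in the claim (select a speaker plus either a transcript word or a subject matter, then begin playback at the aligned point) can be illustrated with a minimal sketch. It is not the patented implementation. Every name here (`SpeakerSegment`, `SubjectSegment`, `AlignedWord`, `ContentMetadata`, `find_playback_point`) and all sample timings are hypothetical, and the sketch assumes the analyzer has already produced time-aligned metadata in seconds.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SpeakerSegment:
    """Hypothetical speaker identification metadata: who speaks, and when."""
    speaker: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class SubjectSegment:
    """Hypothetical subject matter metadata: what is discussed, and when."""
    subject: str
    start: float
    end: float


@dataclass
class AlignedWord:
    """Hypothetical transcript word aligned to a time in the content."""
    text: str
    time: float


@dataclass
class ContentMetadata:
    speakers: List[SpeakerSegment]
    subjects: List[SubjectSegment]
    transcript: List[AlignedWord]


def find_playback_point(
    meta: ContentMetadata,
    speaker: str,
    word: Optional[str] = None,
    subject: Optional[str] = None,
) -> Optional[float]:
    """Return the earliest time aligned with the selected speaker and
    either the selected word or the selected subject, or None if no such
    point exists. Exactly one of word/subject must be given."""
    if (word is None) == (subject is None):
        raise ValueError("select exactly one of word or subject")

    # Time spans during which the selected speaker is speaking.
    spans = [(s.start, s.end) for s in meta.speakers if s.speaker == speaker]

    if word is not None:
        # Earliest occurrence of the word that falls inside one of the spans.
        for w in sorted(meta.transcript, key=lambda w: w.time):
            if w.text.lower() == word.lower() and any(
                a <= w.time < b for a, b in spans
            ):
                return w.time
        return None

    # Earliest overlap between the speaker's spans and the subject's segments.
    best: Optional[float] = None
    for seg in meta.subjects:
        if seg.subject != subject:
            continue
        for a, b in spans:
            overlap_start = max(a, seg.start)
            if overlap_start < min(b, seg.end):
                if best is None or overlap_start < best:
                    best = overlap_start
    return best


# Made-up sample metadata for illustration only.
meta = ContentMetadata(
    speakers=[
        SpeakerSegment("Alice", 0.0, 30.0),
        SpeakerSegment("Bob", 30.0, 60.0),
        SpeakerSegment("Alice", 60.0, 90.0),
    ],
    subjects=[
        SubjectSegment("budget", 20.0, 40.0),
        SubjectSegment("hiring", 40.0, 80.0),
    ],
    transcript=[
        AlignedWord("budget", 22.0),
        AlignedWord("budget", 35.0),
        AlignedWord("hiring", 45.0),
        AlignedWord("hiring", 65.0),
    ],
)

print(find_playback_point(meta, "Bob", word="budget"))
print(find_playback_point(meta, "Alice", subject="hiring"))
```

With this sample data, selecting Bob and the word "budget" seeks to the second occurrence of the word, because the first falls in Alice's segment. Treating "aligned with" as time overlap is one possible reading chosen for the sketch; the claim itself does not prescribe a matching rule.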

  • 2 Assignments