Indexing multimedia communications

US 6,377,995 B2
Filed: 02/19/1998
Issued: 04/23/2002
Est. Priority Date: 02/19/1998
Status: Expired due to Term

Abstract

A network based platform uses face recognition, speech recognition, background change detection and key scene events to index multimedia communications. Before the multimedia communication begins, active participants register their speech and face models with a server. The process consists of creating a speech sample, capturing a sample image of the participant and storing the data in a database. The server provides an indexing function for the multimedia communication. During the multimedia communication, metadata including time stamping is retained along with the multimedia content. The time stamping information is used for synchronizing the multimedia elements. The multimedia communication is then processed through the server to identify the multimedia communication participants based on speaker and face recognition models. This allows the server to create an index table that becomes an index of the multimedia communication. In addition, through scene change detection and background recognition, certain backgrounds and key scene information can be used for indexing. Therefore, through this indexing apparatus and method, a specific participant can be recognized as speaking and the content that the participant discussed can also be used for indexing.
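
The index table described in the abstract can be pictured as a mapping from each recognized participant to the time ranges in which that participant was identified. The following is a minimal sketch, not the patented implementation: `LabeledPacket`, `build_index_table`, and the `max_gap` merge tolerance are hypothetical names chosen for illustration, and the recognition step is assumed to have already labeled each time-stamped packet.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LabeledPacket:
    """Hypothetical record: a time-stamped packet already labeled by recognition."""
    timestamp: float   # seconds from the start of the communication
    duration: float    # seconds of media carried by the packet
    participant: str   # identity assigned by speaker/face recognition


def build_index_table(packets, max_gap=0.5):
    """Map each participant to merged (start, end) intervals.

    Packets from the same participant that are separated by at most
    `max_gap` seconds (an assumed tolerance) are merged into one interval.
    """
    table = defaultdict(list)
    for p in sorted(packets, key=lambda p: p.timestamp):
        start, end = p.timestamp, p.timestamp + p.duration
        spans = table[p.participant]
        if spans and start - spans[-1][1] <= max_gap:
            # Extend the participant's most recent interval.
            spans[-1] = (spans[-1][0], max(spans[-1][1], end))
        else:
            spans.append((start, end))
    return dict(table)


packets = [
    LabeledPacket(0.0, 1.0, "alice"),
    LabeledPacket(1.0, 1.0, "alice"),
    LabeledPacket(2.0, 1.0, "bob"),
    LabeledPacket(5.0, 1.0, "alice"),
]
index_table = build_index_table(packets)
```

Keying the table by participant makes "find everything this person said" a single lookup, and the time stamps give the offsets needed to seek into the stored audio and video.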

18 Claims

1. A method for indexing a multimedia communication, comprising:
- receiving the multimedia communication, the multimedia communication including a plurality of multimedia data packets;
  
  processing the plurality of multimedia data packets to identify distinguishing features and associating each of the plurality of multimedia data packets with one of a plurality of objects within the multimedia communication;
  
  comparing the distinguishing features with representations of the plurality of objects stored within a database to create verified distinguishing features;
  
  associating the verified distinguishing features with each one of a plurality of objects;
  
  indexing the plurality of multimedia data packets based on the verified distinguishing features; and
  
  rebroadcasting in near real-time, during the multimedia communication, the processed plurality of multimedia data packets, wherein when the object is a speaker participating in the multimedia communication, combination of the speaker's audio speech patterns and video statistical sampling of a face of the speaker is a portion of the distinguishing features.
  - 2. The method of claim 1, wherein the distinguishing features are based on at least one of audio data and video data of the multimedia communication.
  - 3. The method of claim 1, wherein the objects include at least one of a person, an animal, a plant, and an inanimate object.
  - 4. The method of claim 3, wherein a multimedia data packet of the plurality of multimedia data packets includes a payload having one of the audio and the video data that corresponds to an object, the associating step attaching a header identifier that identifies the object.
  - 5. The method of claim 4, wherein the speaker is associated with the multimedia data packet if the portion of the distinguishing features is included in the payload of the multimedia data packet.
  - 6. The method of claim 4, further comprising:
  - 7. The method of claim 6, wherein the multimedia communication is a multicast multimedia communication, the rebroadcasting step including multicasting the processed plurality of multimedia data packets.
  - 8. The method of claim 7, wherein a time stamp is provided to synchronize the audio and the video data.
  - 9. The method of claim 8, further comprising storing the indexed plurality of multimedia data packets, wherein the indexed plurality of multimedia data packets can be searched to retrieve audio and video multimedia data packets corresponding to selected distinguishing features.
  - 10. The method of claim 9, wherein the indexed plurality of multimedia data packets can be searched using key words.
  - 11. The method of claim 10, wherein the multimedia communication is conducted using a local area network.
  - 12. The method of claim 1, wherein the indexing and the processing steps are performed at a multicast network.
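
Claims 4, 9, and 10 describe attaching a header identifier to each packet, storing the indexed packets, and searching them by key word. One way to picture that flow is sketched below, assuming an in-memory store and a plain dictionary as the header. `Packet`, `tag_packet`, and `PacketStore` are hypothetical names for illustration, not structures defined by the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    """Hypothetical multimedia data packet with a metadata header."""
    timestamp: float
    payload: bytes
    header: dict = field(default_factory=dict)


def tag_packet(packet, object_id, keywords=()):
    """Attach a header identifier for the object the payload corresponds to."""
    packet.header["object_id"] = object_id
    packet.header["keywords"] = sorted({k.lower() for k in keywords})
    return packet


class PacketStore:
    """In-memory stand-in for the storage of indexed packets."""

    def __init__(self):
        self._by_keyword = {}

    def add(self, packet):
        # Index the packet under its keywords and its object identifier.
        terms = set(packet.header.get("keywords", []))
        object_id = packet.header.get("object_id")
        if object_id:
            terms.add(object_id.lower())
        for term in terms:
            self._by_keyword.setdefault(term, []).append(packet)

    def search(self, keyword):
        """Return matching packets in time-stamp order."""
        hits = self._by_keyword.get(keyword.lower(), [])
        return sorted(hits, key=lambda p: p.timestamp)


store = PacketStore()
store.add(tag_packet(Packet(3.0, b"audio-2"), "alice", ["budget"]))
store.add(tag_packet(Packet(1.0, b"audio-1"), "alice", ["Budget", "schedule"]))
store.add(tag_packet(Packet(2.0, b"video-1"), "bob", ["schedule"]))
budget_hits = [p.timestamp for p in store.search("Budget")]
alice_hits = [p.timestamp for p in store.search("alice")]
```

Sorting results by time stamp keeps retrieved audio and video packets in playback order, which matches the synchronizing role the claims give the time stamp.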

13. An apparatus for indexing a multimedia communication, comprising:
- a server that receives multimedia communication in multimedia data packets including audio, visual and data communications and identifies distinguishing features in the multimedia communication based on at least one of audio and video recognition and a source of the multimedia communications;
  
  a header function module connected to the server, the header function module entering metadata in a header segment corresponding to the multimedia data packets received by the server, the metadata being related to the distinguishing features;
  
  an index server that rebroadcasts in near real-time the multimedia communication;
  
  a storage device that stores the multimedia data packets, wherein the distinguishing features include audio, voice and video face patterns of participants in the multimedia communication.
  - 14. The apparatus of claim 13, wherein the metadata includes voice and face identifiers of the participants.
  - 15. The apparatus of claim 14, wherein the server identifies background changes in the video multimedia data packets and wherein the header function module enters second metadata in the header segment corresponding to the multimedia data packets having background changes, the second metadata including scene identifiers.
  - 16. The apparatus of claim 15, wherein the server is a multicast server.
  - 17. The apparatus of claim 16, wherein the server comprises audio, video and data bridges.

18. A method of identifying participants to a multimedia communication that is rebroadcast in near real-time, comprising:
- comparing audio speech patterns for each participant to speech models;
  
  comparing video face patterns for each participant to face models;
  
  determining an identity of a particular participant when both the audio speech patterns and the video face patterns match the speech and the face models for the particular participant; and
  
  creating an index of the participants based on identification of speech and face patterns of the participants, the index being used to segment the multimedia communication.
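
The identification rule in claim 18 requires that both the speech pattern and the face pattern match the registered models for the same participant. A minimal sketch of that decision follows, assuming each recognizer returns a similarity score per registered participant. The score dictionaries, the `threshold` value, and the function name `identify_participant` are assumptions for illustration only.

```python
def identify_participant(speech_scores, face_scores, threshold=0.8):
    """Return the participant whose speech AND face models both match.

    `speech_scores` and `face_scores` map participant names to similarity
    scores in [0, 1] (an assumed convention). A participant qualifies only
    if both scores reach `threshold`; otherwise None is returned.
    """
    candidates = [
        name
        for name, speech in speech_scores.items()
        if speech >= threshold and face_scores.get(name, 0.0) >= threshold
    ]
    if not candidates:
        return None
    # Prefer the candidate whose weaker modality is strongest.
    return max(candidates, key=lambda n: min(speech_scores[n], face_scores[n]))


both_match = identify_participant(
    {"alice": 0.9, "bob": 0.85}, {"alice": 0.95, "bob": 0.4}
)
face_fails = identify_participant({"alice": 0.9}, {"alice": 0.5})
```

Requiring agreement between the two modalities trades some recall for fewer false identifications, for example when one participant's voice is heard while another participant's face is on camera.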

Current Assignee: AT&T Corporation (AT&T, Inc.)
Original Assignee: AT&T Corporation (AT&T, Inc.)
Inventors: Markowitz, Robert Edward; Rosen, Kenneth H.; Shur, David Hilton; Agraharam, Sanjay; Winthrop, Joel A.
Primary Examiner(s): Dinh, Dung C.
Assistant Examiner(s): Kupstas, Tod A
Application Number: US09/025,940
Publication Number: US 20010042114A1
Time in Patent Office: 1,524 Days
Field of Search: 709/203, 709/231, 709/232, 709/217, 709/204, 386/52, 386/65, 386/96, 386/97, 386/106
US Class Current: 709/231
CPC Class Codes:
G06F 16/5838   using colour
G10L 17/00   Speaker identification or v...
H04L 65/1101   Session protocols
H04L 65/403   Arrangements for multi-part...
H04L 65/611   for multicast or broadcast ...
H04N 7/142   Constructional details of t...
H04N 7/147   Communication arrangements,...
H04N 7/152   Multipoint control units th...
