Low bit rate audio-visual communication system having integrated perceptual speech and video coding
Abstract
Disclosed is a low bit rate audio and video communication system which employs an integrated encoding system that dynamically allocates available bits among the audio and video signals to be encoded based on the content of the audio and video information and the manner in which the audio and video information will be perceived by a viewer. A dynamic bit allocation and encoding process will evaluate the current content of the audio and video information and allocate the available bits among the audio and video signals to be encoded. In addition, an appropriate audio encoding technique is dynamically selected based on the current content of the audio signal. A face location detection subroutine will detect and model the location of faces in each video frame, in order that the facial regions may be more accurately encoded than other portions of the video frame. A lip motion detection subroutine will detect the location and movement of the lips of a person present in a video scene, in order to determine when a person is speaking and to encode the lip regions more accurately. The audio and video signals generated by a second party to a communication are monitored to determine if the second party is paying attention to the audio and video information transmitted by the first party to the communication.
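The dynamic allocation described in the abstract can be illustrated with a minimal sketch. It assumes a fixed per-frame bit budget that is first split between audio and video according to whether the audio currently carries speech, and then splits the video share so that the facial region receives more bits per unit area than the background. The function name `split_budget` and every numeric share or weight below are hypothetical choices made for illustration; the source does not state any of them.

```python
def split_budget(total_bits, audio_active, face_fraction,
                 face_weight=3.0, audio_share_active=0.25,
                 audio_share_silent=0.05):
    """Split a per-frame bit budget among audio, face region, and background.

    All shares and weights are illustrative assumptions, not values
    taken from the patent.
    face_fraction is the portion of the frame area covered by faces (0..1).
    """
    # Audio gets a larger share only while it carries speech.
    audio_share = audio_share_active if audio_active else audio_share_silent
    audio_bits = int(total_bits * audio_share)
    video_bits = total_bits - audio_bits

    # Weight each unit of face area face_weight times more than background.
    face_w = face_fraction * face_weight
    background_w = 1.0 - face_fraction
    face_bits = int(video_bits * face_w / (face_w + background_w))
    background_bits = video_bits - face_bits

    return {"audio": audio_bits, "face": face_bits,
            "background": background_bits}


if __name__ == "__main__":
    print(split_budget(16000, audio_active=True, face_fraction=0.2))
    print(split_budget(16000, audio_active=False, face_fraction=0.2))
```

Because the remainder of each split is assigned by subtraction, the three allocations always sum to the full budget, which matches the idea of distributing a fixed number of available bits.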
11 Claims
1. A method of encoding a first audio and a first video signal in a transmitter of an audio-visual communication system associated with a first party for transmission, utilizing a predetermined number of available bits, to a receiver associated with a second party, said receiver decoding said first audio and video signals for presentation to said second party, said encoding method comprising the steps of:
- detecting whether said second party is talking; and
- allocating a minimal number of said predetermined number of available bits for encoding said first audio signal if said detecting step determines that said second party is talking.
2. A method of encoding a first audio and a first video signal in a transmitter of an audio-visual communication system associated with a first party for transmission to a receiver associated with a second party, said receiver decoding said first audio and video signals for presentation to said second party, said encoding method comprising the steps of:
- detecting whether said second party is talking; and
- allocating a minimal number of said available bits for encoding said first audio signal if said detecting step determines that said second party is talking;
said detecting step comprising the steps of:
- analyzing a second audio signal associated with said second party to determine if said second party is generating audio activity;
- analyzing a second video signal which includes a representation of said second party, said second video signal including a representation of the face and lips of said second party, said analysis of said second video signal determining if said lips of said second party are moving; and
- concluding said second party is talking if said analyzing steps determine that there is audio activity and said lips of said second party are moving.
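The detecting and allocating steps of claim 2 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: audio activity is approximated by mean signal energy against a threshold, lip motion by the mean absolute difference between two consecutive lip-region pixel lists, and the "minimal number" of bits by a fixed constant. All function names, thresholds, and bit counts are hypothetical.

```python
def has_audio_activity(samples, energy_threshold=1e-3):
    """Assumed detector: mean energy of the second party's audio frame."""
    if not samples:
        return False
    energy = sum(s * s for s in samples) / len(samples)
    return energy > energy_threshold


def lips_are_moving(prev_lips, cur_lips, motion_threshold=5.0):
    """Assumed detector: mean absolute pixel difference in the lip region."""
    if not cur_lips or len(prev_lips) != len(cur_lips):
        return False
    diff = sum(abs(a - b) for a, b in zip(prev_lips, cur_lips))
    return diff / len(cur_lips) > motion_threshold


def second_party_is_talking(samples, prev_lips, cur_lips):
    """Conclude talking only if there is audio activity AND lip motion."""
    return (has_audio_activity(samples)
            and lips_are_moving(prev_lips, cur_lips))


def audio_bits_for_first_party(available_bits, remote_talking,
                               minimal_bits=400, normal_share=0.25):
    """Give the first party's audio a minimal allocation while the
    second party talks; minimal_bits and normal_share are illustrative."""
    if remote_talking:
        return min(minimal_bits, available_bits)
    return int(available_bits * normal_share)


if __name__ == "__main__":
    loud = [0.1, -0.1] * 80          # synthetic active audio frame
    prev, cur = [100] * 64, [120] * 64  # synthetic lip-region pixels
    talking = second_party_is_talking(loud, prev, cur)
    print(talking, audio_bits_for_first_party(16000, talking))
```

Requiring both cues before concluding that the second party is talking means that background noise alone, or lip movement without sound, does not reduce the first party's audio allocation.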
Specification