Electronic device directional audio-video capture
First Claim
1. An apparatus comprising:
a housing;
a processor in the housing; and
an audio-visual source tracking system connected to the processor, wherein the audio-visual source tracking system comprises a first video camera and an array of microphones, wherein the first video camera and the array of microphones are attached to the housing, wherein at least a portion of the first video camera and at least a portion of the array of microphones are mounted inside the housing, wherein the audio-visual source tracking system is configured to receive video information from the first video camera, and wherein the audio-visual source tracking system is configured to capture audio information from the array of microphones at least partially in response to the video information, wherein the audio-visual source tracking system is configured to adjust and direct the sensitivity of the array of microphones during an active audio/visual speech call at least partially in response to the video information, wherein the audio-visual source tracking system is configured to estimate a depth of the video information with the first camera by analyzing a face size in the video information, and wherein the apparatus is a multi-function portable electronic device;
wherein the array of microphones are proximate the first video camera;
wherein the audio-visual source tracking system is configured to monitor positions of detected faces in a video; and
wherein the audio-visual source tracking system is further configured to detect an active speaker from the detected faces by successively adjusting and directing the sensitivity of the array of microphones towards the active speaker's face such that if an audio level exceeds a threshold, a corresponding face is considered to be the active speaker's face.
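The active-speaker limitation in the last clause can be illustrated with a short Python sketch. It is not the patented implementation: the function names, the `capture_steered` callback (a stand-in for "steer the array toward this angle and record a short frame"), the RMS level measure, and the rule of picking the loudest face above the threshold are all assumptions made for illustration.

```python
import math
from typing import Callable, Optional, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of one audio frame (0.0 for an empty frame)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def detect_active_speaker(
    face_angles_deg: Sequence[float],
    capture_steered: Callable[[float], Sequence[float]],
    threshold: float,
) -> Optional[int]:
    """Steer toward each detected face in turn and compare levels.

    Returns the index of the loudest face whose level exceeds the
    threshold, or None when no face does. Choosing the loudest face is
    an assumption; the claim only requires that the level exceed a threshold.
    """
    best_index: Optional[int] = None
    best_level = threshold
    for index, angle in enumerate(face_angles_deg):
        level = rms(capture_steered(angle))  # hypothetical steer-and-record step
        if level > best_level:
            best_index, best_level = index, level
    return best_index


def fake_capture(angle_deg: float, source_deg: float = 20.0) -> list:
    """Toy stand-in for the array: gain drops linearly away from the talker."""
    gain = max(0.0, 1.0 - abs(angle_deg - source_deg) / 30.0)
    return [gain * math.sin(2 * math.pi * k / 16) for k in range(160)]


if __name__ == "__main__":
    # Three detected faces; the simulated talker sits at 20 degrees.
    print(detect_active_speaker([-30.0, 20.0, 45.0], fake_capture, 0.3))
```

In a real device the callback would wrap the beamformer and the microphone driver; the toy `fake_capture` only exists so the sketch can run on its own.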
Abstract
Disclosed herein is an apparatus. The apparatus includes a housing, electronic circuitry, and an audio-visual source tracking system. The electronic circuitry is in the housing. The audio-visual source tracking system includes a first video camera and an array of microphones. The first video camera and the array of microphones are attached to the housing. The audio-visual source tracking system is configured to receive video information from the first video camera. The audio-visual source tracking system is configured to capture audio information from the array of microphones at least partially in response to the video information. The audio-visual source tracking system might include a second video camera that is attached to the housing, wherein the first and second video cameras together estimate the beam orientation of the array of microphones.
20 Claims
1. An apparatus comprising:
a housing;
a processor in the housing; and
an audio-visual source tracking system connected to the processor, wherein the audio-visual source tracking system comprises a first video camera and an array of microphones, wherein the first video camera and the array of microphones are attached to the housing, wherein at least a portion of the first video camera and at least a portion of the array of microphones are mounted inside the housing, wherein the audio-visual source tracking system is configured to receive video information from the first video camera, and wherein the audio-visual source tracking system is configured to capture audio information from the array of microphones at least partially in response to the video information, wherein the audio-visual source tracking system is configured to adjust and direct the sensitivity of the array of microphones during an active audio/visual speech call at least partially in response to the video information, wherein the audio-visual source tracking system is configured to estimate a depth of the video information with the first camera by analyzing a face size in the video information, and wherein the apparatus is a multi-function portable electronic device;
wherein the array of microphones are proximate the first video camera;
wherein the audio-visual source tracking system is configured to monitor positions of detected faces in a video; and
wherein the audio-visual source tracking system is further configured to detect an active speaker from the detected faces by successively adjusting and directing the sensitivity of the array of microphones towards the active speaker's face such that if an audio level exceeds a threshold, a corresponding face is considered to be the active speaker's face.
View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
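Claim 1 recites estimating depth with a single camera by analyzing face size. One common way to do this, shown here only as a sketch, is the pinhole-camera relation: distance is roughly focal length (in pixels) times real face width divided by face width in pixels. The claim does not state this formula; the function name, the 0.15 m default face width, and the focal length value are assumptions.

```python
def estimate_depth_m(
    face_width_px: float,
    focal_length_px: float,
    real_face_width_m: float = 0.15,  # assumed typical adult face width
) -> float:
    """Estimate camera-to-face distance from the detected face width.

    Pinhole model: face_width_px / focal_length_px = real_face_width_m / depth,
    so depth = focal_length_px * real_face_width_m / face_width_px.
    """
    if face_width_px <= 0:
        raise ValueError("face_width_px must be positive")
    return focal_length_px * real_face_width_m / face_width_px


if __name__ == "__main__":
    # A face 150 px wide seen by a camera with a 600 px focal length.
    print(estimate_depth_m(150.0, 600.0))
```

The estimate is only as good as the assumed face width, so it suits coarse decisions such as how narrowly to focus the microphone array, not precise ranging.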
14. A method comprising:
capturing a first image with a camera of an audio-visual source tracking system of an apparatus;
determining a direction of a portion of the first image with respect to an array of microphones of the apparatus; and
controlling a predetermined characteristic of the array of microphones based at least partially on the direction of the portion of the first image, wherein the controlling of the predetermined characteristic of the array of microphones further comprises adjusting and directing the sensitivity of the array of microphones during an active audio/visual speech call at least partially in response to video information, wherein the audio-visual source tracking system is configured to estimate a depth of the video information with the first camera by analyzing a face size in the video information, wherein the apparatus is a multi-function portable electronic device, and wherein at least a portion of the camera and at least a portion of the array of microphones are mounted inside a housing of the multi-function portable electronic device;
wherein the array of microphones are proximate the first video camera;
wherein the audio-visual source tracking system is configured to monitor positions of detected faces in a video; and
wherein the audio-visual source tracking system is further configured to detect an active speaker from the detected faces by successively adjusting and directing the sensitivity of the array of microphones towards the active speaker's face such that if an audio level exceeds a threshold, a corresponding face is considered to be the active speaker's face.
View Dependent Claims (15, 16, 17, 18, 19, 20)
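The first two method steps of claim 14 (determining a direction from a portion of the image, then controlling the array based on that direction) might be sketched as follows. The sketch assumes a pinhole camera whose optical axis coincides with the broadside of a linear microphone array, which is a simplification of "proximate", and it uses delay-and-sum steering delays as the controlled characteristic. None of the names or values come from the patent.

```python
import math
from typing import List, Sequence

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def pixel_to_azimuth_rad(
    x_px: float, image_width_px: float, focal_length_px: float
) -> float:
    """Horizontal angle of an image column relative to the optical axis."""
    return math.atan2(x_px - image_width_px / 2.0, focal_length_px)


def steering_delays_s(
    mic_positions_m: Sequence[float], azimuth_rad: float
) -> List[float]:
    """Delay to apply to each channel of a linear array before summing.

    Positions are measured along the array axis. For a far-field source at
    the given azimuth (from broadside, toward positive positions), a
    microphone at position p hears the wavefront p*sin(azimuth)/c earlier
    than one at the origin, so its signal is delayed by that amount.
    Negative values can be offset by a common constant in practice.
    """
    return [p * math.sin(azimuth_rad) / SPEED_OF_SOUND_M_S for p in mic_positions_m]


if __name__ == "__main__":
    # Face centred at column 820 of a 640 px wide frame, 500 px focal length.
    azimuth = pixel_to_azimuth_rad(820.0, 640.0, 500.0)
    print(math.degrees(azimuth), steering_delays_s([0.0, 0.0343], azimuth))
```

If the camera and the array are offset or rotated relative to each other, an extra calibration transform between the two coordinate frames would be needed before computing the delays.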
Specification