Apparatus for detecting position of object capable of simultaneously detecting plural objects and detection method therefor
First Claim
1. An apparatus for collecting voices, comprising:
image input means for inputting an image obtained by photographing a plurality of persons;
person position detection means for processing image information supplied from said image input means to obtain the positions of a plurality of persons;
person position selection means for selecting the position of at least one person which is a subject to be processed from the positions of the plural persons detected by said person position detection means;
voice input means for individually inputting voices through a plurality of channels;
filter constraint setting means for making one of person positions to be an object position among at least person positions selected by said person position selection means, and setting constraint for raising a sensitivity with respect to a voice from the object position as compared with other sensitivities with respect to voices from person positions which have not been selected;
input signal generating means for generating an input signal which can be observed on the assumption that a sound source signal, which has been arbitrarily generated, is disposed at a person position except for the object position;
filter determining means for determining a filter for lowering the sensitivity with respect to voices from person positions except for the object portion under the constraint set by said filter constraint setting means and in accordance with the input signal generated by said input signal generating means; and
voice extracting means for subjecting the voice input by said voice input means to a filter process by using a filter coefficient obtained by said filter determining means so as to extract the voice.
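The filter constraint, the synthetic input signal and the filter determination recited above can be illustrated with a small numerical sketch. This is not the patent's implementation: it swaps in a narrowband minimum-variance (MVDR-style) design at a single frequency, assumes free-field propagation with attenuation ignored, and takes the person positions as given coordinates rather than from image processing. All names (`steering`, `synth_interference`, `design_filter`, `response`) and all constants are hypothetical.

```python
import cmath
import math
import random

SPEED_OF_SOUND = 340.0  # m/s (assumed)
FREQ = 1000.0           # Hz, single analysis frequency (assumed)


def steering(mics, pos):
    """Phase delays from pos to each microphone (free field, no attenuation)."""
    k = 2 * math.pi * FREQ / SPEED_OF_SOUND
    return [cmath.exp(-1j * k * math.dist(m, pos)) for m in mics]


def synth_interference(mics, interferers, n_frames, rng):
    """Input that would be observed if arbitrary sources sat at the
    non-object person positions."""
    vectors = [steering(mics, p) for p in interferers]
    frames = []
    for _ in range(n_frames):
        x = [0j] * len(mics)
        for a in vectors:
            s = complex(rng.gauss(0, 1), rng.gauss(0, 1))  # arbitrary source sample
            for m in range(len(mics)):
                x[m] += s * a[m]
        frames.append(x)
    return frames


def covariance(frames, loading):
    """Sample covariance E[x x^H] plus diagonal loading."""
    n = len(frames[0])
    R = [[0j] * n for _ in range(n)]
    for x in frames:
        for i in range(n):
            for j in range(n):
                R[i][j] += x[i] * x[j].conjugate() / len(frames)
    for i in range(n):
        R[i][i] += loading
    return R


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def design_filter(mics, target, interferers, n_frames=200, loading=1e-3, seed=0):
    """Weights w with unit response at target (the constraint) and minimum
    output power for the synthetic interference."""
    rng = random.Random(seed)
    R = covariance(synth_interference(mics, interferers, n_frames, rng), loading)
    a = steering(mics, target)
    u = solve(R, a)                                          # R^-1 a
    gain = sum(ai.conjugate() * ui for ai, ui in zip(a, u))  # a^H R^-1 a
    return [ui / gain for ui in u]


def response(w, mics, pos):
    """Complex sensitivity w^H a(pos) of the filter toward pos."""
    return sum(wi.conjugate() * ai for wi, ai in zip(w, steering(mics, pos)))


mics = [(-0.3, 0.0), (-0.1, 0.0), (0.1, 0.0), (0.3, 0.0)]
object_position = (1.0, 2.0)
other_persons = [(-1.5, 3.0), (2.5, 1.5)]
w = design_filter(mics, object_position, other_persons)
print(abs(response(w, mics, object_position)))
print([abs(response(w, mics, p)) for p in other_persons])
```

In this sketch the unit-gain constraint plays the role of the filter constraint setting means, the synthetic frames stand in for the input signal generating means, and applying `w` to real microphone frames would correspond to the voice extracting means.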
Abstract
An apparatus for detecting the position of an object includes: a signal output portion that generates a predetermined signal and radiates it into a space toward an arbitrary object; a signal input portion having a plurality of sensors that individually receive the signals reflected from the object; an impulse response calculating portion that obtains an impulse response for each sensor from the signal radiated by the signal output portion and the signals received by the plural sensors; and an object position estimating portion. The object position estimating portion assumes that the radiated signal is reflected at a virtual position set at an arbitrary point, determines the transmission time required for the signal to reach the signal input portion, and calculates a weight for the virtual position from the components of each impulse response that correspond to that transmission time. It repeats this calculation while shifting the virtual position and estimates a virtual position at which the weight exceeds a predetermined threshold value to be the position of the object.
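The estimation procedure in the abstract can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the patented implementation: propagation is free-field in two dimensions, each impulse response is an idealized train of unit spikes, and the weight of a virtual position is taken to be the plain sum of the impulse-response samples at the expected round-trip delay. The function names, sample rate, speed of sound and geometry are all hypothetical.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s (assumed)
SAMPLE_RATE = 48000     # Hz (assumed)


def path_delay_samples(emitter, point, sensor):
    """Transmission time emitter -> point -> sensor, in samples."""
    distance = math.dist(emitter, point) + math.dist(point, sensor)
    return round(distance / SPEED_OF_SOUND * SAMPLE_RATE)


def synth_impulse_response(emitter, sensor, objects, length):
    """Idealized impulse response: one unit spike per reflecting object."""
    h = [0.0] * length
    for obj in objects:
        h[path_delay_samples(emitter, obj, sensor)] += 1.0
    return h


def weight(emitter, sensors, responses, point):
    """Sum, over sensors, of the impulse-response component found at the
    delay a reflection from the virtual position would have."""
    total = 0.0
    for sensor, h in zip(sensors, responses):
        k = path_delay_samples(emitter, point, sensor)
        if k < len(h):
            total += h[k]
    return total


def estimate_positions(emitter, sensors, responses, grid, threshold):
    """Shift the virtual position over the grid and keep every point
    whose weight exceeds the threshold."""
    return [p for p in grid
            if weight(emitter, sensors, responses, p) > threshold]


emitter = (0.0, 0.0)
sensors = [(-0.3, 0.0), (-0.1, 0.0), (0.1, 0.0), (0.3, 0.0)]
objects = [(1.0, 2.0), (-1.5, 3.0)]  # two reflecting objects (made up)
responses = [synth_impulse_response(emitter, s, objects, 4096) for s in sensors]
grid = [(x * 0.5, y * 0.5) for x in range(-8, 9) for y in range(1, 11)]
found = estimate_positions(emitter, sensors, responses, grid, threshold=3.5)
print(found)
```

Because every grid point is tested independently, several objects can exceed the threshold in the same scan, which is how this sketch reflects the simultaneous detection of plural objects named in the title. With measured impulse responses the spikes would be smeared, so a practical weight would likely sum over a small window around the expected delay; that refinement is left out here.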
86 Citations
3 Claims
1. An apparatus for collecting voices, comprising:
image input means for inputting an image obtained by photographing a plurality of persons;
person position detection means for processing image information supplied from said image input means to obtain the positions of a plurality of persons;
person position selection means for selecting the position of at least one person which is a subject to be processed from the positions of the plural persons detected by said person position detection means;
voice input means for individually inputting voices through a plurality of channels;
filter constraint setting means for making one of person positions to be an object position among at least person positions selected by said person position selection means, and setting constraint for raising a sensitivity with respect to a voice from the object position as compared with other sensitivities with respect to voices from person positions which have not been selected;
input signal generating means for generating an input signal which can be observed on the assumption that a sound source signal, which has been arbitrarily generated, is disposed at a person position except for the object position;
filter determining means for determining a filter for lowering the sensitivity with respect to voices from person positions except for the object portion under the constraint set by said filter constraint setting means and in accordance with the input signal generated by said input signal generating means; and
voice extracting means for subjecting the voice input by said voice input means to a filter process by using a filter coefficient obtained by said filter determining means so as to extract the voice.
3. A method of collecting voices, comprising the steps of:
inputting an image obtained by photographing at least portions of a plurality of persons;
individually inputting voices through a plurality of channels;
processing image information supplied in said step of inputting the image to obtain the positions of a plurality of persons;
selecting the position of at least one person which is a subject to be processed from the positions of the plural persons detected in said step of detecting the person position;
determining a filter coefficient in accordance with a first signal which can be obtained owing to an observation performed on the assumption that a sound source signal, which has been generated arbitrarily, is disposed at the position of the person selected by said person position selection means and a second signal which is generated from the sound source signal in accordance with a mode selected from two modes consisting of a mode in which other sensitivities with respect to all voices from the selected person positions are simultaneously raised as compared with the sensitivities with respect to voices from person positions which have not been selected and a mode in which the sensitivity of only a voice from a specified object position is raised as compared with the sensitivities with respect to voices from person positions which have not been selected; and
extracting only the voices corresponding to the selected mode from voices input by said voice input means, said extraction being performed by using the filter coefficient determined in said step of determining the filter coefficient.
Specification