SOUND SIGNAL PROCESSING APPARATUS, SOUND SIGNAL PROCESSING METHOD, AND PROGRAM
First Claim
1. A sound signal processing apparatus comprising:
an observed signal analysis unit that receives as an observed signal a sound signal for a plurality of channels obtained by a sound signal input unit formed of a plurality of microphones placed at different positions and estimates a sound direction and a sound segment of a target sound which is sound to be extracted; and
a sound source extraction unit that receives the sound direction and sound segment of the target sound estimated by the observed signal analysis unit and extracts the sound signal for the target sound,
wherein the observed signal analysis unit includes
a short time Fourier transform unit that generates an observed signal in time-frequency domain by applying short time Fourier transform to the sound signal for the plurality of channels received; and
a direction/segment estimation unit that receives the observed signal generated by the short time Fourier transform unit and detects the sound direction and sound segment of the target sound, and
wherein the sound source extraction unit
executes iterative learning in which an extracting filter U′ is iteratively updated using a result of application of the extracting filter to the observed signal,
prepares, as a function to be applied in the iterative learning, an objective function G(U′) that assumes a local minimum or a local maximum when a value of the extracting filter U′ is a value optimal for extraction of the target sound, and
computes a value of the extracting filter U′ which is in a neighborhood of a local minimum or a local maximum of the objective function G(U′) using an auxiliary function method during the iterative learning, and applies the computed extracting filter to extract the sound signal for the target sound.
Abstract
A sound signal processing apparatus includes an observed signal analysis unit and a sound source extraction unit. The observed signal analysis unit receives, as an observed signal, a sound signal for channels obtained by a sound signal input unit formed of microphones, and estimates a sound direction and a sound segment of a target sound, which is the sound to be extracted. The sound source extraction unit receives the sound direction and sound segment of the target sound estimated by the observed signal analysis unit and extracts the sound signal for the target sound. The observed signal analysis unit includes a short time Fourier transform unit, which generates an observed signal in the time-frequency domain by applying a short time Fourier transform to the received sound signal for the channels, and a direction/segment estimation unit, which receives the observed signal generated by the short time Fourier transform unit and detects the sound direction and sound segment of the target sound.
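The short time Fourier transform step named in the abstract can be illustrated with a minimal sketch. The frame length, hop size, Hann window, array layout, and the function name stft_multichannel are assumptions made for illustration only; the text above does not specify any of them.

```python
import numpy as np

def stft_multichannel(x, frame_len=512, hop=128):
    """Illustrative STFT of a multi-channel signal (all parameters are assumed).

    x: real array of shape (n_channels, n_samples), n_samples >= frame_len.
    Returns a complex array of shape (n_channels, n_bins, n_frames),
    where n_bins = frame_len // 2 + 1.
    """
    n_ch, n_samples = x.shape
    window = np.hanning(frame_len)              # assumed analysis window
    n_frames = 1 + (n_samples - frame_len) // hop
    out = np.empty((n_ch, frame_len // 2 + 1, n_frames), dtype=complex)
    for t in range(n_frames):
        # Cut the same time frame out of every channel and window it.
        seg = x[:, t * hop : t * hop + frame_len] * window
        # One-sided FFT along the time axis gives the frequency bins.
        out[:, :, t] = np.fft.rfft(seg, axis=1)
    return out
```

Each column out[:, k, t] then holds the observed signal of all microphones for frequency bin k and frame t, which is the kind of time-frequency observation that a direction/segment estimator and an extracting filter would operate on.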
11 Claims
1. A sound signal processing apparatus comprising:
an observed signal analysis unit that receives as an observed signal a sound signal for a plurality of channels obtained by a sound signal input unit formed of a plurality of microphones placed at different positions and estimates a sound direction and a sound segment of a target sound which is sound to be extracted; and
a sound source extraction unit that receives the sound direction and sound segment of the target sound estimated by the observed signal analysis unit and extracts the sound signal for the target sound,
wherein the observed signal analysis unit includes
a short time Fourier transform unit that generates an observed signal in time-frequency domain by applying short time Fourier transform to the sound signal for the plurality of channels received; and
a direction/segment estimation unit that receives the observed signal generated by the short time Fourier transform unit and detects the sound direction and sound segment of the target sound, and
wherein the sound source extraction unit
executes iterative learning in which an extracting filter U′ is iteratively updated using a result of application of the extracting filter to the observed signal,
prepares, as a function to be applied in the iterative learning, an objective function G(U′) that assumes a local minimum or a local maximum when a value of the extracting filter U′ is a value optimal for extraction of the target sound, and
computes a value of the extracting filter U′ which is in a neighborhood of a local minimum or a local maximum of the objective function G(U′) using an auxiliary function method during the iterative learning, and applies the computed extracting filter to extract the sound signal for the target sound.
10. A sound signal processing method for execution in a sound signal processing apparatus, the method comprising:
performing, at an observed signal analysis unit, an observed signal analysis process in which a sound signal for a plurality of channels obtained by a sound signal input unit formed of a plurality of microphones disposed at different positions is received as an observed signal and a sound direction and a sound segment of a target sound which is sound to be extracted are estimated; and
performing, at a sound source extraction unit, a sound source extraction process in which the sound direction and sound segment of the target sound estimated by the observed signal analysis unit are received and the sound signal for the target sound is extracted,
wherein the observed signal analysis process includes
executing a short time Fourier transform process for generating an observed signal in time-frequency domain by applying short time Fourier transform to the sound signal for the plurality of channels received; and
executing a direction and segment estimation process for receiving the observed signal generated in the short time Fourier transform process and detecting the sound direction and sound segment of the target sound, and
wherein the sound source extraction process includes
executing iterative learning in which an extracting filter U′ is iteratively updated using a result of application of the extracting filter to the observed signal,
preparing, as a function to be applied in the iterative learning, an objective function G(U′) that assumes a local minimum or a local maximum when a value of the extracting filter U′ is a value optimal for extraction of the target sound, and
computing a value of the extracting filter U′ which is in a neighborhood of a local minimum or a local maximum of the objective function G(U′) using an auxiliary function method during the iterative learning, and applying the computed extracting filter to extract the sound signal for the target sound.
11. A program for causing a sound signal processing apparatus to execute sound signal processing, the program comprising:
causing an observed signal analysis unit to perform an observed signal analysis process for receiving as an observed signal a sound signal for a plurality of channels obtained by a sound signal input unit formed of a plurality of microphones placed at different positions and estimating a sound direction and a sound segment of a target sound which is sound to be extracted; and
causing a sound source extraction unit to perform a sound source extraction process for receiving the sound direction and sound segment of the target sound estimated by the observed signal analysis unit and extracting the sound signal for the target sound,
wherein the observed signal analysis process includes
executing a short time Fourier transform process for generating an observed signal in time-frequency domain by applying short time Fourier transform to the sound signal for the plurality of channels received; and
executing a direction and segment estimation process for receiving the observed signal generated in the short time Fourier transform process and detecting the sound direction and sound segment of the target sound, and
wherein the sound source extraction process includes
executing iterative learning in which an extracting filter U′ is iteratively updated using a result of application of the extracting filter to the observed signal,
preparing, as a function to be applied in the iterative learning, an objective function G(U′) that assumes a local minimum or a local maximum when a value of the extracting filter U′ is a value optimal for extraction of the target sound, and
computing a value of the extracting filter U′ which is in a neighborhood of a local minimum or a local maximum of the objective function G(U′) using an auxiliary function method during the iterative learning, and applying the computed extracting filter to extract the sound signal for the target sound.
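The claims name an auxiliary function method but, in the text reproduced here, do not give the form of the objective function G(U′). The following is therefore only an illustrative sketch under stated assumptions, not the claimed method: it assumes a single frequency bin, a whitened observation Z, a unit-norm filter w, and the objective G(w) = mean over frames of |w^H z_t|. The function names whiten and extract_filter_mm, the initial filter, and the objective itself are hypothetical. The sketch uses the standard bound |y| <= |y|^2/(2b) + b/2 for b > 0, with equality at b = |y|, as the auxiliary function, so each iteration reduces to an eigenvalue problem and the objective never increases.

```python
import numpy as np

def whiten(X):
    """Decorrelate channels so that the covariance of the result is identity.

    X: complex array (n_channels, n_frames) for one frequency bin (assumed layout).
    """
    T = X.shape[1]
    R = X @ X.conj().T / T                    # spatial covariance
    d, E = np.linalg.eigh(R)                  # R = E diag(d) E^H
    P = (E / np.sqrt(d)).conj().T             # P = diag(d)^(-1/2) E^H
    return P @ X

def extract_filter_mm(Z, n_iter=20, eps=1e-12):
    """Assumed objective: G(w) = mean_t |w^H z_t| subject to ||w|| = 1.

    Auxiliary function: Q(w, b) = mean_t (|w^H z_t|^2 / (2 b_t) + b_t / 2),
    which upper-bounds G(w) and touches it when b_t = |w^H z_t|.
    """
    n_ch, T = Z.shape
    w = np.zeros(n_ch, dtype=complex)
    w[0] = 1.0                                # assumed initial filter
    history = []
    for _ in range(n_iter):
        y = w.conj() @ Z                      # apply the current filter
        history.append(float(np.mean(np.abs(y))))
        b = np.maximum(np.abs(y), eps)        # auxiliary variables
        V = (Z / b) @ Z.conj().T / T          # weighted covariance
        _, vecs = np.linalg.eigh(V)           # eigenvalues in ascending order
        w = vecs[:, 0]                        # minimizes w^H V w with ||w|| = 1
    history.append(float(np.mean(np.abs(w.conj() @ Z))))
    return w, history
```

Applying the result as y = w.conj() @ Z gives the filtered output for that bin. In a full system, the initial filter and the weighting would presumably be tied to the estimated sound direction and segment, which this sketch leaves out.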
Specification