Location estimation of active speaker
First Claim
1. A method of performing an estimation of a location of an active speaker in real time, the method comprising:
designating any one microphone of an array of microphones as a reference microphone;
storing a relative transfer function (RTF) for each microphone of the array of microphones other than the reference microphone associated with each potential location among potential locations as a set of stored RTFs;
obtaining a voice sample of the active speaker and obtaining a speaker RTF for each microphone of the array of microphones other than the reference microphone;
performing an RTF projection of the speaker RTF for each microphone on the set of stored RTFs; and
determining, using a processor, one of the potential locations as the location of the active speaker based on the performing the RTF projection, wherein the obtaining the speaker RTF for each microphone of the array of microphones other than the reference microphone includes computing, for each of the potential locations, a ratio of an acoustic transfer function of the voice sample at the microphone to an acoustic transfer function of the voice sample at the reference microphone.
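The ratio recited in the final clause of the claim can be written compactly. The notation below is illustrative and does not appear in the claim: H_m(f; p) is assumed to denote the acoustic transfer function from potential location p to microphone m at frequency f, and "ref" denotes the reference microphone.

```latex
\mathrm{RTF}_m(f; p) \;=\; \frac{H_m(f; p)}{H_{\mathrm{ref}}(f; p)},
\qquad m \neq \mathrm{ref}
```

Under this notation an array of M microphones yields M - 1 RTFs per potential location, since the reference microphone's own ratio is identically 1 and carries no information.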
Abstract
A system and method to perform an estimation of a location of an active speaker in real time includes designating a microphone of an array of microphones as a reference microphone. The method includes storing a relative transfer function (RTF) for each microphone of the array of microphones other than the reference microphone associated with each potential location among potential locations as a set of stored RTFs, and obtaining a voice sample of the active speaker and obtaining a speaker RTF for each microphone of the array of microphones other than the reference microphone. The method also includes performing an RTF projection of the speaker RTF for each microphone on the set of stored RTFs, and determining one of the potential locations as the location of the active speaker based on the performing the RTF projection.
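The steps in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patented implementation: the text does not define how the "RTF projection" is computed, so the sketch assumes it is a normalized inner product between the speaker RTF vector and each stored RTF vector. The function names, array shapes, and the noise-free synthetic data are all hypothetical.

```python
import numpy as np


def relative_transfer_functions(spectra, ref=0):
    """Divide each non-reference microphone's spectrum by the reference's.

    spectra: complex array of shape (n_mics, n_freqs).
    Returns an array of shape (n_mics - 1, n_freqs).
    """
    spectra = np.asarray(spectra, dtype=complex)
    others = np.delete(spectra, ref, axis=0)  # drop the reference microphone row
    return others / spectra[ref]              # broadcast the ratio over rows


def estimate_location(speaker_rtf, stored_rtfs):
    """Pick the stored location whose RTFs best match the speaker RTF.

    ASSUMPTION: "projection" is modeled as the magnitude of the normalized
    inner product between flattened RTF vectors.
    stored_rtfs: complex array of shape (n_locations, n_mics - 1, n_freqs).
    """
    s = speaker_rtf.ravel()
    d = stored_rtfs.reshape(stored_rtfs.shape[0], -1)
    scores = np.abs(d.conj() @ s) / (np.linalg.norm(d, axis=1) * np.linalg.norm(s))
    return int(np.argmax(scores)), scores


# Synthetic, noise-free demo data (hypothetical values).
rng = np.random.default_rng(0)
n_locations, n_mics, n_freqs = 4, 3, 64
shape = (n_locations, n_mics, n_freqs)
location_atfs = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

# Stored set: one RTF per non-reference microphone per potential location.
stored_rtfs = np.stack([relative_transfer_functions(a) for a in location_atfs])

# A speaker at location 2: each microphone spectrum is the ATF times the voice spectrum.
true_location = 2
voice = rng.standard_normal(n_freqs) + 1j * rng.standard_normal(n_freqs)
mic_spectra = location_atfs[true_location] * voice

# The voice spectrum cancels in the ratio, leaving the ratio of ATFs.
speaker_rtf = relative_transfer_functions(mic_spectra)
best, scores = estimate_location(speaker_rtf, stored_rtfs)
print(best, np.round(scores, 3))
```

Because the voice spectrum cancels when each microphone's spectrum is divided by the reference microphone's, the speaker RTF depends only on the room and the speaker position, which is what makes matching against a stored set possible. With real recordings the ratio would have to be estimated in the presence of noise, which this sketch does not address.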
18 Claims
1. A method of performing an estimation of a location of an active speaker in real time, the method comprising:
designating any one microphone of an array of microphones as a reference microphone;
storing a relative transfer function (RTF) for each microphone of the array of microphones other than the reference microphone associated with each potential location among potential locations as a set of stored RTFs;
obtaining a voice sample of the active speaker and obtaining a speaker RTF for each microphone of the array of microphones other than the reference microphone;
performing an RTF projection of the speaker RTF for each microphone on the set of stored RTFs; and
determining, using a processor, one of the potential locations as the location of the active speaker based on the performing the RTF projection, wherein the obtaining the speaker RTF for each microphone of the array of microphones other than the reference microphone includes computing, for each of the potential locations, a ratio of an acoustic transfer function of the voice sample at the microphone to an acoustic transfer function of the voice sample at the reference microphone.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A system to estimate a location of an active speaker, the system comprising:
a memory device configured to store a relative transfer function (RTF) for each microphone of an array of microphones other than a reference microphone associated with each potential location among potential locations as a set of stored RTFs, wherein the reference microphone is any one of the array of microphones; and
a processor configured to obtain a voice sample of the active speaker and obtain a speaker RTF for each microphone of the array of microphones other than the reference microphone, perform an RTF projection of the speaker RTF for each microphone on the set of stored RTFs, and determine one of the potential locations as the location of the active speaker based on the RTF projection, wherein the processor obtains the speaker RTF for each microphone of the array of microphones other than the reference microphone based on computing, for each of the potential locations, a ratio of an acoustic transfer function of the voice sample at the microphone to an acoustic transfer function of the voice sample at the reference microphone.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18
Specification