Methods and apparatus for targeted sound detection and characterization
Abstract
Targeted sound detection methods and apparatus are disclosed. A microphone array has two or more microphones M0 . . . MM. Each microphone is coupled to a plurality of filters. The filters are configured to filter input signals corresponding to sounds detected by the microphones thereby generating a filtered output. One or more sets of filter parameters for the plurality of filters are pre-calibrated to determine one or more corresponding pre-calibrated listening zones. Each set of filter parameters is selected to detect portions of the input signals corresponding to sounds originating within a given listening zone and filter out sounds originating outside the given listening zone. A particular pre-calibrated listening zone is selected at a runtime by applying to the plurality of filters a set of filter coefficients corresponding to the particular pre-calibrated listening zone. As a result, the microphone array may detect sounds originating within the particular listening zone and filter out sounds originating outside the particular listening zone. Sounds are detected with the microphone array. A particular listening zone containing a source of the sound is identified. The sound or the source of the sound is characterized and the sound is emphasized or filtered out depending on how the sound is characterized.
54 Claims
-
1. A method for targeted sound detection using a microphone array having two or more microphones M0 . . . MM, each microphone being coupled to a plurality of filters, the filters being configured to filter input signals corresponding to sounds detected by the microphones thereby generating a filtered output, the method comprising:
pre-calibrating a plurality of sets of filter parameters for the plurality of filters to determine a corresponding plurality of pre-calibrated listening zones, wherein each set of filter parameters is selected to detect portions of the input signals corresponding to sounds originating within a given listening zone and filter out sounds originating outside the given listening zone; and
selecting a particular pre-calibrated listening zone at a runtime by applying to the plurality of filters sets of filter parameters corresponding to two or more different pre-calibrated listening zones, determining a value of an attenuation of the input signals for the two or more different pre-calibrated listening zones and selecting a particular zone of the two or more different pre-calibrated listening zones for which the attenuation is closest to an optimum value, and applying the filter parameters for the particular zone to the plurality of filters, whereby the microphone array may detect sounds originating within the particular listening zone and filter out sounds originating outside the particular listening zone;
wherein the one or more pre-calibrated listening zones include a plurality of different pre-calibrated listening zones, the method further comprising:
detecting a sound with the microphone array;
identifying a particular pre-calibrated listening zone containing a source of the sound;
characterizing the sound or the source of the sound; and
emphasizing or filtering out the sound depending on how the sound is characterized.
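The runtime selection step of claim 1 can be illustrated with a minimal Python sketch. It is not taken from the patent: it assumes one FIR filter per microphone per zone, sums the filtered channels into a single output, and defines attenuation as the ratio of input energy to output energy in decibels. The names `fir_filter`, `attenuation_db`, `select_zone`, and the `optimum_db` parameter are hypothetical.

```python
import math

def fir_filter(signal, taps):
    """Direct-form FIR filter; output has the same length as the input."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, b in enumerate(taps):
            if n - k >= 0:
                acc += b * signal[n - k]
        out.append(acc)
    return out

def energy(signal):
    return sum(x * x for x in signal)

def attenuation_db(mic_signals, zone_taps):
    """Assumed definition: total input energy over filtered-output energy, in dB."""
    filtered = [fir_filter(s, t) for s, t in zip(mic_signals, zone_taps)]
    output = [sum(column) for column in zip(*filtered)]  # sum across microphones
    eps = 1e-12  # avoids log(0) and division by zero
    e_in = sum(energy(s) for s in mic_signals)
    return 10.0 * math.log10((e_in + eps) / (energy(output) + eps))

def select_zone(mic_signals, zone_filter_sets, optimum_db=0.0):
    """Try every pre-calibrated filter set; keep the zone whose attenuation
    is closest to the (assumed) optimum value."""
    attn = {zone: attenuation_db(mic_signals, taps)
            for zone, taps in zone_filter_sets.items()}
    best = min(attn, key=lambda zone: abs(attn[zone] - optimum_db))
    return best, attn

# Two microphones receiving the same signal; zone "A" adds the channels,
# zone "B" subtracts them, so "B" cancels the source almost completely.
source = [math.sin(0.3 * n) for n in range(64)]
mics = [source, source]
zones = {"A": [[0.5], [0.5]], "B": [[0.5], [-0.5]]}
best_zone, attenuations = select_zone(mics, zones)
```

In this toy setup zone "A" passes the source with little attenuation while zone "B" cancels it, so "A" is selected. A real system would obtain the filter sets from the pre-calibration step rather than writing them by hand.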
-
-
2. The method of claim 1 wherein pre-calibrating a plurality of sets of the filter parameters includes using blind source separation to determine sets of finite impulse response (FIR) filter parameters.
-
3. The method of claim 1 wherein the one or more listening zones includes a listening zone that corresponds to a field of view of an image capture unit, whereby the microphone array may detect sounds originating within the field of view of the image capture unit and filter out sounds originating outside the field of view of the image capture unit.
-
4. The method of claim 1 wherein the plurality of pre-calibrated listening zones includes about 18 sectors, wherein each sector has an angular width of about 20 degrees, whereby the plurality of pre-calibrated sectors encompasses about 360 degrees surrounding the microphone array.
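The sector layout of claim 4 works out to 18 x 20 = 360 degrees. The following Python sketch maps a bearing to a sector index under the assumptions that the sectors are equal in width, do not overlap, and that sector 0 starts at 0 degrees; the claim itself only says "about" 18 sectors of "about" 20 degrees, and the function names are hypothetical.

```python
NUM_SECTORS = 18
SECTOR_WIDTH_DEG = 360.0 / NUM_SECTORS  # 20 degrees per sector

def sector_index(angle_deg):
    """Index of the sector containing a bearing given in degrees."""
    return int((angle_deg % 360.0) // SECTOR_WIDTH_DEG)

def sector_bounds(index):
    """Start (inclusive) and end (exclusive) bearing of a sector, in degrees."""
    start = index * SECTOR_WIDTH_DEG
    return start, start + SECTOR_WIDTH_DEG
```

Negative bearings and bearings of 360 degrees or more wrap around, so a source at -10 degrees falls in the last sector.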
-
5. The method of claim 1 wherein selecting a particular pre-calibrated listening zone at a runtime includes selecting a pre-calibrated listening zone that contains a source of sound.
-
6. The method of claim 1 wherein selecting a particular pre-calibrated listening zone at a runtime includes selecting an initial zone of a plurality of listening zones;
determining whether a source of sound lies within the initial zone or on a particular side of the initial zone; and
if the source of sound does not lie within the initial zone, selecting a different listening zone on the particular side of the initial zone, wherein the different listening zone is characterized by an attenuation of the input signals that is closest to an optimum value.
-
7. The method of claim 6 wherein determining whether a source of sound lies within the initial zone or on a particular side of the initial zone includes calculating from the input signals and the output signal an attenuation of the input signals and comparing the attenuation to the optimum value.
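The search described in claims 6 and 7 can be sketched as follows. The claims do not say how the side is derived from the attenuation, so this Python sketch takes the side test as a caller-supplied function. Everything here is an assumption made for illustration: the name `refine_zone`, the `tolerance_db` test for "within the initial zone", the circular arrangement of 18 zones, and the limit of searching half the circle.

```python
def refine_zone(initial, attenuation_of, side_of,
                num_zones=18, optimum_db=0.0, tolerance_db=3.0):
    """Return the initial zone if its attenuation is near the optimum,
    otherwise the best zone on the indicated side.

    attenuation_of(zone) -> attenuation in dB for that zone (hypothetical)
    side_of(zone)        -> +1 or -1, the side on which the source lies (hypothetical)
    """
    if abs(attenuation_of(initial) - optimum_db) <= tolerance_db:
        return initial  # source is treated as lying within the initial zone
    step = side_of(initial)
    # Candidate zones on the indicated side, up to half way around the array.
    candidates = [(initial + step * k) % num_zones
                  for k in range(1, num_zones // 2 + 1)]
    return min(candidates, key=lambda z: abs(attenuation_of(z) - optimum_db))
```

For example, if the attenuation grows by 2 dB per zone of distance from zone 5, starting at zone 2 with the source on the positive side returns zone 5.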
-
8. The method of claim 1 wherein selecting a particular pre-calibrated listening zone at a runtime includes determining whether, for a given listening zone, an attenuation of the input signals is below a threshold.
-
9. The method of claim 1 wherein selecting a particular pre-calibrated listening zone at a runtime includes selecting a pre-calibrated listening zone that contains a source of sound, the method further comprising robotically pointing an image capture unit toward the pre-calibrated listening zone that contains the source of sound.
-
10. The method of claim 1 wherein emphasizing or filtering out the sound depending on how the sound is characterized includes filtering out the sound if the sound or the source is associated with background noise.
-
11. The method of claim 1 wherein characterizing the sound or the source of the sound includes:
-
determining a frequency distribution for the sound; and comparing the frequency distribution against one or more acoustic models for known sounds or sources of sounds.
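One way to read claim 11 is sketched below in Python. The claim does not specify the frequency analysis or the comparison metric; this sketch assumes a normalized DFT magnitude spectrum as the frequency distribution and cosine similarity as the comparison. The names `magnitude_spectrum`, `characterize`, and the `min_score` threshold are hypothetical.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Normalized DFT magnitudes for bins 0..N/2 (plain DFT, fine for short frames)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    total = sum(mags) or 1.0  # guard against an all-zero frame
    return [m / total for m in mags]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def characterize(frame, models, min_score=0.8):
    """Return (label, score) for the best-matching acoustic model,
    or (None, score) when no model matches well enough."""
    dist = magnitude_spectrum(frame)
    label, score = max(((name, cosine_similarity(dist, model))
                        for name, model in models.items()),
                       key=lambda pair: pair[1])
    return (label if score >= min_score else None), score

def tone(bin_index, n=64, amplitude=1.0):
    """Test tone whose energy sits in a single DFT bin."""
    return [amplitude * math.sin(2 * math.pi * bin_index * t / n) for t in range(n)]

# Toy "acoustic models": spectra of a low tone and a high tone.
models = {"low": magnitude_spectrum(tone(4)), "high": magnitude_spectrum(tone(20))}
label, score = characterize(tone(4, amplitude=0.5), models)
```

Because the spectrum is normalized, a quieter copy of the low tone still matches the "low" model. A sound that matches no model returns `None`, which a caller could treat as uncharacterized.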
-
-
12. The method of claim 1 wherein characterizing the sound or the source of the sound includes analyzing the sound to determine whether or not the sound or source of sound has one or more predetermined characteristics.
-
13. The method of claim 12 further comprising generating at least one control signal for the purpose of controlling at least one aspect of an electronic device if it is determined that the sound does have one or more predetermined characteristics.
-
14. The method of claim 13 wherein the electronic device is a video game controller and the control signal causes the video game controller to execute game instructions in response to sounds from the source of sound.
-
15. The method of claim 1 wherein emphasizing or filtering out the sound depending on how the sound is characterized includes:
magnifying a noise disturbance of the audio signal relative to a remaining component of the audio signal;
decreasing a sampling rate of the audio signal;
applying an even order derivative to the audio signal having the decreased sampling rate to define a detection signal; and
adjusting the noise disturbance of the audio signal according to a statistical average of the detection signal.
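The four steps of claim 15 can be illustrated with the Python sketch below. Each step is one possible interpretation and not the patent's stated method: the magnification is assumed to be a scaled first difference added to the signal, the rate reduction is plain decimation without an anti-alias filter, the even-order derivative is a second-order difference, and the statistical average is the mean absolute value of the detection signal, used as a threshold. The function name and all parameters are hypothetical.

```python
import math

def suppress_disturbance(audio, factor=2, gain=4.0, k=3.0, scale=0.1):
    """Attenuate short disturbances (for example clicks) in a non-empty sample list."""
    # 1. Magnify the disturbance: emphasize sample-to-sample jumps.
    emphasized = [audio[0]] + [
        audio[n] + gain * (audio[n] - audio[n - 1]) for n in range(1, len(audio))
    ]
    # 2. Decrease the sampling rate: keep every `factor`-th sample.
    reduced = emphasized[::factor]
    # 3. Even-order derivative: second-order difference as the detection signal.
    detection = [0.0] * len(reduced)
    for i in range(1, len(reduced) - 1):
        detection[i] = reduced[i - 1] - 2.0 * reduced[i] + reduced[i + 1]
    # 4. Statistical average: mean absolute detection value sets the threshold.
    mean_abs = sum(abs(d) for d in detection) / len(detection)
    flagged = set()
    for i, d in enumerate(detection):
        if abs(d) > k * mean_abs:
            centre = i * factor  # map back to the original sample index
            flagged.update(range(max(0, centre - factor),
                                 min(len(audio), centre + factor + 1)))
    cleaned = [x * scale if n in flagged else x for n, x in enumerate(audio)]
    return cleaned, detection

# A quiet, slowly varying signal with a single loud click at sample 40.
audio = [0.1 * math.sin(0.05 * n) for n in range(100)]
audio[40] = 5.0
cleaned, detection = suppress_disturbance(audio)
```

In this example the click and its immediate neighbours are scaled down while samples far from the click are left unchanged.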
-
-
16. The method of claim 1 wherein the electronic device is a baby monitor.
-
17. The method of claim 1 wherein the electronic device is a video game unit having a joystick controller, the method further comprising generating at least one control signal for the purpose of controlling at least one aspect of the video game unit if it is determined that the sound or the source of sound has one or more predetermined characteristics; and
generating one or more additional control signals with the joystick controller.
-
18. The method of claim 17 wherein generating one or more additional control signals with the joystick controller includes generating an optical signal with one or more light sources located on the joystick controller and receiving the optical signal with an image capture unit.
-
19. The method of claim 18 wherein receiving an optical signal includes capturing one or more images containing one or more light sources and analyzing the one or more images to determine a position or an orientation of the joystick controller and/or decode a telemetry signal from the joystick controller.
-
20. The method of claim 17, wherein generating one or more additional control signals with the joystick controller includes generating a position and/or orientation signal with an inertial sensor located on the joystick controller.
-
21. The method of claim 20, further comprising compensating for a drift in a position and/or orientation determined from the position and/or orientation signal.
-
22. The method of claim 21 wherein compensating for a drift includes setting a value of an initial position to a value of a current calculated position determined from the position and/or orientation signal.
-
23. The method of claim 21 wherein compensating for a drift includes capturing an image of the joystick controller with an image capture unit, analyzing the image to determine a position of the joystick controller and setting a current value of the position of the joystick controller to the position of the joystick controller determined from analyzing the image.
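The two drift-compensation options in claims 22 and 23 can be sketched in Python for a single axis. This is a simplified illustration, not the patent's implementation: position is obtained by integrating acceleration twice, and the class name, method names, and the choice to clear the accumulated velocity on reset are all assumptions.

```python
class InertialTracker:
    """One-axis position estimate from accelerometer samples (illustrative only)."""

    def __init__(self, initial_position=0.0):
        self.initial_position = initial_position
        self.velocity = 0.0
        self.displacement = 0.0  # integrated motion since the last reset

    @property
    def position(self):
        return self.initial_position + self.displacement

    def update(self, acceleration, dt):
        """Integrate one accelerometer sample; sensor bias accumulates as drift."""
        self.velocity += acceleration * dt
        self.displacement += self.velocity * dt
        return self.position

    def rezero(self):
        """Claim 22 style: the current calculated position becomes the new
        initial position, and the integrators restart from zero (assumption)."""
        self.initial_position = self.position
        self.velocity = 0.0
        self.displacement = 0.0

    def set_from_image(self, image_position):
        """Claim 23 style: overwrite the position with one determined by
        analyzing a captured image of the controller."""
        self.initial_position = image_position
        self.velocity = 0.0
        self.displacement = 0.0
```

Calling `rezero()` leaves the reported position unchanged but discards the accumulated integration state, while `set_from_image()` replaces the position with an independent optical measurement.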
-
24. The method of claim 21, further comprising compensating for spurious data in a signal from the inertial sensor.
-
25. A targeted sound detection apparatus, comprising:
a microphone array having two or more microphones M0 . . . MM;
a plurality of filters coupled to each microphone, the filters being configured to filter input signals corresponding to sounds detected by the microphones and generate a filtered output;
a processor coupled to the microphone array and the plurality of filters;
a memory coupled to the processor;
a plurality of sets of the filter parameters embodied in the memory, corresponding to a plurality of pre-calibrated listening zones, wherein each set of filter parameters is selected to detect portions of the input signals corresponding to sounds originating within a corresponding listening zone and filter out sounds originating outside the corresponding listening zone;
the memory containing a set of processor executable instructions that, when executed, cause the apparatus to select a particular pre-calibrated listening zone at a runtime by applying to the plurality of filters sets of filter parameters corresponding to two or more different pre-calibrated listening zones, determining a value of an attenuation of the input signals for the two or more different pre-calibrated listening zones and selecting a particular zone of the two or more different pre-calibrated listening zones for which the attenuation is closest to an optimum value, and applying the filter parameters for the particular zone to the plurality of filters, whereby the apparatus may detect sounds originating within the particular pre-calibrated listening zone and filter out sounds originating outside the particular pre-calibrated listening zone;
wherein the set of processor executable instructions includes one or more instructions which, when executed, cause the apparatus to:
detect a sound with the microphone array;
identify a particular listening zone containing a source of the sound;
characterize the sound or the source of the sound; and
emphasize or filter out the sound depending on how the sound is characterized.
-
-
26. The apparatus of claim 25 wherein the plurality of pre-calibrated listening zones includes about 18 sectors, wherein each sector has an angular width of about 20 degrees, whereby the plurality of pre-calibrated sectors encompasses about 360 degrees surrounding the microphone array.
-
27. The apparatus of claim 25 wherein the set of processor executable instructions includes one or more instructions which, when executed, cause the apparatus to select a pre-calibrated listening zone that contains a source of sound.
-
28. The apparatus of claim 25 wherein the set of processor executable instructions includes one or more instructions which, when executed, cause the apparatus to determine whether a source of sound lies within an initial listening zone or on a particular side of the initial listening zone; and
if the source of sound does not lie within the initial listening zone, select a different listening zone on the particular side of the initial listening zone, wherein the different listening zone is characterized by an attenuation of the input signals that is closest to an optimum value.
-
29. The apparatus of claim 28, wherein the one or more instructions which, when executed, cause the apparatus to determine whether a source of sound lies within the initial listening zone or on a particular side of the initial listening zone include one or more instructions which, when executed, calculate from the input signals and the output signal an attenuation of the input signals and compare the attenuation to the optimum value.
-
30. The apparatus of claim 25 wherein the set of processor executable instructions includes one or more instructions which, when executed, cause the apparatus to determine whether, for a given listening zone, an attenuation of the input signals is below a threshold.
-
31. The apparatus of claim 25, further comprising an image capture unit coupled to the processor, wherein the one or more listening zones include a listening zone that corresponds to a field of view of the image capture unit.
-
32. The apparatus of claim 25, further comprising an image capture unit coupled to the processor, and one or more pointing actuators coupled to the processor, the pointing actuators being adapted to point the image capture unit in a viewing direction in response to signals generated by the processor, the memory containing a set of processor executable instructions that, when executed, cause the actuators to point the image capture unit in a direction of the particular pre-calibrated listening zone.
-
33. The apparatus of claim 25 wherein the set of processor executable instructions includes instructions which, when executed, cause the apparatus to filter out the sound if the sound or the source is associated with background noise.
-
34. The apparatus of claim 25 wherein the instructions that cause the apparatus to characterize the sound or the source of the sound include instructions which, when executed, cause the apparatus to:
-
determine a frequency distribution for the sound; and compare the frequency distribution against one or more acoustic models for known sounds or sources of sounds.
-
-
35. The apparatus of claim 34 wherein the one or more acoustic models are stored in the memory.
-
36. The apparatus of claim 25 wherein the instructions that cause the apparatus to characterize the sound or the source of the sound include instructions which, when executed, cause the apparatus to analyze the sound to determine whether or not it has one or more predetermined characteristics.
-
37. The apparatus of claim 36 wherein the set of processor executable instructions further includes one or more instructions which, when executed, cause the apparatus to generate at least one control signal for the purpose of controlling at least one aspect of the apparatus if it is determined that the sound does have one or more predetermined characteristics.
-
38. The apparatus of claim 37 wherein the apparatus is a video game controller and the control signal causes the video game controller to execute game instructions in response to sounds from the source of sound.
-
39. The apparatus of claim 25 wherein the apparatus is a baby monitor.
-
40. The apparatus of claim 25, further comprising a joystick controller coupled to the processor.
-
41. The apparatus of claim 40 wherein the joystick controller includes an inertial sensor coupled to the processor.
-
42. The apparatus of claim 41 wherein the inertial sensor includes an accelerometer or gyroscope.
-
43. The apparatus of claim 41 wherein signals from the inertial sensor and signals generated from the image capture unit from tracking one or more light sources mounted to the joystick controller are used as inputs to a game system.
-
44. The apparatus of claim 41 wherein the processor executable instructions include one or more instructions which, when executed, compensate for spurious data in a signal from the inertial sensor.
-
45. The apparatus of claim 41 wherein the processor executable instructions include one or more instructions which, when executed, compensate for a drift in a position and/or orientation determined from a position and/or orientation signal from the inertial sensor.
-
46. The apparatus of claim 45 wherein compensating for a drift includes setting a value of an initial position to a value of a current calculated position determined from the position and/or orientation signal.
-
47. The apparatus of claim 45 wherein compensating for a drift includes capturing an image of the joystick controller with an image capture unit, analyzing the image to determine a position of the joystick controller and setting a current value of the position of the joystick controller to the position of the joystick controller determined from analyzing the image.
-
48. The apparatus of claim 40 wherein the joystick controller includes one or more light sources, the apparatus further comprising an image capture unit, wherein the processor executable instructions include one or more instructions which, when executed, cause the image capture unit to monitor a field of view in front of the image capture unit, identify the light source within the field of view;
detect a change in light emitted from the light source; and
in response to detecting the change, trigger an input command to the processor.
-
49. The apparatus of claim 40 wherein the joystick controller includes one or more light sources, the apparatus further comprising an image capture unit, wherein the processor executable instructions include one or more instructions which, when executed, cause the image capture unit to capture one or more images containing the light sources and analyze the image to determine a position or an orientation of the joystick controller and/or decode a telemetry signal from the joystick controller.
-
50. The apparatus of claim 49 wherein the light sources include two or more light sources in a linear array.
-
51. The apparatus of claim 49 wherein the light sources include a rectangular or arcuate configuration of a plurality of light sources.
-
52. The apparatus of claim 49 wherein the light sources are disposed on two or more different sides of the joystick controller to facilitate viewing of the light sources by the image capture unit.
-
53. The apparatus of claim 49, further comprising an inertial sensor mounted to the joystick controller, wherein a signal from the inertial sensor provides part of a tracking information input and signals generated from the image capture unit from tracking the one or more light sources provide another part of the tracking information input.
-
54. A computer-readable medium having embodied therein computer executable instructions for performing a method for targeted sound detection using a microphone array having two or more microphones M0 . . . MM, each microphone being coupled to a plurality of filters, the filters being configured to filter input signals corresponding to sounds detected by the microphones thereby generating a filtered output, the method comprising:
pre-calibrating a plurality of sets of filter parameters for the plurality of filters to determine a corresponding plurality of pre-calibrated listening zones, wherein each set of filter parameters is selected to detect portions of the input signals corresponding to sounds originating within a corresponding listening zone and filter out sounds originating outside the corresponding listening zone; and
selecting a particular pre-calibrated listening zone at a runtime by applying to the plurality of filters sets of filter parameters corresponding to two or more different pre-calibrated listening zones, determining a value of an attenuation of the input signals for the two or more different pre-calibrated listening zones and selecting a particular zone of the two or more different pre-calibrated listening zones for which the attenuation is closest to an optimum value, and applying the filter parameters for the particular zone to the plurality of filters, whereby the microphone array may detect sounds originating within the particular listening zone and filter out sounds originating outside the particular listening zone;
wherein the one or more pre-calibrated listening zones include a plurality of different listening zones, wherein the set of processor executable instructions includes one or more instructions which, when executed, cause the apparatus to:
detect a sound with the microphone array;
identify a particular listening zone containing a source of the sound;
characterize the sound or the source of the sound; and
emphasize or filter out the sound depending on how the sound is characterized.
-
Specification