Methods and apparatus for targeted sound detection and characterization

US 8,073,157 B2
Filed: 05/04/2006
Issued: 12/06/2011
Est. Priority Date: 08/27/2003
Status: Active Grant

Abstract

Targeted sound detection methods and apparatus are disclosed. A microphone array has two or more microphones M₀. . . M_M. Each microphone is coupled to a plurality of filters. The filters are configured to filter input signals corresponding to sounds detected by the microphones thereby generating a filtered output. One or more sets of filter parameters for the plurality of filters are pre-calibrated to determine one or more corresponding pre-calibrated listening zones. Each set of filter parameters is selected to detect portions of the input signals corresponding to sounds originating within a given listening zone and filter out sounds originating outside the given listening zone. A particular pre-calibrated listening zone is selected at a runtime by applying to the plurality of filters a set of filter coefficients corresponding to the particular pre-calibrated listening zone. As a result, the microphone array may detect sounds originating within the particular listening sector and filter out sounds originating outside the particular listening zone. Sounds are detected with the microphone array. A particular listening zone containing a source of the sound is identified. The sound or the source of the sound is characterized and the sound is emphasized or filtered out depending on how the sound is characterized.

54 Claims

1. A method for targeted sound detection using a microphone array having two or more microphones M₀. . . M_M, each microphone being coupled to a plurality of filters, the filters being configured to filter input signals corresponding to sounds detected by the microphones thereby generating a filtered output, the method comprising:
- pre-calibrating a plurality of sets of filter parameters for the plurality of filters to determine a corresponding plurality of pre-calibrated listening zones, wherein each set of filter parameters is selected to detect portions of the input signals corresponding to sounds originating within a given listening zone and filter out sounds originating outside the given listening zone; and
- selecting a particular pre-calibrated listening zone at a runtime by applying to the plurality of filters sets of filter parameters corresponding to two or more different pre-calibrated listening zones, determining a value of an attenuation of the input signals for the two or more different pre-calibrated listening zones and selecting a particular zone of the two or more different pre-calibrated listening zones for which the attenuation is closest to an optimum value, and applying the filter parameters for the particular zone to the plurality of filters, whereby the microphone array may detect sounds originating within the particular listening zone and filter out sounds originating outside the particular listening zone;
- wherein the one or more pre-calibrated listening zones include a plurality of different pre-calibrated listening zones, the method further comprising:
- detecting a sound with the microphone array;
- identifying a particular pre-calibrated listening zone containing a source of the sound;
- characterizing the sound or the source of the sound; and
- emphasizing or filtering out the sound depending on how the sound is characterized.
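
The runtime selection step of claim 1 can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the dB definition of attenuation (mean input energy over output energy), the default `optimum_db` of 0, and the toy `weighted_sum` stand-in for the filter bank are all assumptions made for illustration.

```python
import math

def energy(signal):
    """Sum of squared samples."""
    return sum(s * s for s in signal)

def attenuation_db(inputs, output):
    """Assumed definition: mean per-microphone input energy over output energy, in dB."""
    mean_in = sum(energy(channel) for channel in inputs) / len(inputs)
    return 10.0 * math.log10((mean_in + 1e-12) / (energy(output) + 1e-12))

def weighted_sum(inputs, gains):
    """Hypothetical stand-in for the filter bank: one gain per microphone, then sum."""
    return [sum(g * x for g, x in zip(gains, frame)) for frame in zip(*inputs)]

def select_zone(inputs, zone_params, apply_filters, optimum_db=0.0):
    """Apply each zone's parameters and keep the zone whose attenuation is closest to optimum_db."""
    best_zone, best_gap = None, float("inf")
    for zone, params in zone_params.items():
        output = apply_filters(inputs, params)
        gap = abs(attenuation_db(inputs, output) - optimum_db)
        if gap < best_gap:
            best_zone, best_gap = zone, gap
    return best_zone

# Two microphones receiving the same signal; zone "A" passes it, zone "B" cancels it.
inputs = [[1.0, -1.0, 1.0, -1.0], [1.0, -1.0, 1.0, -1.0]]
zones = {"A": [0.5, 0.5], "B": [0.5, -0.5]}
chosen = select_zone(inputs, zones, weighted_sum)
```

With an assumed optimum of 0 dB (least attenuation), the sketch picks zone "A"; a real system would substitute the pre-calibrated filter sets for `weighted_sum` and its own optimum value.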

2. The method of claim 1 wherein pre-calibrating a plurality of sets of the filter parameters includes using blind source separation to determine sets of finite impulse response (FIR) filter parameters.
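
As a rough illustration of how a pre-calibrated set of FIR parameters might be applied, the following Python sketch filters each microphone signal with its own tap vector and sums the results. It does not perform blind source separation; it assumes the taps were already obtained by calibration, and it simplifies the arrangement to one tap vector per microphone per zone. The names `fir_filter` and `filter_and_sum` are hypothetical.

```python
def fir_filter(signal, taps):
    """Causal FIR filter: y[n] = sum over k of taps[k] * signal[n - k]."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, b in enumerate(taps):
            if n - k >= 0:
                acc += b * signal[n - k]
        out.append(acc)
    return out

def filter_and_sum(mic_signals, zone_taps):
    """Filter each microphone with its zone-specific taps, then sum across microphones."""
    filtered = [fir_filter(x, b) for x, b in zip(mic_signals, zone_taps)]
    return [sum(samples) for samples in zip(*filtered)]

# Two microphones, each with a single-tap filter of 0.5 (assumed values).
output = filter_and_sum([[1, 2, 3], [1, 2, 3]], [[0.5], [0.5]])
```

Switching listening zones then amounts to calling `filter_and_sum` with a different `zone_taps` set, which matches the idea of applying stored parameters at runtime.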

3. The method of claim 1 wherein the one or more listening zones includes a listening zone that corresponds to a field of view of an image capture unit, whereby the microphone array may detect sounds originating within the field of view of the image capture unit and filter out sounds originating outside the field of view of the image capture unit.

4. The method of claim 1 wherein the plurality of pre-calibrated listening zones includes about 18 sectors, wherein each sector has an angular width of about 20 degrees, whereby the plurality of pre-calibrated sectors encompasses about 360 degrees surrounding the microphone array.
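
The sector geometry in claim 4 is simple arithmetic: 18 sectors of 20 degrees each cover 360 degrees. The Python sketch below maps a source bearing to a sector index. The indexing convention (sector 0 starting at 0 degrees, angles increasing with index) is an assumption, since the claim does not fix one.

```python
NUM_SECTORS = 18
SECTOR_WIDTH_DEG = 360.0 / NUM_SECTORS  # 20 degrees per sector

def sector_for_angle(angle_deg):
    """Return the index (0..17) of the sector containing the given bearing."""
    return int((angle_deg % 360.0) // SECTOR_WIDTH_DEG)

def sector_bounds(index):
    """Return the (start, end) bearing of a sector in degrees."""
    start = index * SECTOR_WIDTH_DEG
    return start, start + SECTOR_WIDTH_DEG
```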

5. The method of claim 1 wherein selecting a particular pre-calibrated listening zone at a runtime includes selecting a pre-calibrated listening zone that contains a source of sound.

6. The method of claim 1 wherein selecting a particular pre-calibrated listening zone at a runtime includes:
- selecting an initial zone of a plurality of listening zones;
- determining whether a source of sound lies within the initial zone or on a particular side of the initial zone; and
- if the source of sound does not lie within the initial zone, selecting a different listening zone on the particular side of the initial zone, wherein the different listening zone is characterized by an attenuation of the input signals that is closest to an optimum value.

7. The method of claim 6 wherein determining whether a source of sound lies within the initial zone or on a particular side of the initial zone includes calculating from the input signals and the output signal an attenuation of the input signals and comparing the attenuation to the optimum value.
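
One way to picture the search in claims 6 and 7 is a neighbor walk over the ring of zones. The Python sketch below is an assumption-laden illustration, not the patented procedure: `attenuation_for` is a hypothetical callable returning the measured attenuation for a zone, `tolerance_db` is an invented threshold for "within the initial zone", and the side is chosen by comparing the two adjacent zones.

```python
def refine_zone(initial, num_zones, attenuation_for, optimum_db=0.0, tolerance_db=1.0):
    """Start at an initial zone and step toward the side whose attenuation nears the optimum."""
    def gap(zone):
        return abs(attenuation_for(zone % num_zones) - optimum_db)

    current = initial % num_zones
    if gap(current) <= tolerance_db:
        return current  # source is treated as lying within the initial zone
    # Choose the side whose adjacent zone is closer to the optimum.
    step = 1 if gap(current + 1) < gap(current - 1) else -1
    # Keep moving on that side while the gap strictly shrinks.
    while gap(current + step) < gap(current):
        current = (current + step) % num_zones
    return current

# Toy attenuation profile: 3 dB per zone of circular distance from zone 5.
def toy_attenuation(zone):
    return 3.0 * min((zone - 5) % 18, (5 - zone) % 18)

found = refine_zone(2, 18, toy_attenuation)
```

Because each step requires a strictly smaller gap, the walk always terminates; with the toy profile it ends at zone 5 from either side.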

8. The method of claim 1 wherein selecting a particular pre-calibrated listening zone at a runtime includes determining whether, for a given listening zone, an attenuation of the input signals is below a threshold.

9. The method of claim 1 wherein selecting a particular pre-calibrated listening zone at a runtime includes selecting a pre-calibrated listening zone that contains a source of sound, the method further comprising robotically pointing an image capture unit toward the pre-calibrated listening zone that contains the source of sound.

10. The method of claim 1 wherein emphasizing or filtering out the sound depending on how the sound is characterized includes filtering out the sound if the sound or the source is associated with background noise.

11. The method of claim 1 wherein characterizing the sound or the source of the sound includes:
- determining a frequency distribution for the sound; and
- comparing the frequency distribution against one or more acoustic models for known sounds or sources of sounds.
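
The characterization in claim 11 can be sketched as a spectrum comparison. The Python example below computes a normalized magnitude spectrum with a direct DFT and picks the best-matching model by cosine similarity. The similarity measure, the representation of an acoustic model as a stored spectrum, and the model names are assumptions chosen for illustration; the claim does not specify them.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Normalized DFT magnitudes for bins 0..N/2 (the frequency distribution)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        mags.append(abs(acc))
    total = sum(mags) or 1.0
    return [m / total for m in mags]

def cosine_similarity(a, b):
    """Cosine of the angle between two spectra; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def classify(samples, models):
    """Return the name of the acoustic model whose spectrum best matches the sound."""
    spectrum = magnitude_spectrum(samples)
    return max(models, key=lambda name: cosine_similarity(spectrum, models[name]))

def tone(bin_index, n=32):
    """Test tone occupying a single DFT bin (illustrative helper)."""
    return [math.sin(2 * math.pi * bin_index * i / n) for i in range(n)]

# Hypothetical models built from two reference tones.
models = {"low": magnitude_spectrum(tone(2)), "high": magnitude_spectrum(tone(8))}
label = classify([0.5 * s for s in tone(8)], models)
```

A sound labeled as background noise by such a comparison could then be filtered out, while a recognized source could be emphasized.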

12. The method of claim 1 wherein characterizing the sound or the source of the sound includes analyzing the sound to determine whether or not the sound or source of sound has one or more predetermined characteristics.

13. The method of claim 12 further comprising generating at least one control signal for the purpose of controlling at least one aspect of an electronic device if it is determined that the sound does have one or more predetermined characteristics.

14. The method of claim 13 wherein the electronic device is a video game controller and the control signal causes the video game controller to execute game instructions in response to sounds from the source of sound.

15. The method of claim 1 wherein emphasizing or filtering out the sound depending on how the sound is characterized includes:
- magnifying a noise disturbance of the audio signal relative to a remaining component of the audio signal;
- decreasing a sampling rate of the audio signal;
- applying an even order derivative to the audio signal having the decreased sampling rate to define a detection signal; and
- adjusting the noise disturbance of the audio signal according to a statistical average of the detection signal.

16. The method of claim 1 wherein the electronic device is a baby monitor.

17. The method of claim 1 wherein the electronic device is a video game unit having a joystick controller, the method further comprising:
- generating at least one control signal for the purpose of controlling at least one aspect of the video game unit if it is determined that the sound or the source of sound has one or more predetermined characteristics; and
- generating one or more additional control signals with the joystick controller.

18. The method of claim 17 wherein generating one or more additional control signals with the joystick controller includes generating an optical signal with one or more light sources located on the joystick controller and receiving the optical signal with an image capture unit.

19. The method of claim 18 wherein receiving an optical signal includes capturing one or more images containing one or more light sources and analyzing the one or more images to determine a position or an orientation of the joystick controller and/or decode a telemetry signal from the joystick controller.

20. The method of claim 17, wherein generating one or more additional control signals with the joystick controller includes generating a position and/or orientation signal with an inertial sensor located on the joystick controller.

21. The method of claim 20, further comprising compensating for a drift in a position and/or orientation determined from the position and/or orientation signal.

22. The method of claim 21 wherein compensating for a drift includes setting a value of an initial position to a value of a current calculated position determined from the position and/or orientation signal.

23. The method of claim 21 wherein compensating for a drift includes capturing an image of the joystick controller with an image capture unit, analyzing the image to determine a position of the joystick controller and setting a current value of the position of the joystick controller to the position of the joystick controller determined from analyzing the image.

24. The method of claim 21, further comprising compensating for spurious data in a signal from the inertial sensor.

25. A targeted sound detection apparatus, comprising:
- a microphone array having two or more microphones M₀. . . M_M;
- a plurality of filters coupled to each microphone, the filters being configured to filter input signals corresponding to sounds detected by the microphones and generate a filtered output;
- a processor coupled to the microphone array and the plurality of filters;
- a memory coupled to the processor;
- a plurality of sets of the filter parameters embodied in the memory, corresponding to a plurality of pre-calibrated listening zones, wherein each set of filter parameters is selected to detect portions of the input signals corresponding to sounds originating within a corresponding listening zone and filter out sounds originating outside the corresponding listening zone;
- the memory containing a set of processor executable instructions that, when executed, cause the apparatus to select a particular pre-calibrated listening zone at a runtime by applying to the plurality of filters sets of filter parameters corresponding to two or more different pre-calibrated listening zones, determining a value of an attenuation of the input signals for the two or more different pre-calibrated listening zones and selecting a particular zone of the two or more different pre-calibrated listening zones for which the attenuation is closest to an optimum value, and applying the filter parameters for the particular zone to the plurality of filters, whereby the apparatus may detect sounds originating within the particular pre-calibrated listening zone and filter out sounds originating outside the particular pre-calibrated listening zone;
- wherein the set of processor executable instructions includes one or more instructions which, when executed, cause the apparatus to:
- detect a sound with the microphone array;
- identify a particular listening zone containing a source of the sound;
- characterize the sound or the source of the sound; and
- emphasize or filter out the sound depending on how the sound is characterized.

26. The apparatus of claim 25 wherein the plurality of pre-calibrated listening zones includes about 18 sectors, wherein each sector has an angular width of about 20 degrees, whereby the plurality of pre-calibrated sectors encompasses about 360 degrees surrounding the microphone array.

27. The apparatus of claim 25 wherein the set of processor executable instructions includes one or more instructions which, when executed, cause the apparatus to select a pre-calibrated listening zone that contains a source of sound.

28. The apparatus of claim 25 wherein the set of processor executable instructions includes one or more instructions which, when executed, cause the apparatus to determine whether a source of sound lies within an initial listening zone or on a particular side of the initial listening zone; and, if the source of sound does not lie within the initial listening zone, select a different listening zone on the particular side of the initial listening zone, wherein the different listening zone is characterized by an attenuation of the input signals that is closest to an optimum value.

29. The apparatus of claim 28, wherein the one or more instructions which, when executed, cause the apparatus to determine whether a source of sound lies within the initial listening zone or on a particular side of the initial listening zone include one or more instructions which, when executed, calculate from the input signals and the output signal an attenuation of the input signals and compare the attenuation to the optimum value.

30. The apparatus of claim 25 wherein the set of processor executable instructions includes one or more instructions which, when executed, cause the apparatus to determine whether, for a given listening zone, an attenuation of the input signals is below a threshold.

31. The apparatus of claim 25, further comprising an image capture unit coupled to the processor, wherein the one or more listening zones include a listening zone that corresponds to a field of view of the image capture unit.

32. The apparatus of claim 25, further comprising an image capture unit coupled to the processor, and one or more pointing actuators coupled to the processor, the pointing actuators being adapted to point the image capture unit in a viewing direction in response to signals generated by the processor, the memory containing a set of processor executable instructions that, when executed, cause the actuators to point the image capture unit in a direction of the particular pre-calibrated listening zone.

33. The apparatus of claim 25 wherein the set of processor executable instructions includes instructions which, when executed, cause the apparatus to filter out the sound if the sound or the source is associated with background noise.

34. The apparatus of claim 25 wherein the instructions that cause the apparatus to characterize the sound or the source of the sound include instructions which, when executed, cause the apparatus to:
- determine a frequency distribution for the sound; and
- compare the frequency distribution against one or more acoustic models for known sounds or sources of sounds.

35. The apparatus of claim 34 wherein the one or more acoustic models are stored in the memory.

36. The apparatus of claim 25 wherein the instructions that cause the apparatus to characterize the sound or the source of the sound include instructions which, when executed, cause the apparatus to analyze the sound to determine whether or not it has one or more predetermined characteristics.

37. The apparatus of claim 36 wherein the set of processor executable instructions further includes one or more instructions which, when executed, cause the apparatus to generate at least one control signal for the purpose of controlling at least one aspect of the apparatus if it is determined that the sound does have one or more predetermined characteristics.

38. The apparatus of claim 37 wherein the apparatus is a video game controller and the control signal causes the video game controller to execute game instructions in response to sounds from the source of sound.

39. The apparatus of claim 25 wherein the apparatus is a baby monitor.

40. The apparatus of claim 25, further comprising a joystick controller coupled to the processor.

41. The apparatus of claim 40 wherein the joystick controller includes an inertial sensor coupled to the processor.

42. The apparatus of claim 41 wherein the inertial sensor includes an accelerometer or gyroscope.

43. The apparatus of claim 41 wherein signals from the inertial sensor and signals generated from the image capture unit from tracking one or more light sources mounted to the joystick controller are used as inputs to a game system.

44. The apparatus of claim 41 wherein the processor executable instructions include one or more instructions which, when executed, compensate for spurious data in a signal from the inertial sensor.

45. The apparatus of claim 41 wherein the processor executable instructions include one or more instructions which, when executed, compensate for a drift in a position and/or orientation determined from a position and/or orientation signal from the inertial sensor.

46. The apparatus of claim 45 wherein compensating for a drift includes setting a value of an initial position to a value of a current calculated position determined from the position and/or orientation signal.

47. The apparatus of claim 45 wherein compensating for a drift includes capturing an image of the joystick controller with an image capture unit, analyzing the image to determine a position of the joystick controller and setting a current value of the position of the joystick controller to the position of the joystick controller determined from analyzing the image.

48. The apparatus of claim 40 wherein the joystick controller includes one or more light sources, the apparatus further comprising an image capture unit, wherein the processor executable instructions include one or more instructions which, when executed, cause the image capture unit to:
- monitor a field of view in front of the image capture unit;
- identify the light source within the field of view;
- detect a change in light emitted from the light source; and
- in response to detecting the change, trigger an input command to the processor.

49. The apparatus of claim 40 wherein the joystick controller includes one or more light sources, the apparatus further comprising an image capture unit, wherein the processor executable instructions include one or more instructions which, when executed, cause the image capture unit to capture one or more images containing the light sources and analyze the image to determine a position or an orientation of the joystick controller and/or decode a telemetry signal from the joystick controller.

50. The apparatus of claim 49 wherein the light sources include two or more light sources in a linear array.

51. The apparatus of claim 49 wherein the light sources include a rectangular or arcuate configuration of a plurality of light sources.

52. The apparatus of claim 49 wherein the light sources are disposed on two or more different sides of the joystick controller to facilitate viewing of the light sources by the image capture unit.

53. The apparatus of claim 49, further comprising an inertial sensor mounted to the joystick controller, wherein a signal from the inertial sensor provides part of a tracking information input and signals generated from the image capture unit from tracking the one or more light sources provides another part of the tracking information input.

54. A computer-readable medium having embodied therein computer executable instructions for performing a method for targeted sound detection using a microphone array having two or more microphones M₀. . . M_M, each microphone being coupled to a plurality of filters, the filters being configured to filter input signals corresponding to sounds detected by the microphones thereby generating a filtered output, the method comprising:
- pre-calibrating a plurality of sets of filter parameters for the plurality of filters to determine a corresponding plurality of pre-calibrated listening zones, wherein each set of filter parameters is selected to detect portions of the input signals corresponding to sounds originating within a corresponding listening zone and filter out sounds originating outside the corresponding listening zone; and
- selecting a particular pre-calibrated listening zone at a runtime by applying to the plurality of filters sets of filter parameters corresponding to two or more different pre-calibrated listening zones, determining a value of an attenuation of the input signals for the two or more different pre-calibrated listening zones and selecting a particular zone of the two or more different pre-calibrated listening zones for which the attenuation is closest to an optimum value, and applying the filter parameters for the particular zone to the plurality of filters, whereby the microphone array may detect sounds originating within the particular listening zone and filter out sounds originating outside the particular listening zone;
- wherein the one or more pre-calibrated listening zones include a plurality of different listening zones, wherein the set of processor executable instructions includes one or more instructions which, when executed, cause the apparatus to:
- detect a sound with the microphone array;
- identify a particular listening zone containing a source of the sound;
- characterize the sound or the source of the sound; and
- emphasize or filter out the sound depending on how the sound is characterized.

Current Assignee: Sony Interactive Entertainment Inc. (Sony Group Corp.)
Original Assignee: Sony Computer Entertainment Incorporated (Sony Group Corp.)
Inventors: Mao, Xiadong; Marks, Richard L.; Zalewski, Gary M.
Primary Examiner(s): Faulk, Devona E
Application Number: US11/381,724
Publication Number: US 20060233389A1
Time in Patent Office: 2,042 Days
Field of Search: 381/92, 381/11, 381/122, 381/91, 381/56-69, 381/306, 381/310, 381/111, 381/71.11, 381/71.13, 348/11, 348/14.08, 348/15, 348/14.8
US Class Current: 381/92
CPC Class Codes:
- H04R 1/406   microphones
- H04R 2201/403   Linear arrays of transducers
- H04R 2430/23   Direction finding using a s...
- H04R 29/005   Microphone arrays
- H04R 3/005   for combining the signals o...
