Method for controlling a speaker array to provide spatialized, localized, and binaural virtual surround sound

US 9,578,440 B2
Filed: 11/15/2011
Issued: 02/21/2017
Est. Priority Date: 11/15/2010
Status: Active Grant

Abstract

A signal processing method and system are provided for delivering spatialized sound using highly optimized inverse filters to deliver narrow localized beams of sound from the included speaker array. The inventive method can be used to provide private listening areas in a public space and provide spatialization of source material for individual users to create a virtual 3D audio effect. In a binaural mode, a speaker array provides two targeted beams aimed towards the primary user's ears: one discrete beam for the left ear and one discrete beam for the right ear.

41 Claims

1. A method for producing multi-dimensional sound from a speaker array, comprising:
- receiving a plurality of audio signals from a plurality of sources;
  
  filtering each audio signal through each of a left Head-Related Transfer Function (HRTF) and a right HRTF to generate HRTF-filtered left and HRTF-filtered right audio signals, wherein the left HRTF is calculated based on an angle at which the plurality of audio signals will be transmitted to a left ear of a user, and wherein the right HRTF is calculated based on an angle at which the plurality of audio signals will be transmitted to a right ear of a user;
  
  filtering each of the HRTF-filtered left and HRTF-filtered right audio signals with a Psychoacoustic Bandwidth Extension Processor (PBEP);
  
  merging the PBEP HRTF-filtered left audio signals into a left total binaural signal;
  
  merging the PBEP HRTF-filtered right audio signals into a right total binaural signal;
  
  filtering the left total binaural signal through a set of left spatialization filters, wherein a separate left spatialization filter is provided for each speaker in the speaker array;
  
  filtering the right total binaural signal through a set of right spatialization filters, wherein a separate right spatialization filter is provided for each speaker in the speaker array;
  
  summing the filtered left total binaural signal and filtered right total binaural signal for each respective speaker into a speaker signal;
  
  feeding the speaker signal to the respective speaker in the speaker array; and
  
  transmitting the speaker signal through the respective speaker to the user.
- Dependent claims (2, 3, 4, 5, 6, 21, 22)
  - 2. The method of claim 1, wherein the left HRTF and right HRTF are computed in real-time using a binaural processor.
  - 3. The method of claim 1, wherein the spatialization filters are finite impulse response (FIR) filters.
  - 4. The method of claim 3, wherein two control points are used to compute the FIR filters, and wherein the distance between the control points is approximately 0.1 meters (m) to approximately 0.3 m.
  - 5. The method of claim 1, further comprising adapting the spatialization filters in real-time based on a change in the location of the user.
  - 6. The method of claim 1, further comprising matching the loudness of the PBEP-filtered audio signals using a Dynamic Range Compressor and Expander (DRCE).
  - 21. The speaker array system of claim 3, wherein each FIR filter is optimized in a frequency domain by minimizing a cost function J for each frequency according to the relationship $J = E + \beta V$, where $E$ is a performance error, and $\beta V$ is an effort penalty in which $\beta$ is a regularization parameter for weighting effort term $V$.
  - 22. The method of claim 1 further comprising, prior to feeding the speaker signal to the speaker array, combining the speaker signal with a beamforming speaker signal, wherein mixture of the signals is controlled for privacy or enhanced intelligibility.
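
The signal flow recited in claim 1 can be sketched as follows. This is a minimal illustration only, not the patented implementation: the function names, the toy impulse responses, and the pass-through `pbep` stand-in are all assumptions, since the claims name a PBEP stage without defining its processing.

```python
def convolve(x, h):
    """Direct-form FIR filtering: full linear convolution of x with taps h."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def mix(signals):
    """Sum signals sample by sample, treating shorter ones as zero-padded."""
    n = max(len(s) for s in signals)
    return [sum(s[k] for s in signals if k < len(s)) for k in range(n)]


def pbep(x):
    # Placeholder (assumption): the claims do not define the PBEP here,
    # so this stand-in passes the signal through unchanged.
    return list(x)


def render(sources, hrtfs, left_filters, right_filters):
    """sources: one sample list per source.
    hrtfs: one (left_ir, right_ir) impulse-response pair per source.
    left_filters / right_filters: one FIR tap list per speaker."""
    # HRTF-filter each source, apply the PBEP stage, merge per ear.
    left_total = mix([pbep(convolve(s, hl)) for s, (hl, _) in zip(sources, hrtfs)])
    right_total = mix([pbep(convolve(s, hr)) for s, (_, hr) in zip(sources, hrtfs)])
    # One left and one right spatialization filter per speaker, then sum.
    return [mix([convolve(left_total, fl), convolve(right_total, fr)])
            for fl, fr in zip(left_filters, right_filters)]


# Toy data; every value below is made up for illustration.
sources = [[1.0, 0.0], [0.0, 1.0]]
hrtfs = [([1.0], [0.5]), ([0.5], [1.0])]
left_filters = [[1.0, 0.0], [0.0, 1.0]]   # speaker 0, speaker 1
right_filters = [[0.0, 1.0], [1.0, 0.0]]
speaker_signals = render(sources, hrtfs, left_filters, right_filters)
print(speaker_signals)
```

Each entry of `speaker_signals` is the signal fed to one speaker; in a real system the per-speaker filters would come from a filter design step rather than hand-written taps.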

7. A method for producing a localized sound from a speaker array comprising a plurality of speakers, comprising:
- receiving at least one audio signal;
  
  pre-filtering the at least one audio signal with a Psychoacoustic Bandwidth Extension Processor (PBEP);
  
  filtering the at least one audio signal through a set of finite impulse response (FIR) filters, wherein a separate FIR filter is provided for each speaker in the speaker array, wherein each FIR filter has filter coefficients $a(f)$ optimized in a frequency domain by minimizing a cost function $J$ for each frequency $f$ according to the relationship $J(f) = \lVert H(f)\,a(f) - p(f) \rVert^2 + \beta \lVert a(f) \rVert^2$, where $H(f)$ is an $M \times N$ matrix of electro-acoustical transfer functions computed for $N$ speakers and $M$ virtual control points, $p(f)$ is a vector representing a target sound field at the $M$ virtual control points as a function of frequency, $\lVert \cdot \rVert$ indicates the $L^2$ norm of a vector, and $\beta$ is a regularization parameter;
  
  summing the filtered audio signals for each respective speaker into a speaker signal;
  
  transmitting each speaker signal to the respective speaker in the speaker array; and
  
  delivering each speaker signal to one or more regions of space occupied by one or more users.
- Dependent claims (8, 9, 10, 11, 12, 23)
  - 8. The method of claim 7, further comprising delivering at least one secondary audio signal to an area around the one or more users which masks the speaker signal in the area not occupied by the one or more users.
  - 9. The method of claim 8, wherein the masking signal is a musical signal.
  - 10. The method of claim 8, further comprising dynamically adjusting the amplitude and time of the masking signals.
  - 11. The method of claim 7, further comprising adapting the FIR filters in real-time based on a change in the location of the one or more users.
  - 12. The method of claim 7, further comprising matching the loudness of the pre-filtered audio signals using a Dynamic Range Compressor and Expander (DRCE).
  - 23. The speaker array system of claim 7, wherein each FIR filter is optimized in a frequency domain by minimizing a cost function J for each frequency according to the relationship $J = E + \beta V$, where $E$ is a performance error, and $\beta V$ is an effort penalty in which $\beta$ is a regularization parameter for weighting effort term $V$.
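
The claims state the cost function but not its minimizer. As a worked step, the standard regularized least-squares solution follows from setting the gradient to zero. The notation here is an assumption added for illustration: the superscript H denotes the conjugate transpose, $I_N$ and $I_M$ are identity matrices, and the frequency argument is dropped for brevity. Reading $E$ as the squared error norm and $V$ as the squared coefficient norm is likewise an interpretation, not wording from the claims.

```latex
\begin{aligned}
J &= \underbrace{(H a - p)^{\mathsf H}(H a - p)}_{E}
     + \beta\, \underbrace{a^{\mathsf H} a}_{V}, \\
\frac{\partial J}{\partial a^{*}} &= H^{\mathsf H}(H a - p) + \beta\, a = 0, \\
a &= \bigl(H^{\mathsf H} H + \beta I_N\bigr)^{-1} H^{\mathsf H} p
   = H^{\mathsf H}\bigl(H H^{\mathsf H} + \beta I_M\bigr)^{-1} p .
\end{aligned}
```

For $\beta > 0$ both matrices being inverted are positive definite, so the minimizer is unique. The second form inverts only an $M \times M$ matrix, which is cheaper when there are fewer control points than speakers.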

13. A speaker array system for producing localized sound, comprising:
- an input which receives a plurality of audio signals from at least one source;
  
  a processor in communication with a non-transitory computer-readable medium containing instructions configured for causing the processor to determine whether the plurality of audio signals should be processed by a binaural processing system or a beamforming processing system; and
  
  a speaker array comprising a plurality of loudspeakers;
  
  wherein the binaural processing system comprises:
  
  at least one filter which filters each audio signal through a left Head-Related Transfer Function (HRTF) and a right HRTF, wherein the left HRTF is calculated based on an angle at which the plurality of audio signals will be transmitted to a left ear of a user; and
  
  wherein the right HRTF is calculated based on an angle at which the plurality of audio signals will be transmitted to a right ear of a user;
  
  a left combiner which combines all of the audio signals from the left HRTF into a left total binaural signal;
  
  a right combiner which combines all of the audio signals from the right HRTF into a right total binaural signal;
  
  at least one left spatialization filter which filters the left total binaural signal, wherein a separate left spatialization filter is provided for each loudspeaker in a speaker array;
  
  at least one right spatialization filter which filters the right total binaural signal, wherein a separate right spatialization filter is provided for each loudspeaker in the speaker array; and
  
  a binaural combiner which sums the filtered left total binaural signal and filtered right total binaural signal into a binaural speaker signal for each respective loudspeaker and transmits each binaural speaker signal to the respective loudspeaker;
  
  wherein the beamforming processing system comprises:
  
  a plurality of beamforming spatialization filters which filters each audio signal, wherein a separate spatialization filter is provided for each loudspeaker in the speaker array; and
  
  a beamforming combiner which sums the filtered audio signals for each respective loudspeaker into a beamforming speaker signal and transmits each beamforming speaker signal to the respective speaker in the speaker array;
  
  wherein the speaker array delivers the respective binaural speaker signal or the beamforming speaker signal through the plurality of loudspeakers to one or more users.
- Dependent claims (14, 15, 16, 17, 18, 19, 20)
  - 14. The speaker array system of claim 13, wherein the plurality of audio signals can be processed by the beamforming processing system and the binaural processing system before being delivered to the one or more users through the plurality of loudspeakers.
  - 15. The speaker array system of claim 13, further comprising a user tracking unit which adjusts the binaural processing system and beamforming processing system based on a change in a location of the one or more users.
  - 16. The speaker array system of claim 13, wherein the binaural processing system further comprises a binaural processor which computes the left HRTF and right HRTF in real-time.
  - 17. The speaker array system of claim 13, further comprising a left Psychoacoustic Bandwidth Extension Processor (PBEP) disposed between the left HRTF and the left combiner and a right PBEP disposed between the right HRTF and the right combiner.
  - 18. The speaker array system of claim 17, further comprising a left Dynamic Range Compressor and Expander (DRCE) disposed between the left PBEP and the left combiner and a right DRCE disposed between the right HRTF and the right combiner.
  - 19. The speaker array of claim 13 further comprising a combiner configured to sum the binaural speaker signal and the beamforming speaker signal prior to delivery to the plurality of loudspeakers, wherein mixture of the signals is controlled for privacy or enhanced intelligibility.
  - 20. The speaker array system of claim 13, wherein each at least one left spatialization filter, at least one right spatialization filter, and beamforming spatialization filter is a finite impulse response (FIR) filter optimized in a frequency domain by minimizing a cost function J for each frequency according to the relationship $J = E + \beta V$, where $E$ is a performance error, and $\beta V$ is an effort penalty in which $\beta$ is a regularization parameter for weighting effort term $V$.
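
Claim 19 recites a combiner that sums the binaural and beamforming speaker signals with a controlled mixture. One simple way to picture this is a per-speaker crossfade. The sketch below is an assumption for illustration: the claims do not specify the mixing law, and `binaural_gain` is a hypothetical control parameter.

```python
def mix_speaker_signals(binaural, beamforming, binaural_gain):
    """Blend per-speaker binaural and beamforming signals.

    binaural_gain in [0, 1] is a hypothetical control: 1.0 plays only the
    binaural rendering, 0.0 only the beamforming rendering.
    """
    if not 0.0 <= binaural_gain <= 1.0:
        raise ValueError("binaural_gain must lie in [0, 1]")
    if len(binaural) != len(beamforming):
        raise ValueError("both renderings need one signal per speaker")
    mixed = []
    for b_sig, f_sig in zip(binaural, beamforming):
        # Zero-pad the shorter signal so both cover the same samples.
        n = max(len(b_sig), len(f_sig))
        b = list(b_sig) + [0.0] * (n - len(b_sig))
        f = list(f_sig) + [0.0] * (n - len(f_sig))
        mixed.append([binaural_gain * x + (1.0 - binaural_gain) * y
                      for x, y in zip(b, f)])
    return mixed


# One speaker, equal blend of the two renderings (toy values).
blended = mix_speaker_signals([[1.0, 1.0]], [[0.0, 2.0, 4.0]], 0.5)
print(blended)
```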

24. A method for producing multidimensional sound from a speaker array, comprising:
- receiving a plurality of audio signals, each audio signal comprising a plurality of frequencies, from a plurality of sources;
  
  filtering each audio signal through each of a left Head-Related Transfer Function (HRTF) and a right HRTF to generate HRTF-filtered left and HRTF-filtered right audio signals, wherein the left HRTF is calculated based on an angle at which the plurality of audio signals will be transmitted to a left ear of a user, and wherein the right HRTF is calculated based on an angle at which the plurality of audio signals will be transmitted to a right ear of a user;
  
  merging the HRTF-filtered left audio signals into a left total binaural signal;
  
  merging the HRTF-filtered right audio signals into a right total binaural signal;
  
  filtering the left total binaural signal through a set of left finite impulse response (FIR) filters, wherein a separate left FIR filter is provided for each speaker in the speaker array;
  
  filtering the right total binaural signal through a set of right FIR filters, wherein a separate right FIR filter is provided for each speaker in the speaker array;
  
  wherein each FIR filter has filter coefficients optimized in a frequency domain by minimizing a cost function $J$ for each frequency according to the relationship $J(f) = \lVert H(f)\,a(f) - p(f) \rVert^2 + \beta \lVert a(f) \rVert^2$, where $H(f)$ is an $M \times N$ matrix of electro-acoustical transfer functions computed for $N$ speakers and $M$ virtual control points, $p(f)$ is a vector representing a target sound field at the $M$ virtual control points as a function of frequency, $\lVert \cdot \rVert$ indicates the $L^2$ norm of a vector, and $\beta$ is a regularization parameter;
  
  summing the filtered left total binaural signal and filtered right total binaural signal for each respective speaker into a speaker signal;
  
  feeding the speaker signal to the respective speaker in the speaker array; and
  
  transmitting the speaker signal through the respective speaker to the user.
- Dependent claims (25, 26, 27, 28, 29, 30, 31, 32)
  - 25. The method of claim 24, wherein the left HRTF and right HRTF are computed in real-time using a binaural processor.
  - 26. The method of claim 24, wherein two control points are used to compute the FIR filters, and wherein the distance between the control points is approximately 0.1 meters (m) to approximately 0.3 m.
  - 27. The method of claim 24, further comprising adapting the FIR filters in real-time based on a change in the location of the user.
  - 28. The method of claim 24, further comprising pre-filtering the plurality of audio signals with a Psychoacoustic Bandwidth Extension Processor (PBEP).
  - 29. The method of claim 24, further comprising matching the loudness of the pre-filtered audio signals using a Dynamic Range Compressor and Expander (DRCE).
  - 30. The method of claim 24, further comprising, prior to feeding the speaker signal to the speaker array, combining the speaker signal with a beamforming speaker signal, wherein mixture of the signals is controlled for privacy or enhanced intelligibility.
  - 31. The method of claim 24, wherein M is two and the virtual control points comprise a listener's ears.
  - 32. The method of claim 24, wherein M is a multiple of two and the virtual control points comprise multiple listeners' ears.
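
For the two-control-point case of claims 26 and 31, the per-frequency minimization can be sketched in a few lines. Everything about the acoustic model is an assumption made for illustration: a free-field point-source response stands in for the electro-acoustical transfer functions, the array geometry and the value of the regularization parameter are arbitrary, and the function names are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def transfer_matrix(speakers, points, freq):
    """M x N free-field point-source responses (an assumed model of H(f))."""
    k = 2 * math.pi * freq / SPEED_OF_SOUND
    rows = []
    for px, py in points:
        row = []
        for sx, sy in speakers:
            r = math.hypot(px - sx, py - sy)
            row.append(cmath.exp(-1j * k * r) / (4 * math.pi * r))
        rows.append(row)
    return rows


def solve_two_point(H, p, beta):
    """Minimize ||H a - p||^2 + beta ||a||^2 for M = 2 control points.

    Uses the equivalent form a = H^H (H H^H + beta I)^-1 p, so only a
    2 x 2 matrix has to be inverted.
    """
    g00 = sum(abs(h) ** 2 for h in H[0]) + beta
    g11 = sum(abs(h) ** 2 for h in H[1]) + beta
    g01 = sum(a * b.conjugate() for a, b in zip(H[0], H[1]))
    g10 = g01.conjugate()
    det = g00 * g11 - g01 * g10
    y0 = (g11 * p[0] - g01 * p[1]) / det
    y1 = (g00 * p[1] - g10 * p[0]) / det
    return [H[0][n].conjugate() * y0 + H[1][n].conjugate() * y1
            for n in range(len(H[0]))]


# Eight-element line array; control points 0.2 m apart, 1 m in front of it.
speakers = [(-0.175 + 0.05 * n, 0.0) for n in range(8)]
ears = [(-0.1, 1.0), (0.1, 1.0)]
H = transfer_matrix(speakers, ears, freq=1000.0)

# Target: unit pressure at the left control point, silence at the right.
a_left = solve_two_point(H, p=[1.0, 0.0], beta=1e-6)
at_ears = [sum(h * a for h, a in zip(row, a_left)) for row in H]
print([abs(v) for v in at_ears])
```

Swapping the target vector to `[0.0, 1.0]` gives the coefficients for the right-ear beam under the same assumptions. A larger `beta` lowers the coefficient magnitudes at the cost of a less exact match to the target.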

33. A method for producing a localized sound from a speaker array comprising a plurality of speakers, comprising:
- receiving at least one audio signal comprising a plurality of frequencies;
  
  filtering the at least one audio signal through a set of finite impulse response (FIR) filters, wherein a separate FIR filter is provided for each speaker in the speaker array, wherein each FIR filter has filter coefficients $a(f)$ optimized in a frequency domain by minimizing a cost function $J$ for each frequency $f$ according to the relationship $J(f) = \lVert H(f)\,a(f) - p(f) \rVert^2 + \beta \lVert a(f) \rVert^2$, where $H(f)$ is an $M \times N$ matrix of electro-acoustical transfer functions computed for $N$ speakers and $M$ virtual control points, $p(f)$ is a vector representing a target sound field at the $M$ virtual control points as a function of frequency, $\lVert \cdot \rVert$ indicates the $L^2$ norm of a vector, and $\beta$ is a regularization parameter;
  
  summing the filtered audio signals for each respective speaker into a speaker signal;
  
  transmitting each speaker signal to the respective speaker in the speaker array; and
  
  delivering each speaker signal to one or more regions of space occupied by one or more users.
- Dependent claims (34, 35, 36, 37, 38, 39, 40, 41)
  - 34. The method of claim 33, further comprising delivering at least one secondary audio signal to an area around the one or more users which masks the speaker signal in the area not occupied by the one or more users.
  - 35. The method of claim 34, wherein the at least one secondary audio signal is a musical signal.
  - 36. The method of claim 35, further comprising dynamically adjusting the amplitude and time of the at least one secondary audio signal.
  - 37. The method of claim 33, further comprising adapting the FIR filters in real-time based on a change in the location of the one or more users.
  - 38. The method of claim 33, further comprising pre-filtering the plurality of audio signals with a Psychoacoustic Bandwidth Extension Processor (PBEP).
  - 39. The method of claim 33, further comprising matching the loudness of the pre-filtered audio signals using a Dynamic Range Compressor and Expander (DRCE).
  - 40. The method of claim 33, wherein M is two and the virtual control points comprise a listener's ears.
  - 41. The method of claim 33, wherein M is a multiple of two and the virtual control points comprise multiple listeners' ears.
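
The claims describe FIR coefficients optimized per frequency but do not say how the frequency-domain values become time-domain taps. One common route, shown here purely as an assumption, is an inverse discrete Fourier transform with conjugate-symmetric bins and a modeling delay. The function name and the `delay` parameter are hypothetical.

```python
import cmath
import math


def fir_taps_from_bins(half_spectrum, delay):
    """Real FIR taps from frequency bins 0..L/2 via an inverse DFT.

    half_spectrum holds one speaker's coefficient at each non-negative
    frequency bin. The negative-frequency bins are filled by conjugate
    symmetry so the taps come out real. delay (in samples) circularly
    shifts the impulse response so that it is causal.
    """
    half = len(half_spectrum) - 1
    length = 2 * half
    full = list(half_spectrum) + [
        half_spectrum[length - k].conjugate() for k in range(half + 1, length)
    ]
    taps = []
    for n in range(length):
        acc = sum(
            full[k] * cmath.exp(2j * math.pi * k * (n - delay) / length)
            for k in range(length)
        )
        taps.append(acc.real / length)
    return taps


# A flat unit spectrum should give a pure delay of `delay` samples.
taps = fir_taps_from_bins([1 + 0j] * 5, delay=3)
print([round(t, 6) for t in taps])
```

In this sketch the design step would be run once per frequency bin for each speaker, and the resulting column of values passed to `fir_taps_from_bins`. A production system would normally use an FFT rather than this direct sum.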

Current Assignee: Regents of the University of California (University of California), University of Southampton
Original Assignee: Regents of the University of California (University of California), University of Southampton
Inventors: Yamada, Toshiro; Fazi, Filippo M.; Otto, Peter; Kamdar, Suketu
Primary Examiner(s): GAY, SONIA L
Application Number: US13/885,392
Publication Number: US 20140064526A1
Time in Patent Office: 1,925 Days
Field of Search: 381/300
US Class Current: 1/1
CPC Class Codes:
H04R 1/403   loud-speakers
H04R 2203/12   Beamforming aspects for ste...
H04R 5/04   Circuit arrangements, e.g. ...
H04S 2420/01   Enhancing the perception of...
H04S 2420/13   Application of wave-field s...
H04S 5/00   Pseudo-stereo systems, e.g....
H04S 7/303   Tracking of listener positi...
