Method and system for dialog enhancement

US 9,324,337 B2
Filed: 11/15/2010
Issued: 04/26/2016
Est. Priority Date: 11/17/2009
Status: Active Grant


Abstract

A method and system for enhancing dialog determined by an audio input signal. In some embodiments the input signal is a stereo signal, and the system includes an analysis subsystem configured to analyze the stereo signal to generate filter control values, and a filtering subsystem including upmixing circuitry configured to upmix the input signal to generate a speech channel and non-speech channels and a peaking filter configured to filter the speech channel to enhance dialog while being steered by at least one of the control values. The filtering subsystem also includes ducking circuitry for attenuating the non-speech channels while being steered by at least some of the control values, and downmixing circuitry configured to combine outputs of the peaking filter and ducking circuitry to generate a filtered stereo output. In some embodiments, the system is configured to downmix a multichannel input signal to generate a downmixed stereo signal, an analysis subsystem is configured to analyze the downmixed stereo signal to generate filter control values, and a filtering subsystem is configured to generate a dialog-enhanced audio signal in response to the input signal while being steered by at least some of the filter control values. Preferably, the filter control values are generated without use of feedback including by generating power ratios (for pairs of speech and non-speech channels) and preferably also shaping in nonlinear fashion and scaling at least one of the power ratios.
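The stereo embodiment in the abstract (upmix, analyze, filter and duck, downmix) can be illustrated with a minimal block-based sketch. It is not the patented implementation. The function names, the 0.5 upmix scaling, the normalized power ratio, and the `max_boost` and `min_duck` limits are all assumptions made for illustration, and a broadband gain stands in for the peaking filter to keep the example short.

```python
from typing import List, Sequence, Tuple


def upmix(left: Sequence[float], right: Sequence[float]) -> Tuple[List[float], List[float], List[float]]:
    """Split stereo into a speech (center) channel and two non-speech channels."""
    # Assumption: the 0.5 factor is chosen so that speech + residual
    # reconstructs each input channel exactly; the source gives no scaling.
    speech = [0.5 * (l + r) for l, r in zip(left, right)]
    left_ns = [l - c for l, c in zip(left, speech)]
    right_ns = [r - c for r, c in zip(right, speech)]
    return speech, left_ns, right_ns


def mean_power(x: Sequence[float]) -> float:
    """Average of squared samples over the block."""
    return sum(v * v for v in x) / max(len(x), 1)


def enhance_block(left: Sequence[float], right: Sequence[float],
                  max_boost: float = 2.0, min_duck: float = 0.5):
    """Feed-forward dialog enhancement of one block of stereo samples."""
    speech, left_ns, right_ns = upmix(left, right)
    p_speech = mean_power(speech)
    p_nonspeech = mean_power(left_ns) + mean_power(right_ns)
    # Control value computed from the input only (no feedback from the output).
    # Assumption: normalized to [0, 1) so it can steer both gains directly.
    ratio = p_speech / (p_speech + p_nonspeech + 1e-12)
    # Stand-in for the peaking filter: more speech gain as the ratio rises.
    speech_gain = 1.0 + (max_boost - 1.0) * ratio
    # Ducking: less gain on the non-speech channels as the ratio rises.
    duck_gain = 1.0 - (1.0 - min_duck) * ratio
    # Downmix the processed channels back to stereo.
    out_left = [speech_gain * c + duck_gain * n for c, n in zip(speech, left_ns)]
    out_right = [speech_gain * c + duck_gain * n for c, n in zip(speech, right_ns)]
    return out_left, out_right, ratio
```

With identical left and right inputs the whole signal is treated as speech and is boosted; with anti-phase inputs the speech channel is silent, the ratio is zero, and the block passes through unchanged.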

56 Citations

31 Claims

1. A method for enhancing dialog determined by an audio input signal, said method including the steps of:
- (a) analyzing the input signal to generate filter control values without use of feedback; and
  
  (b) providing at least one of the control values to a peaking filter, filtering a speech channel determined by the input signal in the peaking filter in a manner steered by said at least one of the control values to generate a dialog-enhanced speech channel, and attenuating non-speech channels determined by the input signal in ducking circuitry steered by at least a subset of the control values to generate attenuated non-speech channels, where the control values are distinct from the speech channel, the control values are distinct from the non-speech channels, the peaking filter is distinct from the ducking circuitry, the peaking filter is coupled and configured to filter the speech channel but not the non-speech channels, the ducking circuitry is coupled and configured to attenuate the non-speech channels but not the speech channel, the peaking filter is configured to emphasize frequency components of the speech channel in a frequency range critical to intelligibility of speech, relative to frequency components of the speech channel outside the frequency range, and said frequency range has a center frequency, wherein the step of attenuating the non-speech channels includes reducing gain application to the non-speech channels in response to a change in said at least a subset of the control values indicative of increase of power of the speech channel relative to combined power of the non-speech channels, and the step of filtering the speech channel includes applying more gain to the frequency components of the speech channel at the center frequency in response to a change in said at least one of the control values indicative of an increase in power of the speech channel relative to power of at least one of the non-speech channels.
  - 2. The method of claim 1, wherein step (a) includes a step of generating power ratios, including at least one ratio of power in a speech channel determined by the input signal to power in a non-speech channel determined by the input signal.
  - 3. The method of claim 2, wherein step (a) also includes a step of shaping in nonlinear fashion and scaling at least one of the power ratios.
  - 4. The method of claim 3, wherein said shaping in nonlinear fashion includes a step of exponentiating at least one value determined from at least one of the power ratios.
  - 5. The method of claim 1, wherein the input signal is a stereo signal, and also including the step of:
    - (c) combining the dialog-enhanced speech channel and the attenuated non-speech channels to generate a dialog-enhanced stereo output signal.
  - 6. The method of claim 5, wherein the stereo signal includes a left channel and a right channel, step (b) includes a step of upmixing the input signal to determine a first one of the non-speech channels, a second one of the non-speech channels, and the speech channel, and the step of upmixing includes a step of generating the speech channel in response to the input signal including by summing the left channel with the right channel.
  - 7. The method of claim 5, wherein step (a) includes a step of generating power ratios, including at least one ratio of power in a speech channel determined by the input signal to power in a non-speech channel determined by the input signal.
  - 8. The method of claim 7, wherein step (a) also includes a step of shaping in nonlinear fashion and scaling at least one of the power ratios.
  - 9. The method of claim 1, wherein the input signal is a multi-channel audio signal having more than two channels, and step (a) includes steps of:
    - (c) downmixing the input signal to generate a downmixed stereo signal; and
      
      (d) analyzing the downmixed stereo signal to generate the filter control values.
  - 10. The method of claim 9, wherein the dialog-enhanced speech channel and the attenuated non-speech channels generated in step (b) determine a dialog-enhanced multi-channel output signal.
  - 11. The method of claim 9, wherein step (a) includes a step of generating power ratios, including at least one ratio of power in a speech channel determined by the input signal to power in a non-speech channel determined by the input signal.
  - 12. The method of claim 11, wherein step (a) also includes a step of shaping in nonlinear fashion and scaling at least one of the power ratios.
  - 13. The method of claim 1, wherein the analyzing is time domain analysis on samples of the input signal, the filtering is time domain filtering of samples of the speech channel, and the attenuating is time domain attenuation of samples of the non-speech channels.
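Claims 2 through 4 describe generating power ratios, shaping them nonlinearly (for example by exponentiation), and scaling them. The sketch below shows one way such control values could be computed in the time domain without feedback. The normalization of each ratio to the range 0 to 1, the exponent, and the `max_peak_db` and `max_duck_db` limits are hypothetical choices for illustration, not values from the patent.

```python
from typing import List, Sequence, Tuple


def block_power(x: Sequence[float]) -> float:
    """Average of squared samples over the block."""
    return sum(v * v for v in x) / max(len(x), 1)


def control_values(speech: Sequence[float],
                   nonspeech: List[Sequence[float]],
                   exponent: float = 2.0,
                   max_peak_db: float = 6.0,
                   max_duck_db: float = -6.0) -> Tuple[float, float]:
    """Return (peak_gain_db, duck_gain_db) for one block of samples.

    Assumes at least one non-speech channel is supplied.
    """
    eps = 1e-12  # avoids division by zero on silent blocks
    p_speech = block_power(speech)
    p_each = [block_power(ch) for ch in nonspeech]

    # Power ratio of speech to the combined non-speech channels,
    # normalized to [0, 1) (an assumption made for this sketch).
    r_combined = p_speech / (p_speech + sum(p_each) + eps)
    # Power ratio of speech to a single non-speech channel
    # (here the strongest one, also an assumption).
    r_single = p_speech / (p_speech + max(p_each) + eps)

    # Nonlinear shaping by exponentiation, then scaling to a gain in dB.
    peak_gain_db = max_peak_db * r_single ** exponent
    duck_gain_db = max_duck_db * r_combined ** exponent
    return peak_gain_db, duck_gain_db
```

Both outputs depend only on the current input block, so the control path is purely feed-forward: louder speech relative to the other channels yields more gain at the peaking filter's center frequency and less gain on the non-speech channels.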

14. A system for enhancing dialog determined by an audio input signal, including:
- an analysis subsystem coupled and configured to analyze the input signal to generate filter control values without use of feedback; and
  
  a filtering subsystem coupled to the analysis subsystem and including a peaking filter and ducking circuitry, wherein the peaking filter is coupled to receive at least one of the control values and configured to filter a speech channel determined by the input signal, while being steered by said at least one of the control values, to generate a dialog-enhanced speech channel, and the ducking circuitry is configured to attenuate non-speech channels determined by the input signal, while being steered by at least a subset of the control values, to generate attenuated non-speech channels, where the control values are distinct from the speech channel, the control values are distinct from the non-speech channels, the peaking filter is distinct from the ducking circuitry, the peaking filter is coupled and configured to filter the speech channel but not the non-speech channels, including by emphasizing frequency components of the speech channel in a frequency range critical to intelligibility of speech relative to frequency components of the speech channel outside the frequency range, where said frequency range has a center frequency, the ducking circuitry is coupled and configured to attenuate the non-speech channels but not the speech channel, including by reducing gain application to the non-speech channels in response to a change in said at least a subset of the control values indicative of an increase, within limits, in power of the speech channel relative to combined power of the non-speech channels, and the peaking filter is configured to apply increased gain to frequency components of the speech channel having the center frequency in response to a change in said at least one of the control values indicating an increase in power of the speech channel relative to power of at least one of the non-speech channels.
  - 15. The system of claim 14, wherein the analysis subsystem is configured to generate power ratios in response to the input signal, the power ratios including at least one ratio of power in a speech channel determined by the input signal to power in a non-speech channel determined by the input signal.
  - 16. The system of claim 15, wherein the analysis subsystem is configured to shape in nonlinear fashion and scale at least one of the power ratios.
  - 17. The system of claim 15, wherein the analysis subsystem is configured to shape in nonlinear fashion at least one of the power ratios including by exponentiating at least one value determined from said at least one of the power ratios.
  - 18. The system of claim 14, wherein the input signal is a stereo signal, and the filtering subsystem includes:
    - upmixing circuitry configured to upmix the input signal to generate a speech channel and non-speech channels; and
      
      circuitry coupled and configured to analyze the speech channel and the non-speech channels to generate the filter control values.
  - 19. The system of claim 14, wherein the input signal is a stereo signal having a left channel and a right channel, and the filtering subsystem includes:
    - upmixing circuitry configured to upmix the input signal to generate a speech channel by summing the left input channel and the right input channel, and non-speech channels by subtracting the speech channel from each of the left input channel and the right input channel; and
      
      circuitry coupled and configured to analyze the speech channel and the non-speech channels to generate the filter control values.
  - 20. The system of claim 14, wherein the input signal is a stereo signal and the filtering subsystem is configured to combine the dialog-enhanced speech channel and the attenuated non-speech channels to generate a dialog-enhanced stereo output signal.
  - 21. The system of claim 20, wherein the analysis subsystem is configured to upmix the input signal to determine a first one of the non-speech channels, a second one of the non-speech channels, and the speech channel.
  - 22. The system of claim 20, wherein the analysis subsystem is configured to generate power ratios in response to the input signal, the power ratios including at least one ratio of power in a speech channel determined by the input signal to power in a non-speech channel determined by the input signal.
  - 23. The system of claim 22, wherein the analysis subsystem is configured to shape in nonlinear fashion and scale at least one of the power ratios.
  - 24. The system of claim 14, wherein the input signal is a multi-channel audio signal having more than two channels, the system is configured to downmix the input signal to generate a downmixed stereo signal, and the analysis subsystem is configured to analyze the downmixed stereo signal to generate the filter control values.
  - 25. The system of claim 24, wherein the analysis subsystem is configured to downmix the input signal to generate the downmixed stereo signal, and the filtering subsystem is configured to assert the dialog-enhanced speech channel and the attenuated non-speech channels as a dialog-enhanced multi-channel output signal.
  - 26. The system of claim 24, wherein the analysis subsystem is configured to generate power ratios in response to the input signal, the power ratios including at least one ratio of power in a speech channel of the input signal to power in a non-speech channel of the input signal.
  - 27. The system of claim 26, wherein the analysis subsystem is configured to shape in nonlinear fashion and scale at least one of the power ratios.
  - 28. The system of claim 14, wherein the peaking filter is a biquadratic filter whose response is determined by a current value of each said at least one of the control values.
  - 29. The system of claim 14, wherein said system is a data processing system configured to implement the analysis subsystem and the filtering subsystem.
  - 30. The system of claim 14, wherein said system is an audio digital signal processor.
  - 31. The system of claim 14, wherein said system is an audio digital signal processor including circuitry configured to implement the analysis subsystem and the filtering subsystem.
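Claim 28 recites a biquadratic peaking filter whose response is set by the current control value. As an illustration only, the sketch below uses the widely published Audio EQ Cookbook peaking-filter formulas, which are swapped in here and are not taken from the patent. The center frequency, Q, and sample rate in the usage note are hypothetical.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def peaking_biquad(gain_db: float, center_hz: float, q: float,
                   sample_rate: float) -> Tuple[List[float], List[float]]:
    """Return normalized (b, a) coefficients of a peaking biquad."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp]
    a = [1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp]
    # Normalize so that a[0] == 1.
    return [v / a[0] for v in b], [v / a[0] for v in a]


def magnitude_db(b: Sequence[float], a: Sequence[float],
                 freq_hz: float, sample_rate: float) -> float:
    """Magnitude response of the biquad at one frequency, in dB."""
    z1 = cmath.exp(-2j * math.pi * freq_hz / sample_rate)
    h = (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)
    return 20.0 * math.log10(abs(h))


def filter_block(x: Sequence[float], b: Sequence[float], a: Sequence[float],
                 state: Tuple[float, float] = (0.0, 0.0)):
    """Filter one block (direct form II transposed); returns (y, new_state)."""
    s1, s2 = state
    y = []
    for v in x:
        out = b[0] * v + s1
        s1 = b[1] * v - a[1] * out + s2
        s2 = b[2] * v - a[2] * out
        y.append(out)
    return y, (s1, s2)
```

In a steered system the coefficients would be recomputed whenever the control value changes, for example `b, a = peaking_biquad(peak_gain_db, 2000.0, 1.0, 48000.0)` once per block, while the filter state is carried from one block to the next. A gain of 0 dB leaves the speech channel unchanged.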


Current Assignee
Dolby Laboratories Licensing Corporation (Dolby Laboratories Incorporated)
Original Assignee
Dolby Laboratories Licensing Corporation (Dolby Laboratories Incorporated)
Inventors
Brown, Charles Phillip
Primary Examiner(s)
Desir, Pierre-Louis
Assistant Examiner(s)
Kovacek, David M

Application Number

US12/945,967
Publication Number

US 20110119061A1
Time in Patent Office

1,989 Days
Field of Search

704/200-200.1, 704/205-210, 704/211, 704/225-228, 704/231-231, 704/233, 704/246, 704/251, 704/255-257, 704/258, 704/270, 704/276, 704/278, 704/500-504, 704/E15.001-E15.05, 704/E21.001-E21.019, 381/1, 381/23.1, 381/300-305, 381/310, 381/66, 381/312-331, 381/71.1-71.14, 381/92-94.9, 381/98-103
US Class Current

1/1
CPC Class Codes

G10L 15/28   Constructional details of s...

G10L 19/008   Multichannel audio signal c...

G10L 19/012   Comfort noise or silence co...

G10L 19/028   Noise substitution, i.e. su...

G10L 21/02   Speech enhancement, e.g. no...

G10L 21/0208   Noise filtering

G10L 21/0216   characterised by the method...

G10L 21/0232   Processing in the frequency...
