Noise-robust speech coding mode classification

US 8,990,074 B2
Filed: 04/10/2012
Issued: 03/24/2015
Est. Priority Date: 05/24/2011
Status: Active Grant


Abstract

A method of noise-robust speech classification is disclosed. Classification parameters are input to a speech classifier from external components. Internal classification parameters are generated in the speech classifier from at least one of the input parameters. A Normalized Auto-correlation Coefficient Function threshold is set. A parameter analyzer is selected according to a signal environment. A speech mode classification is determined based on a noise estimate of multiple frames of input speech.
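As an illustration only, and not part of the patent text, the internal parameters the abstract refers to can be sketched in Python. The claims name a zero crossing rate, a current frame energy, and Normalized Auto-correlation Coefficient Function (NACF) values among the parameters; the exact formulas below are common textbook definitions assumed for this sketch, and the function names are hypothetical.

```python
import math


def frame_energy(frame):
    """Sum of squared samples in one frame (assumed energy definition)."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def nacf(frame, lag):
    """Normalized autocorrelation of the frame at one lag, in [-1, 1].

    Assumed definition: correlation of the frame with itself shifted by
    `lag` samples, divided by the geometric mean of the two energies.
    """
    if lag <= 0 or lag >= len(frame):
        raise ValueError("lag must be between 1 and len(frame) - 1")
    head, tail = frame[:-lag], frame[lag:]
    num = sum(a * b for a, b in zip(head, tail))
    den = math.sqrt(frame_energy(head) * frame_energy(tail))
    return num / den if den > 0 else 0.0
```

A strongly periodic frame gives an NACF near 1 at its pitch lag, while noise-like frames tend to show a high zero crossing rate and a low NACF, which is why these quantities are useful inputs to a voiced/unvoiced decision.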

25 Citations

43 Claims

1. A method of noise-robust speech classification, comprising:
- inputting classification parameters to a speech classifier from external components;
  
  generating, in the speech classifier, internal classification parameters from at least one of the input classification parameters;
  
  setting a Normalized Auto-correlation Coefficient Function threshold, wherein setting the Normalized Auto-correlation Coefficient Function threshold comprises:
  
  increasing a first voicing threshold for classifying a current frame as unvoiced when a signal-to-noise ratio (SNR) fails to exceed a first SNR threshold, wherein the first voicing threshold is not adjusted if the SNR is above the first SNR threshold, and
  
  increasing an energy threshold for classifying the current frame as unvoiced when the noise estimate exceeds a noise estimate threshold, wherein the energy threshold is not adjusted if the noise estimate is below the noise estimate threshold; and
  
  determining a speech mode classification based on the first voicing threshold and the energy threshold.
- Dependent claims 2 to 32:
  - 2. The method of claim 1, wherein setting the Normalized Auto-correlation Coefficient Function threshold further comprises decreasing a second voicing threshold for classifying a current frame as voiced when the SNR fails to exceed a second SNR threshold, wherein the second voicing threshold is not adjusted if the SNR is above the second SNR threshold.
  - 3. The method of claim 1, wherein the input parameters comprise a noise suppressed speech signal.
  - 4. The method of claim 1, wherein the input parameters comprise voice activity information.
  - 5. The method of claim 1, wherein the input parameters comprise Linear Prediction reflection coefficients.
  - 6. The method of claim 1, wherein the input parameters comprise Normalized Auto-correlation Coefficient Function information.
  - 7. The method of claim 1, wherein the input parameters comprise Normalized Auto-correlation Coefficient Function at pitch information.
  - 8. The method of claim 7, wherein the Normalized Auto-correlation Coefficient Function at pitch information is an array of values.
  - 9. The method of claim 1, wherein the internal parameters comprise a zero crossing rate parameter.
  - 10. The method of claim 1, wherein the internal parameters comprise a current frame energy parameter.
  - 11. The method of claim 1, wherein the internal parameters comprise a look ahead frame energy parameter.
  - 12. The method of claim 1, wherein the internal parameters comprise a band energy ratio parameter.
  - 13. The method of claim 1, wherein the internal parameters comprise a three frame averaged voiced energy parameter.
  - 14. The method of claim 1, wherein the internal parameters comprise a previous three frame average voiced energy parameter.
  - 15. The method of claim 1, wherein the internal parameters comprise a current frame energy to previous three frame average voiced energy ratio parameter.
  - 16. The method of claim 1, wherein the internal parameters comprise a current frame energy to three frame average voiced energy parameter.
  - 17. The method of claim 1, wherein the internal parameters comprise a maximum sub-frame energy index parameter.
  - 18. The method of claim 1, wherein the setting a Normalized Auto-correlation Coefficient Function threshold comprises comparing the noise estimate to a pre-determined Signal to a noise estimate threshold.
  - 19. The method of claim 1, wherein the parameter analyzer applies the parameters to a state machine.
  - 20. The method of claim 19, wherein the state machine comprises a state for each speech classification mode.
  - 21. The method of claim 1, wherein the speech mode classification comprises a Transient mode.
  - 22. The method of claim 1, wherein the speech mode classification comprises an Up-Transient mode.
  - 23. The method of claim 1, wherein the speech mode classification comprises a Down-Transient mode.
  - 24. The method of claim 1, wherein the speech mode classification comprises a Voiced mode.
  - 25. The method of claim 1, wherein the speech mode classification comprises an Unvoiced mode.
  - 26. The method of claim 1, wherein the speech mode classification comprises a Silence mode.
  - 27. The method of claim 1, further comprising updating at least one parameter.
  - 28. The method of claim 27, wherein the updated parameter comprises a Normalized Auto-correlation Coefficient Function at pitch parameter.
  - 29. The method of claim 27, wherein the updated parameter comprises a three frame averaged voiced energy parameter.
  - 30. The method of claim 27, wherein the updated parameter comprises a look ahead frame energy parameter.
  - 31. The method of claim 27, wherein the updated parameter comprises a previous three frame average voiced energy parameter.
  - 32. The method of claim 27, wherein the updated parameter comprises a voice activity detection parameter.
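As an illustration only, and not part of the claim language, the threshold adjustments recited in claims 1 and 2 can be sketched as follows. Every numeric value, the step sizes, and the names `Thresholds` and `adapt_thresholds` are hypothetical; the claims specify only the direction of each adjustment and the condition that triggers it.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Thresholds:
    unvoiced_nacf: float    # first voicing threshold (claim 1)
    voiced_nacf: float      # second voicing threshold (claim 2)
    unvoiced_energy: float  # energy threshold (claim 1)


def adapt_thresholds(base, snr_db, noise_estimate, *,
                     first_snr_threshold=25.0,   # hypothetical value
                     second_snr_threshold=25.0,  # hypothetical value
                     noise_threshold=20.0,       # hypothetical value
                     voicing_step=0.05,          # hypothetical value
                     energy_step=5.0):           # hypothetical value
    """Return thresholds adjusted for the current noise conditions."""
    t = base
    # "Fails to exceed" is read as <=; above the SNR threshold nothing changes.
    if snr_db <= first_snr_threshold:
        t = replace(t, unvoiced_nacf=t.unvoiced_nacf + voicing_step)
    if snr_db <= second_snr_threshold:
        t = replace(t, voiced_nacf=t.voiced_nacf - voicing_step)
    # The energy threshold moves only when the noise estimate is high.
    if noise_estimate > noise_threshold:
        t = replace(t, unvoiced_energy=t.unvoiced_energy + energy_step)
    return t


base = Thresholds(unvoiced_nacf=0.35, voiced_nacf=0.75, unvoiced_energy=-25.0)
clean = adapt_thresholds(base, snr_db=30.0, noise_estimate=10.0)
noisy = adapt_thresholds(base, snr_db=15.0, noise_estimate=30.0)
```

In this sketch a clean frame leaves `base` untouched, while a noisy frame raises both unvoiced thresholds and lowers the voiced threshold, so more frames qualify as voiced or unvoiced when noise makes the parameters less reliable.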

33. An apparatus for noise-robust speech classification, comprising:
- a processor;
  
  memory in electronic communication with the processor;
  
  instructions stored in the memory, the instructions being executable by the processor to:
  
  input classification parameters to a speech classifier from external components;
  
  generate, in the speech classifier, internal classification parameters from at least one of the input classification parameters;
  
  set a Normalized Auto-correlation Coefficient Function threshold, wherein the instructions executable to set the Normalized Auto-correlation Coefficient Function threshold further comprise instructions executable to:
  
  increase a first voicing threshold for classifying a current frame as unvoiced when a signal-to-noise ratio (SNR) fails to exceed a first SNR threshold, wherein the first voicing threshold is not adjusted if the SNR is above the first SNR threshold, and
  
  increase an energy threshold for classifying the current frame as unvoiced when the noise estimate exceeds a noise estimate threshold, wherein the energy threshold is not adjusted if the noise estimate is below the noise estimate threshold; and
  
  determine a speech mode classification based on the first voicing threshold and the energy threshold.
- Dependent claims 34 to 39:
  - 34. The apparatus of claim 33, wherein the instructions executable to set the Normalized Auto-correlation Coefficient Function threshold further comprise instructions executable to decrease a second voicing threshold for classifying a current frame as voiced when the SNR fails to exceed a second SNR threshold, wherein the second voicing threshold is not adjusted if the SNR is above the second SNR threshold.
  - 35. The apparatus of claim 33, wherein the input parameters comprise one or more of a noise suppressed speech signal, voice activity information, Linear Prediction reflection coefficients, Normalized Auto-correlation Coefficient Function information and Normalized Auto-correlation Coefficient Function at pitch information.
  - 36. The apparatus of claim 35, wherein the Normalized Auto-correlation Coefficient Function at pitch information is an array of values.
  - 37. The apparatus of claim 35, wherein the internal parameters comprise one or more of a zero crossing rate parameter, a current frame energy parameter, a look ahead frame energy parameter, a band energy ratio parameter, a three frame averaged voiced energy parameter, a previous three frame average voiced energy parameter, a current frame energy to previous three frame average voiced energy ratio parameter, a current frame energy to three frame average voiced energy parameter and a maximum sub-frame energy index parameter.
  - 38. The apparatus of claim 33, further comprising instructions executable to update at least one parameter.
  - 39. The apparatus of claim 38, wherein the updated parameter comprises one or more of a Normalized Auto-correlation Coefficient Function at pitch parameter, a three frame averaged voiced energy parameter, a look ahead frame energy parameter, a previous three frame average voiced energy parameter and a voice activity detection parameter.

40. An apparatus for noise-robust speech classification, comprising:
- means for inputting classification parameters to a speech classifier from external components;
  
  means for generating, in the speech classifier, internal classification parameters from at least one of the input classification parameters;
  
  means for setting a Normalized Auto-correlation Coefficient Function threshold, wherein the means for setting the Normalized Auto-correlation Coefficient Function threshold comprise:
  
  means for increasing a first voicing threshold for classifying a current frame as unvoiced when a signal-to-noise ratio (SNR) fails to exceed a first SNR threshold, wherein the first voicing threshold is not adjusted if the SNR is above the first SNR threshold, and
  
  means for increasing an energy threshold for classifying the current frame as unvoiced when the noise estimate exceeds a noise estimate threshold, wherein the energy threshold is not adjusted if the noise estimate is below the noise estimate threshold; and
  
  means for determining a speech mode classification based on the first voicing threshold and the energy threshold.
- Dependent claim 41:
  - 41. The apparatus of claim 40, wherein the means for setting the Normalized Auto-correlation Coefficient Function threshold further comprise means for decreasing a second voicing threshold for classifying a current frame as voiced when the SNR fails to exceed a second SNR threshold, wherein the second voicing threshold is not adjusted if the SNR is above the second SNR threshold.

42. A computer-program product for noise-robust speech classification, the computer-program product comprising a non-transitory computer-readable medium having instructions thereon, the instructions, comprising:
- code for inputting classification parameters to a speech classifier from external components;
  
  code for generating, in the speech classifier, internal classification parameters from at least one of the input classification parameters;
  
  code for setting a Normalized Auto-correlation Coefficient Function threshold, wherein the code for setting the Normalized Auto-correlation Coefficient Function threshold comprises:
  
  code for increasing a first voicing threshold for classifying a current frame as unvoiced when the noise estimate exceeds a noise estimate threshold a signal-to-noise ratio (SNR) fails to exceed a first SNR threshold, wherein the first voicing threshold is not adjusted if the SNR is above the first SNR threshold; and
  
  code for increasing an energy threshold for classifying the current frame as unvoiced when the noise estimate exceeds a noise estimate threshold, wherein the voicing threshold and the energy threshold is not adjusted if the noise estimate is below the noise estimate threshold; and
  
  code for determining a speech mode classification based on the first voicing threshold and the energy threshold.
- Dependent claim 43:
  - 43. The computer-program product of claim 42, wherein the code for setting the Normalized Auto-correlation Coefficient Function threshold comprises code for decreasing a second voicing threshold for classifying a current frame as voiced when the SNR fails to exceed a second SNR threshold, wherein the second voicing threshold is not adjusted if the SNR is above the SNR threshold.
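As an illustration only, and not part of the claim language, claims 19 to 26 describe a state machine with one state per speech mode (Transient, Up-Transient, Down-Transient, Voiced, Unvoiced, Silence). The claims do not give the transition rules, so the rules, parameter names, and default thresholds in this sketch are all hypothetical and are chosen only to show how a previous mode and a few frame parameters could select the next mode.

```python
from enum import Enum


class Mode(Enum):
    SILENCE = "silence"
    UNVOICED = "unvoiced"
    VOICED = "voiced"
    TRANSIENT = "transient"
    UP_TRANSIENT = "up-transient"
    DOWN_TRANSIENT = "down-transient"


def next_mode(prev, *, vad_active, nacf, energy_db, prev_energy_db,
              unvoiced_nacf=0.35,         # hypothetical threshold
              voiced_nacf=0.75,           # hypothetical threshold
              unvoiced_energy_db=-25.0):  # hypothetical threshold
    """Pick the mode for the current frame from the previous mode and parameters."""
    if not vad_active:
        return Mode.SILENCE
    # Weak periodicity and low energy: treat the frame as unvoiced.
    if nacf < unvoiced_nacf and energy_db < unvoiced_energy_db:
        return Mode.UNVOICED
    # Strong periodicity: voiced, or an onset if speech was not voiced before.
    if nacf > voiced_nacf:
        if prev in (Mode.SILENCE, Mode.UNVOICED):
            return Mode.UP_TRANSIENT
        return Mode.VOICED
    # In between: a decaying voiced segment, or a generic transient.
    if prev in (Mode.VOICED, Mode.UP_TRANSIENT) and energy_db < prev_energy_db:
        return Mode.DOWN_TRANSIENT
    return Mode.TRANSIENT
```

Because the thresholds are arguments, noise-adapted values such as those recited in the independent claims could be passed in per frame, which is how the threshold adjustment would influence the resulting mode.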

Current Assignee: Qualcomm, Inc.
Original Assignee: Qualcomm, Inc.
Inventors: Duni, Ethan Robert; Rajendran, Vivek
Primary Examiner(s): COLUCCI, MICHAEL C
Application Number: US13/443,647
Publication Number: US 20120303362A1
Time in Patent Office: 1,078 Days
Field of Search: 704/219, 704/233, 704/221, 704/220, 704/200.1, 455/569.1, 382/260, 381/107, 375/260, 340/572.4
US Class Current: 704/219
CPC Class Codes:
G10L 19/025   Detection of transients or ...
G10L 19/22   Mode decision, i.e. based o...
G10L 25/78   Detection of presence or ab...
G10L 25/93   Discriminating between voic...
