Decreasing noise sensitivity in speech processing under adverse conditions

US 20030033143A1
Filed: 08/13/2001
Published: 02/13/2003
Est. Priority Date: 08/13/2001
Status: Abandoned Application

Abstract

To perform reliable speech or speaker recognition (e.g., verification or identification) in adverse conditions, such as noisy environments, a noise compensation mechanism increases noise robustness during speech processing by decreasing noise sensitivity. Signal attributes and noise attributes of at least two signal portions including speech may be determined. A distance measure for one signal portion may then be derived by using the signal attributes of both signal portions. In one embodiment, using a Parallel Model Combination (PMC) algorithm, a normalized absolute distance score may be obtained for a noisy speech signal including an utterance. For accurate rejection or acceptance of speech or a speaker (registered speakers or impostors), the normalized absolute distance score may be compared to a dynamic threshold or to one or more speech or speaker profiles.
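
The scoring step described in the abstract can be illustrated with a minimal sketch. It assumes that the utterance and the stored profile are already time-aligned sequences of feature frames, and that "normalized" means averaging the absolute differences per coefficient. The listing does not define the normalization or how the dynamic threshold is chosen, so the function names, the per-coefficient normalization, and the fixed `threshold` argument are all hypothetical.

```python
from typing import Sequence


def normalized_absolute_distance(
    test: Sequence[Sequence[float]],
    template: Sequence[Sequence[float]],
) -> float:
    """Mean absolute difference per coefficient (assumed normalization)."""
    if len(test) != len(template):
        raise ValueError("test and template must have the same number of frames")
    total = 0.0
    count = 0
    for test_frame, template_frame in zip(test, template):
        if len(test_frame) != len(template_frame):
            raise ValueError("frames must have the same number of coefficients")
        for t, r in zip(test_frame, template_frame):
            total += abs(t - r)
            count += 1
    if count == 0:
        raise ValueError("no coefficients to compare")
    return total / count


def accept_speaker(score: float, threshold: float) -> bool:
    """Accept when the score does not exceed the threshold (assumed rule)."""
    return score <= threshold


# Hypothetical two-frame, two-coefficient example.
utterance = [[1.0, 2.0], [3.0, 4.0]]
profile = [[1.5, 2.0], [2.0, 4.5]]
score = normalized_absolute_distance(utterance, profile)
print(score, accept_speaker(score, threshold=0.75))
```

A real system would typically align the two sequences first and would derive the threshold from the measured noise conditions rather than pass a constant.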

30 Claims

1. A method comprising:
- determining signal attributes and noise attributes of at least two signal portions including speech; and
  
  deriving a distance measure for one signal portion by using the signal attributes of both signal portions.
- Dependent claims (2, 3, 4, 5):
  - 2. The method of claim 1, wherein deriving the distance measure includes deriving a relative noise measure between the at least two signal portions by distributing the signal attributes over the at least two signal portions.
  - 3. The method of claim 2, including:
    - receiving training speech data including noise components and the at least two signal portions;
      
      combining the signal attributes of the at least two signal portions into a signal content and combining the signal and noise attributes of the at least two signal portions into a signal and noise content;
      
      calculating a compensation ratio of the signal and noise content to the signal content in order to derive the relative noise measure; and
      
      adjusting a mismatch indicative of a noise differential between the noise components present in the training speech data and the noise attributes present in the at least two signal portions based on the relative noise measure.
  - 4. The method of claim 3, including deriving from a training template, a signal profile based on a model trained on the training speech data to determine the mismatch between the noise components and the noise attributes.
  - 5. The method of claim 4, including compensating the model in response to the relative noise measure while applying a parallel model combination mechanism.
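
The compensation ratio recited in claim 3 can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: each signal portion is reduced to one signal-energy estimate and one noise-energy estimate, and "combining" is taken to mean summation. The `Portion` class and `compensation_ratio` function are hypothetical names. How the resulting relative noise measure adjusts the mismatch is not specified in the claims, so that step is left out.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Portion:
    signal: float  # assumed: estimated speech energy of the portion (linear scale)
    noise: float   # assumed: estimated noise energy of the portion (linear scale)


def compensation_ratio(portions: Iterable[Portion]) -> float:
    """Ratio of the signal-and-noise content to the signal content."""
    portions = list(portions)
    signal_content = sum(p.signal for p in portions)
    signal_and_noise_content = sum(p.signal + p.noise for p in portions)
    if signal_content <= 0:
        raise ValueError("signal content must be positive")
    return signal_and_noise_content / signal_content


# Hypothetical example with two portions.
ratio = compensation_ratio([Portion(signal=4.0, noise=1.0), Portion(signal=6.0, noise=1.0)])
print(ratio)
```

Under these assumptions the ratio is 1 for noise-free input and grows as the noise share of the combined content increases.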

6. A method comprising:
- extracting from a noisy speech signal an utterance, said noisy speech signal including a first portion with first signal-and-noise attributes and a second portion with second signal-and-noise attributes, wherein said utterance is extracted from the noisy speech signal based on a first model trained on training speech data;
  
  selectively combining across the noisy speech signal the first and second signal-and-noise attributes of both the first and second portions to derive a compensation term for the first model;
  
  deriving a second model by compensating the first model based on the compensation term; and
  
  correcting a mismatch indicative of a noise differential between the first portion and the second portion based on the second model.
- Dependent claims (7, 8, 9, 10):
  - 7. The method of claim 6, including using a parallel model combination mechanism to determine said mismatch as a function of the compensation term, said first model based on a plurality of recognition models including at least one speech model and at least one noise model.
  - 8. The method of claim 7, including training the at least one speech model and the at least one noise model with the training speech data.
  - 9. The method of claim 6, wherein combining includes generating absolute scores for the first and second signal-and-noise attributes of both the first and second portions of the noisy speech signal.
  - 10. The method of claim 7, wherein combining further includes:
    - normalizing the absolute scores to generate normalized absolute scores for the first and second signal-and-noise attributes of both the first and second portions of the noisy speech signal; and
      
      calculating the compensation term from the normalized absolute scores.
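
Claims 7 and 8 refer to a parallel model combination mechanism that works from a speech model and a noise model, without giving its details. As a hedged illustration only, the sketch below uses the log-add approximation, a commonly cited simplification of PMC that combines the mean log-spectral parameters of the two models in the linear domain. The function name, the `gain` parameter, and the choice of the log-add variant are assumptions and are not taken from the application.

```python
import math
from typing import List, Sequence


def pmc_log_add(
    speech_log_means: Sequence[float],
    noise_log_means: Sequence[float],
    gain: float = 1.0,
) -> List[float]:
    """Combine per-channel log-spectral means of a speech and a noise model.

    Each mean is mapped to the linear domain, the noise term is scaled by
    `gain` (assumed level-matching factor), the two are added, and the sum
    is mapped back to the log domain.
    """
    if len(speech_log_means) != len(noise_log_means):
        raise ValueError("models must have the same number of channels")
    return [
        math.log(math.exp(s) + gain * math.exp(n))
        for s, n in zip(speech_log_means, noise_log_means)
    ]


# Hypothetical two-channel example.
clean = [math.log(3.0), math.log(8.0)]
noise = [math.log(1.0), math.log(2.0)]
print(pmc_log_add(clean, noise))
```

The compensated means are never lower than the clean speech means, which matches the intuition that additive noise raises the energy observed in each channel.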

11. An article comprising a medium storing instructions that enable a processor-based system to:
- determine signal attributes and noise attributes of at least two signal portions including speech; and
  
  derive a distance measure for one signal portion by using the signal attributes of both signal portions.
- Dependent claims (12, 13, 14, 15):
  - 12. The article of claim 11, further storing instructions that enable the processor-based system to:
    - derive the distance measure by determining a relative noise measure between the at least two signal portions to distribute the signal attributes over the at least two signal portions.
  - 13. The article of claim 12, further storing instructions that enable the processor-based system to:
    - receive training speech data including noise components and the at least two signal portions;
      
      combine the signal attributes of the at least two signal portions into a signal content and combine the signal and noise attributes of the at least two signal portions into a signal and noise content;
      
      calculate a compensation ratio of the signal and noise content to the signal content in order to derive the relative noise measure; and
      
      adjust a mismatch indicative of a noise differential between the noise components present in the training speech data and the noise attributes present in the at least two signal portions based on the relative noise measure.
  - 14. The article of claim 13, further storing instructions that enable the processor-based system to derive from a training template, a signal profile based on a model trained on the training speech data to determine the mismatch between the noise components and the noise attributes.
  - 15. The article of claim 14, further storing instructions that enable the processor-based system to compensate the model in response to the relative noise measure while applying a parallel model combination mechanism.

16. An article comprising a medium storing instructions that enable a processor-based system to:
- extract from a noisy speech signal an utterance, said noisy speech signal including a first portion with first signal-and-noise attributes and a second portion with second signal-and-noise attributes, wherein said utterance is extracted from the noisy speech signal based on a first model trained on training speech data;
  
  selectively combine across the noisy speech signal the first and second signal-and-noise attributes of both the first and second portions to derive a compensation term for the first model;
  
  derive a second model by compensating the first model based on the compensation term; and
  
  correct a mismatch indicative of a noise differential between the first portion and the second portion based on the second model.
- Dependent claims (17, 18, 19, 20, 21, 22, 23, 24, 25, 26):
  - 17. The article of claim 16, further storing instructions that enable the processor-based system to use a parallel model combination mechanism to determine said mismatch as a function of the compensation term, said first model based on a plurality of recognition models including at least one speech model and at least one noise model.
  - 18. The article of claim 17, further storing instructions that enable the processor-based system to train the at least one speech model and the at least one noise model with the training speech data.
  - 19. The article of claim 16, further storing instructions that enable the processor-based system to generate absolute scores for the first and second signal-and-noise attributes of both the first and second portions of the noisy speech signal.
  - 20. The article of claim 17, further storing instructions that enable the processor-based system to:
    - normalize the absolute scores to generate normalized absolute scores for the first and second signal-and-noise attributes of both the first and second portions of the noisy speech signal; and
      
      calculate the compensation term from the normalized absolute scores.
  - 21. The article of claim 20, further storing instructions that enable the processor-based system to:
    - compare the normalized absolute scores with a threshold associated with a speech profile to verify a speaker of the utterance against the speech profile; and
      
      compare the normalized absolute scores with a database including a plurality of speech profiles associated with one or more registered speakers to identify the speaker of the utterance against the database.
  - 22. The article of claim 20, further storing instructions that enable the processor-based system to:
    - use a training template including a plurality of frames each frame including one or more channels each channel including first segments with lower signal-to-noise portions and second segments with higher signal-to-noise portions; and
      
      compensate the model for the mismatch in the utterance and the training template based on the compensation term by counting over all the frames of the plurality of frames both the first segments with lower signal-to-noise portions and the second segments with higher signal-to-noise portions in the utterance of the noisy speech signal.
  - 23. The article of claim 22, further storing instructions that enable the processor-based system to derive the compensation term from the mismatch by using a ratio of the total number of the first and second segments to the second segments.
  - 24. The article of claim 23, further storing instructions that enable the processor-based system to:
    - extract from the first segments non-masked coefficients for each channel of the one or more channels of each frame of the plurality of frames of the training template; and
      
      extract from the second segments masked coefficients for each channel of the one or more channels of each frame of the plurality of frames of the training template.
  - 25. The article of claim 24, further storing instructions that enable the processor-based system to extract from the first segments by counting the number of non-masked coefficients over all the frames of the plurality of the frames, and to extract from the second segments by counting the number of masked coefficients for each frame of the plurality of the frames on a frame-by-frame basis.
  - 26. The article of claim 24, further storing instructions that enable the processor-based system to extract from the first and second segments by counting the number of corresponding masked and non-masked coefficients associated with a log-filter bank.
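
The counting scheme of claims 22 and 23 can be sketched as follows. The claims state that lower and higher signal-to-noise segments are counted over all frames and channels and that the compensation term is the ratio of the total number of segments to the number of higher signal-to-noise segments. They do not say how a segment is classified, so the rule used here (a log filter-bank energy above a per-channel noise floor plus a margin counts as higher signal-to-noise) is an assumption, as are the function names and the `margin` parameter.

```python
from typing import Sequence, Tuple


def count_segments(
    log_fbank: Sequence[Sequence[float]],
    noise_floor: Sequence[float],
    margin: float = 0.0,
) -> Tuple[int, int]:
    """Count (lower-SNR, higher-SNR) cells over all frames and channels."""
    n_low = 0
    n_high = 0
    for frame in log_fbank:
        if len(frame) != len(noise_floor):
            raise ValueError("each frame needs one value per channel")
        for energy, floor in zip(frame, noise_floor):
            if energy > floor + margin:  # assumed classification rule
                n_high += 1
            else:
                n_low += 1
    return n_low, n_high


def compensation_term(n_low: int, n_high: int) -> float:
    """Ratio of all segments to the higher-SNR segments (claim 23)."""
    if n_high == 0:
        raise ValueError("no higher-SNR segments to normalize by")
    return (n_low + n_high) / n_high


# Hypothetical utterance: two frames, three log filter-bank channels.
frames = [[5.0, 1.0, 3.0], [0.5, 4.0, 2.5]]
n_low, n_high = count_segments(frames, noise_floor=[2.0, 2.0, 2.0])
print(n_low, n_high, compensation_term(n_low, n_high))
```

With this definition the term equals 1 when every segment is higher signal-to-noise and increases as more of the utterance falls at or below the noise floor.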

27. An apparatus comprising:
- an audio interface to receive at least two signal portions including speech; and
  
  a control unit operably coupled to the audio interface, the control unit to determine signal attributes and noise attributes of the at least two signal portions including speech and to derive a distance measure for one signal portion by using the signal attributes of both signal portions.
- Dependent claims (28):
  - 28. The apparatus of claim 27, further comprising:
    - a storage unit including an authentication database, said storage unit coupled to the control unit to store training speech data in the authentication database, wherein the control unit is to:
      
      derive the distance measure from a relative noise measure between the at least two signal portions by distributing the signal attributes over the at least two signal portions;
      
      receive training speech data including noise components and the at least two signal portions to calculate a mismatch indicative of a noise differential between the noise components present in the training speech data and the noise attributes present in the at least two signal portions;
      
      combine the signal attributes of the at least two signal portions into a signal content and combine the signal and noise attributes of the at least two signal portions into a signal and noise content to calculate a compensation ratio of the signal and noise content to the signal content; and
      
      adjust the mismatch with the compensation ratio in order to assess the speech based on the relative noise measure.

29. A wireless device comprising:
- an audio interface to receive a noisy speech signal including an utterance;
  
  a control unit operably coupled to the audio interface; and
  
  a storage unit operably coupled to the control unit, wherein said control unit enables:
  
  determining signal attributes and noise attributes of at least two signal portions including speech, and deriving a distance measure for one signal portion by using the signal attributes of both signal portions.
- Dependent claims (30):
  - 30. The wireless device of claim 29, further comprising a radio transceiver and a communication interface, both adapted to communicate over an air interface.


Current Assignee: Intel Corporation
Original Assignee: Intel Corporation
Inventors: Aronowitz, Hagai
Application Number: US09/928,766
Publication Number: US 20030033143A1
Field of Search
US Class Current: 704/233
CPC Class Codes:
- G10L 15/20: Speech recognition techniqu...
- G10L 17/12: Score normalisation
- G10L 21/0216: characterised by the method...
