Noise-compensated speech recognition templates

US 6,381,569 B1
Filed: 02/04/1998
Issued: 04/30/2002
Est. Priority Date: 02/04/1998
Status: Expired due to Term

1 Assignment

0 Petitions

Abstract

The speech recognition training unit is modified to store digitized speech samples in a speech database that can be accessed at recognition time. The improved recognition unit comprises a noise analysis, modeling, and synthesis unit that continually analyzes the noise characteristics of the audio environment and produces an estimated noise signal with similar characteristics. The recognition unit then constructs a noise-compensated template database by adding the estimated noise signal to each speech sample in the speech database and performing parameter determination on the resulting sums. This procedure accounts for noise in the recognition phase by retraining all the templates with an estimated noise signal whose characteristics are similar to those of the actual noise that corrupted the word to be recognized. Because the templates and the input are then corrupted by similar noise, a good template match becomes more likely, which increases recognition accuracy.
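
The template-rebuilding idea in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: `frame_log_energies` is a toy stand-in for the "parameter determination" step, the matcher is a plain squared-distance comparison, and a synthetic hum plays the role of the background noise. None of it is taken from the patent's specification.

```python
import math

FRAME_LEN = 8  # samples per analysis frame (illustrative value)

def frame_log_energies(signal, frame_len=FRAME_LEN):
    """Toy parameter determination: one log-energy value per frame."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        feats.append(math.log(energy + 1e-9))
    return feats

def add_noise(samples, noise):
    """Sample-wise sum of stored speech and a (repeated) noise signal."""
    return [s + noise[i % len(noise)] for i, s in enumerate(samples)]

def build_compensated_templates(speech_db, estimated_noise):
    """Re-derive every template from stored samples plus the noise estimate."""
    return {word: frame_log_energies(add_noise(samples, estimated_noise))
            for word, samples in speech_db.items()}

def recognize(input_signal, templates):
    """Return the word whose template is closest to the input features."""
    feats = frame_log_energies(input_signal)

    def distance(template):
        return sum((a - b) ** 2 for a, b in zip(feats, template))

    return min(templates, key=lambda word: distance(templates[word]))

def hum(n, phase):
    """Hypothetical background hum used as the noise source in this demo."""
    return [0.3 * math.sin(1.7 * i + phase) for i in range(n)]

# Hypothetical two-word vocabulary stored as raw digitized samples.
loud = [1.0, -1.0] * 8
quiet = [0.1, -0.1] * 8
speech_db = {"yes": loud + quiet, "no": quiet + loud}

estimated_noise = hum(32, 0.0)                           # noise estimate
noisy_input = add_noise(speech_db["yes"], hum(32, 1.0))  # actual noise differs in phase
templates = build_compensated_templates(speech_db, estimated_noise)
result = recognize(noisy_input, templates)
```

The point of storing raw samples rather than precomputed templates is visible in `build_compensated_templates`: the features can be recomputed whenever the noise estimate changes.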

40 Citations

15 Claims

1. A speech recognition system, comprising:
- a training unit for receiving signals of words or phrases to be trained, generating digitized samples for each said words or phrases, and storing said digitized samples in a speech database; and
  
  a speech recognition unit for receiving an input signal to be recognized, the input signal being corrupted by noise, generating a noise compensated template database by applying the effects of said noise to said digitized samples of said speech database upon receiving said input signal, and providing a speech recognition outcome for said input signal based on said noise compensated template database, wherein said speech recognition unit comprises a speech detection unit for receiving said noise corrupted input signal and determining whether speech is present in said input signal, wherein said input signal is designated a noise signal when speech is determined not to be present in said input signal; and
  
  a noise unit activated upon determining that speech is not present in said input signal, said noise unit for analyzing said noise signal and synthesizing a synthesized noise signal having characteristics of said noise signal, said synthesized noise signal for applying the effects of noise to said digitized samples of said speech database.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11):
  - 2. The speech recognition system of claim 1, wherein said speech detection unit determines the presence of speech by analyzing the level of speech activity in said input signal.
  - 3. The speech recognition system of claim 1, wherein said noise unit analyzes and synthesizes said synthesized noise signal using a linear predictive coding (LPC) technique.
  - 4. The speech recognition system of claim 1, wherein said synthesized noise signal corresponds to a window of said noise signal recorded right before said input signal to be recognized.
  - 5. The speech recognition system of claim 1, wherein said synthesized noise signal corresponds to an average of various windows of said noise signal recorded over a predetermined period of time.
  - 6. The speech recognition system of claim 1 wherein said speech recognition unit further comprises:
  - 7. The speech recognition system of claim 6, wherein said parameter determination technique is a linear predictive coding (LPC) analysis technique.
  - 8. The speech recognition system of claim 6, wherein said speech detection unit determines the presence of speech by analyzing the level of speech activity in said input signal.
  - 9. The speech recognition system of claim 6, wherein said noise unit analyzes and synthesizes said synthesized noise signal using a linear predictive coding (LPC) technique.
  - 10. The speech recognition system of claim 6, wherein said synthesized noise signal corresponds to a window of said noise signal recorded right before said input signal to be recognized.
  - 11. The speech recognition system of claim 6, wherein said synthesized noise signal corresponds to an average of various windows of said noise signal recorded over a predetermined period of time.
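
Claims 3 and 9 name linear predictive coding (LPC) as the technique for analyzing and synthesizing the noise. The sketch below shows one common way to do this, assuming the autocorrelation method with the Levinson-Durbin recursion for analysis and an all-pole filter driven by white noise for synthesis. The function names, the model order, and the demo signal are illustrative assumptions, not details from the patent.

```python
import math
import random

def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation r[0..max_lag] of one noise window."""
    return [sum(x[i] * x[i + lag] for i in range(len(x) - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve for LPC coefficients a (a[0] == 1) and the residual energy."""
    a = [1.0] + [0.0] * order
    err = r[0]
    if err == 0.0:  # silent window: nothing to model
        return a, 0.0
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err

def analyze_noise(window, order):
    """LPC analysis of a window that was designated a noise signal."""
    return levinson_durbin(autocorrelation(window, order), order)

def synthesize_noise(a, err, window_len, n_out, rng):
    """Drive the all-pole filter 1/A(z) with scaled white Gaussian noise."""
    gain = math.sqrt(max(err, 0.0) / window_len)
    order = len(a) - 1
    out = []
    for n in range(n_out):
        y = gain * rng.gauss(0.0, 1.0)
        for k in range(1, order + 1):
            if n - k >= 0:
                y -= a[k] * out[n - k]
        out.append(y)
    return out

# Demo: a hypothetical recorded noise window with a low-pass character.
rng = random.Random(0)
recorded = [0.0]
for _ in range(1999):
    recorded.append(0.9 * recorded[-1] + rng.gauss(0.0, 1.0))

a, err = analyze_noise(recorded, order=2)
synthetic = synthesize_noise(a, err, len(recorded), 500, rng)
```

In terms of claims 4 and 10, `recorded` would be the noise window captured right before the input to be recognized. For the averaged variant of claims 5 and 11, one possible reading (an assumption here) is to average the autocorrelation values of several windows before calling `levinson_durbin`.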

12. A speech recognition unit of a speaker-dependent speech recognition system for recognizing an input signal, said speech recognition unit accounting for effects of a noisy environment, comprising:
- means for storing digitized samples of words or phrases of a training vocabulary in a speech database;
  
  means for applying the effects of noise associated with said input signal to digitized samples of said training vocabulary to generate noise corrupted digitized samples of said training vocabulary;
  
  means for generating a noise compensated template database based on said noise corrupted digitized samples; and
  
  means for determining a speech recognition outcome for said input signal based on said noise compensated template database, wherein said means for applying effects of noise comprises means for determining whether speech is present in said input signal, wherein said input signal is designated a noise signal when speech is determined not to be present in said input signal; and
  
  means for analyzing said noise signal and synthesizing a synthesized noise signal, said synthesized noise signal added to said digitized samples of said vocabulary.
- Dependent claims (13):
  - 13. The speech recognition unit of claim 12, further comprising:

14. A method for speech recognition accounting for the effects of a noisy environment, comprising the steps of:
- generating digitized samples of each word or phrase trained, each said word or phrase belonging to a vocabulary;
  
  storing said digitized samples in a speech database;
  
  receiving a noise corrupted input signal to be recognized;
  
  applying the effects of noise associated with said input signal to said digitized samples of said vocabulary to generate noise corrupted digitized samples of said vocabulary, said applying being performed upon receiving said input signal;
  
  generating a noise compensated template database based on said noise corrupted digitized samples; and
  
  providing a speech recognition outcome for said noise corrupted input signal based on said noise compensated template database;
  
  wherein said step of applying the effects of noise comprises the steps of determining whether speech is present in said input signal, wherein said input signal is designated a noise signal when speech is determined not to be present in said input signal; and
  
  analyzing said noise signal and synthesizing a synthesized noise signal, said synthesized noise signal added to said digitized samples of said vocabulary to generate said noise corrupted digitized samples.
- Dependent claims (15):
  - 15. The method of speech recognition of claim 14, further comprising the steps of:
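
The ordering of steps in claim 14 (decide whether speech is present, treat non-speech input as the noise signal, and apply the noise only once an input to be recognized arrives) can be sketched as a small frame-by-frame loop. This is a minimal sketch under stated assumptions: the energy threshold, the `SpeechGate` class, and the `on_speech_start` callback are hypothetical names, and a simple energy test stands in for whatever speech-activity measure an actual system would use.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

class SpeechGate:
    """Labels frames and remembers the most recent noise-only frame."""

    def __init__(self, energy_threshold):
        self.energy_threshold = energy_threshold  # assumed tuning parameter
        self.latest_noise = None

    def classify(self, frame):
        if frame_energy(frame) < self.energy_threshold:
            # No speech present: the input is designated a noise signal.
            self.latest_noise = list(frame)
            return "noise"
        return "speech"

def run(frames, gate, on_speech_start):
    """Process frames in order; fire the callback when speech begins."""
    labels = []
    in_speech = False
    for frame in frames:
        label = gate.classify(frame)
        if label == "speech" and not in_speech:
            # This is where the templates would be rebuilt with the noise
            # observed just before the utterance.
            on_speech_start(gate.latest_noise)
        in_speech = (label == "speech")
        labels.append(label)
    return labels

frames = [
    [0.01, -0.02, 0.01, 0.0],    # background only
    [0.02, 0.01, -0.01, -0.02],  # background only
    [0.8, -0.7, 0.9, -0.6],      # utterance begins
    [0.5, -0.6, 0.4, -0.5],      # utterance continues
]
gate = SpeechGate(energy_threshold=0.01)
rebuild_calls = []
labels = run(frames, gate, rebuild_calls.append)
```

If speech arrives before any noise-only frame has been seen, `latest_noise` is still `None`, so a caller would need a fallback such as using the unmodified samples.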

Current Assignee: Qualcomm, Inc.
Original Assignee: Qualcomm, Inc.
Inventors: Sih, Gilbert C.; Bi, Ning
Primary Examiner(s): Knepper, David D.
Application Number: US09/018,257
Time in Patent Office: 1,546 Days
Field of Search: 704/233, 704/243, 704/244, 704/226-228
US Class Current: 704/233
CPC Class Codes:
G10L 15/20 Speech recognition techniqu...
G10L 21/0216 characterised by the method...
