Speech recognition using microphone antenna array

US 20030069727A1
Filed: 10/02/2001
Published: 04/10/2003
Est. Priority Date: 10/02/2001
Status: Active Grant

Abstract

A system and method of audio processing provides enhanced speech recognition. Audio input is received at a plurality of microphones. The multi-channel audio signal from the microphones may be processed by a beamforming network to generate a single-channel enhanced audio signal, on which voice activity is detected. Audio signals from the microphones are additionally processed by an adaptable noise cancellation filter having variable filter coefficients to generate a noise-suppressed audio signal. The variable filter coefficients are updated during periods of voice inactivity. A speech recognition engine may apply a speech recognition algorithm to the noise-suppressed audio signal and generate an appropriate output. The operation of the speech recognition engine and the adaptable noise cancellation filter may advantageously be controlled based on voice activity detected in the single-channel enhanced audio signal from the beamforming network.
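
The abstract names the processing stages but not the algorithms inside them. As a rough illustration only, the Python sketch below wires the stages together under several assumptions that are not taken from the patent: a zero-delay delay-and-sum beamformer (a plain channel average), a frame-energy voice activity detector, and a two-microphone normalized LMS (NLMS) canceller whose coefficients are frozen whenever voice is detected. All function, class, and parameter names are hypothetical, and the speech recognizer itself is left out.

```python
def beamform(frame):
    """Zero-delay delay-and-sum (assumed): average the channels sample by sample."""
    n_channels = len(frame)
    return [sum(ch[i] for ch in frame) / n_channels for i in range(len(frame[0]))]


def voice_active(mono, threshold):
    """Energy detector (assumed): binary decision from the frame's mean power."""
    return sum(s * s for s in mono) / len(mono) > threshold


class GatedNoiseCanceller:
    """NLMS canceller (assumed): predicts the noise in mic 0 from mic 1."""

    def __init__(self, taps=4, mu=0.5, eps=1e-9):
        self.w = [0.0] * taps      # variable filter coefficients
        self.hist = [0.0] * taps   # recent reference samples, newest first
        self.mu = mu               # NLMS step size
        self.eps = eps             # guards against division by zero

    def process(self, primary, reference, adapt):
        out = []
        for d, x in zip(primary, reference):
            self.hist = [x] + self.hist[:-1]
            # Residual after subtracting the predicted noise from the primary mic.
            e = d - sum(w * h for w, h in zip(self.w, self.hist))
            if adapt:  # coefficients change only during voice inactivity
                norm = sum(h * h for h in self.hist) + self.eps
                self.w = [w + self.mu * e * h / norm
                          for w, h in zip(self.w, self.hist)]
            out.append(e)  # the residual is the noise-suppressed output
        return out


def run_pipeline(frames, vad_threshold, canceller):
    """Return (voice_flag, cleaned_frame) per frame.

    Frames flagged True are the ones a recognizer would receive.
    """
    results = []
    for frame in frames:
        active = voice_active(beamform(frame), vad_threshold)
        cleaned = canceller.process(frame[0], frame[1], adapt=not active)
        results.append((active, cleaned))
    return results
```

Here each frame is a list of per-microphone sample lists, with microphone 0 treated as the primary channel and microphone 1 as the noise reference. That pairing is a simplification for the sketch; an actual array would likely use more channels and a steered beamformer.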

17 Claims

1. A method of recognizing speech, comprising:
- receiving speech input via a plurality of microphones;
- processing a corresponding plurality of audio signals from said microphones with a beamforming network to generate a first improved audio signal;
- detecting voice activity on said first improved audio signal;
- processing said plurality of audio signals from said microphones with an adaptive noise cancellation filter having variable filter coefficients to generate a second improved audio signal;
- repeatedly updating said variable filter coefficients during periods of voice inactivity as detected on said first improved audio signal to minimize non-speech audio components in said second improved audio signal; and
- performing speech recognition on said second improved audio signal.

2. The method of claim 1, wherein performing speech recognition on said second improved audio signal occurs in response to detecting voice activity on said first improved audio signal.

3. A method of improving voice communications in a wireless communications device having an adaptable audio filter, comprising:
- receiving audio input at a microphone array comprising a plurality of microphones and generating a multi-channel output based thereon;
- detecting the presence or absence of a voice component in said multi-channel output;
- intermittently updating parameters associated with said adaptable audio filter in response to detection of the absence of a voice component in said multi-channel output; and
- subsequently processing said multi-channel output at said adaptable audio filter using said updated parameters to generate an improved audio signal.

4. The method of claim 3, further comprising performing speech recognition on said improved audio signal.

5. The method of claim 4, wherein performing speech recognition on said improved audio signal occurs in response to detection of the presence of a voice component in said multi-channel output.

6. The method of claim 3, wherein detecting the presence or absence of a voice component in said multi-channel output comprises:
- processing said multi-channel output with a beamforming network to generate a single channel audio signal; and
- performing voice activity detection on said single channel audio signal.

7. A method of speech recognition, comprising:
- receiving audio input at a microphone array comprising a plurality of microphones and generating a multi-channel audio signal;
- processing said multi-channel audio signal into a first single channel audio signal via a beamforming network;
- detecting voice activity in said first single channel audio signal;
- selectively updating parameters associated with an adaptable audio filter; and
- timing said updating of parameters based on said voice activity detection.

8. The method of claim 7, further comprising:
- processing said multi-channel audio signal into a second single channel audio signal via said adaptable audio filter; and
- performing speech recognition on said second single channel audio signal.

9. The method of claim 8, further comprising timing said speech recognition based on said voice activity detection.

10. A method of processing audio input at a microphone array comprising a plurality of microphones, comprising:
- detecting voice activity in said audio input;
- controlling, based on said voice activity, the operation of both adaptive noise filtering and speech recognition of said audio input.

11. The method of claim 10 further comprising:
- receiving audio input at said microphone array and generating a multi-channel audio signal;
- beamforming said multi-channel audio signal into a single channel audio signal;
- wherein detecting voice activity in said audio input is performed based on said single channel audio signal to produce a binary signal; and
- wherein controlling the operation of both adaptive noise filtering and speech recognition based on said voice activity comprises controlling the operation of both adaptive noise filtering and speech recognition based on said binary signal.
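
Claims 10 and 11 use one binary voice-activity signal to control both adaptive noise filtering and speech recognition. As one hedged illustration (the function name and the frame-level granularity are assumptions, not part of the claims), the Python sketch below turns a per-frame binary signal into the frames on which a filter may adapt and the contiguous segments that would be handed to a recognizer.

```python
def split_by_vad(flags):
    """Split a per-frame binary VAD signal into two control schedules.

    Returns (adapt_frames, speech_segments):
      adapt_frames    - indices of voice-inactive frames (filter may adapt)
      speech_segments - (start, end) index pairs, end exclusive, of
                        consecutive voice-active frames (recognizer runs)
    """
    adapt_frames = [i for i, f in enumerate(flags) if not f]
    speech_segments = []
    start = None
    for i, f in enumerate(flags):
        if f and start is None:
            start = i                      # a voice-active run begins
        elif not f and start is not None:
            speech_segments.append((start, i))  # the run ended at frame i
            start = None
    if start is not None:                  # run still open at end of input
        speech_segments.append((start, len(flags)))
    return adapt_frames, speech_segments
```

For example, `split_by_vad([0, 0, 1, 1, 1, 0, 1])` returns `([0, 1, 5], [(2, 5), (6, 7)])`: frames 0, 1, and 5 are available for coefficient updates, while frames 2 to 4 and frame 6 form the two speech segments.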

12. A system for speech recognition, comprising:
- a microphone array comprising a plurality of microphones for receiving audio input, said microphone array generating a multi-channel audio signal;
- a beamforming network receiving said multi-channel audio signal and generating a first single channel audio signal;
- a voice activity detection module receiving said first single channel audio signal and generating a binary signal indicative of the presence or absence of a voice component in said first single channel audio signal;
- an adaptive noise cancellation filter having variable filter coefficients receiving said multi-channel audio signal and said binary signal, and generating a second single channel audio signal, said filter coefficients being updated in response to said binary signal;
- a speech recognition engine receiving said second single channel audio signal, operative to interpret speech in said second single channel audio signal in response to said binary signal.

13. The system of claim 12, wherein said filter coefficients are updated in response to said binary signal indicating the absence of a voice component in said first single channel audio signal.

14. The system of claim 12, wherein said speech recognition engine additionally receives said binary signal, and interprets speech in said second single channel audio signal in response to said binary signal indicating the presence of a voice component in said first single channel audio signal.

15. The system of claim 12, wherein said beamforming network, said voice activity detection module, said adaptive noise cancellation filter, and said speech recognition engine are included in a wireless communications mobile terminal.

16. The system of claim 12, wherein one or more of said beamforming network, said voice activity detection module, said adaptive noise cancellation filter, and said speech recognition engine are included in a hands-free adapter capable of connecting to a wireless communications mobile terminal.

17. The system of claim 12, wherein one or more of said beamforming network, said voice activity detection module, said adaptive noise cancellation filter, and said speech recognition engine are implemented in software.

Current Assignee: Optis Cellular Technology LLC (Brevet Capital)
Original Assignee: Telefonaktiebolaget LM Ericsson
Inventors: Leonid Krasny, Ali Khayrallah, Thomas Makovicka
Granted Patent: US 6,937,980 B2
Current US Class: 704/228
CPC Class Codes

G10L 15/20   Speech recognition techniqu...

G10L 2021/02166   Microphone arrays; Beamforming

G10L 2021/02168   the estimation exclusively ...

G10L 21/0208   Noise filtering
