MICROPHONE-ARRAY-BASED SPEECH RECOGNITION SYSTEM AND METHOD

US 20130030803A1
Filed: 10/12/2011
Published: 01/31/2013
Est. Priority Date: 07/26/2011
Status: Active Grant

First Claim

1. A microphone-array-based speech recognition system, combining a noise masking module for cancelling noise of input speech signals from an array of microphones, according to an inputted threshold, and comprising:

at least a speech model and at least a filler model that receive respectively a noise-cancelled speech signal outputted by said noise masking module;

a confidence measure score computation module that computes a confidence measure score with said at least a speech model and said at least a filler model for said threshold and the noise-cancelled speech signal; and

a threshold adjustment module that adjusts said threshold and provides to said noise masking module to continue the noise cancelling for achieving a maximum confidence measure score computed by said confidence measure score computation module, thereby outputting a speech recognition result related to said maximum confidence measure score.

1 Assignment

0 Petitions

Abstract

A microphone-array-based speech recognition system combines a noise cancelling technique for cancelling noise of input speech signals from an array of microphones, according to at least an inputted threshold. The system receives noise-cancelled speech signals outputted by a noise masking module through at least a speech model and at least a filler model, then computes a confidence measure score with the at least a speech model and the at least a filler model for each threshold and each noise-cancelled speech signal, and adjusts the threshold to continue the noise cancelling for achieving a maximum confidence measure score, thereby outputting a speech recognition result related to the maximum confidence measure score.
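The selection rule described in the abstract can be written compactly. This is only a notational sketch: the symbols below (τ for the noise-masking threshold, x(τ) for the noise-cancelled signal, S_SP and S_F for the speech-model and filler-model scores, CM for the confidence measure score) are assumed names and do not appear in the text above.

```latex
\[
  \mathrm{CM}(\tau) = S_{SP}\bigl(x(\tau)\bigr) - S_{F}\bigl(x(\tau)\bigr),
  \qquad
  \hat{\tau} = \arg\max_{\tau}\, \mathrm{CM}(\tau)
\]
```

The recognition result that is output is the one obtained at the threshold that maximizes CM. The subtraction form follows claims 3 and 16; the abstract itself only says the score is computed "with" the two models.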

46 Citations

20 Claims

1. A microphone-array-based speech recognition system, combining a noise masking module for cancelling noise of input speech signals from an array of microphones, according to an inputted threshold, and comprising:
- at least a speech model and at least a filler model that receive respectively a noise-cancelled speech signal outputted by said noise masking module;
  
  a confidence measure score computation module that computes a confidence measure score with said at least a speech model and said at least a filler model for said threshold and the noise-cancelled speech signal; and
  
  a threshold adjustment module that adjusts said threshold and provides to said noise masking module to continue the noise cancelling for achieving a maximum confidence measure score computed by said confidence measure score computation module, thereby outputting a speech recognition result related to said maximum confidence measure score.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8)
- - 2. The speech recognition system as claimed in claim 1, wherein said threshold adjustment module finds a threshold corresponding to said maximum confidence measure score by applying an expectation-maximization algorithm.
  - 3. The speech recognition system as claimed in claim 1, said system compares the similarity of said noise-cancelled speech signal with each of said at least a speech model and obtains a score for each said speech model, and compares the similarity of said noise-cancelled speech signal with at least a filler model and obtains a score for said at least a filler model, and said confidence measure score computation module subtracts the score for said at least a filler model from a score function value for said at least a speech model, and takes the difference as said confidence measure score.
  - 4. The speech recognition system as claimed in claim 2, wherein said at least a speech model includes N speech models, said threshold adjustment module takes the M highest scores of M speech models among said N speech models and then gives each of said M speech models different weights to find the threshold corresponding to said maximum confidence measure score, wherein N and M are all positive integers and M≤N.
  - 5. The speech recognition system as claimed in claim 2, wherein said at least a speech model includes a plurality of speech models, and said threshold adjustment module takes a score of a combined speech model merged from each of said a plurality of speech models to find the threshold corresponding to said maximum confidence measure score.
  - 6. The speech recognition system as claimed in claim 2, wherein said at least a speech model includes a plurality of speech models, and said threshold adjustment module takes the maximum score for each of said a plurality of speech models to find the threshold corresponding to said maximum confidence measure score.
  - 7. The speech recognition system as claimed in claim 1, said speech recognition system further comprises at least a processor to complete functional implementation of said at least a speech model, said at least a filler model, said confidence measure score computation module, and said threshold adjustment module.
  - 8. The speech recognition system as claimed in claim 1, wherein said at least a speech model, said at least a filler model, said confidence measure score computation module, and said threshold adjustment module are realized by at least one integrated circuit.
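The score functions named in claims 3, 4, and 6 can be illustrated with a minimal Python sketch. All function names and the example numbers are hypothetical, and the scores are assumed to be plain floats (for instance log-likelihoods); the claims do not specify how scores are produced. The combined-model variant of claim 5 merges the models themselves, so it is not reproduced here.

```python
from typing import Sequence


def confidence_measure(speech_score: float, filler_score: float) -> float:
    """Claim 3: speech-model score function value minus the filler-model score."""
    return speech_score - filler_score


def top_m_weighted(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Claim 4: weight the M highest of N speech-model scores (M = len(weights)).

    The weights are assumed to be ordered from the best-scoring model downward.
    """
    m = len(weights)
    if not 1 <= m <= len(scores):
        raise ValueError("need 1 <= M <= N")
    top = sorted(scores, reverse=True)[:m]
    return sum(w * s for w, s in zip(weights, top))


def max_score(scores: Sequence[float]) -> float:
    """Claim 6: use the maximum score over the speech models."""
    return max(scores)
```

For example, with hypothetical speech-model scores `[-120.0, -95.0, -110.0]`, weights `[0.5, 0.5]`, and a filler score of `-130.0`, the top-2 weighted score function feeds `confidence_measure` in place of a single model's score.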

9. A microphone-array-based speech recognition system combining a noise masking module for cancelling noise of input speech signals from an array of microphones, according to each of a plurality of given thresholds within a predetermined range, and comprising:
- at least a speech model and at least a filler model that receive respectively noise-cancelled speech signals after said cancelling noise;
  
  a confidence measure score computation module that computes a confidence measure score with said at least a speech model and said at least a filler model for each given threshold within said predetermined range and said noise-cancelled speech signals; and
  
  a maximum confidence measure score decision module that determines a maximum confidence measure score from all confidence measure scores computed by said confidence measure score computation module and obtains a threshold corresponding to said maximum confidence measure score, and outputs corresponding speech recognition result.
- View Dependent Claims (10, 11, 12)
- - 10. The speech recognition system as claimed in claim 9, said speech recognition system further comprises at least a processor to complete functional implementation of said at least a speech model, said at least a filler model, said confidence measure score computation module, and said maximum confidence measure score decision module.
  - 11. The speech recognition system as claimed in claim 9, wherein said at least a speech model, said at least a filler model, said confidence measure score computation module, and said maximum confidence measure score decision module are realized by at least one integrated circuit.
  - 12. The speech recognition system as claimed in claim 9, said speech recognition system finds the threshold corresponding to said maximum confidence measure score by applying a linear search method.
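The linear search of claims 9 and 12 can be sketched as a loop over candidate thresholds. This is an illustrative sketch under stated assumptions, not the patented implementation: `mask_noise`, `score_speech`, and `score_filler` are hypothetical callables standing in for the noise masking module, the speech model(s), and the filler model, and the confidence measure is assumed to be the speech score minus the filler score.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

Signal = TypeVar("Signal")


def linear_search_threshold(
    thresholds: Iterable[float],
    mask_noise: Callable[[float], Signal],
    score_speech: Callable[[Signal], Tuple[float, str]],
    score_filler: Callable[[Signal], float],
) -> Tuple[float, float, str]:
    """Return (threshold, confidence score, hypothesis) with the highest score.

    All three callables are placeholders (assumptions): mask_noise applies
    noise masking at a given threshold, score_speech returns the speech-model
    score together with its recognition hypothesis, and score_filler returns
    the filler-model score.
    """
    best: Optional[Tuple[float, float, str]] = None
    for tau in thresholds:
        cleaned = mask_noise(tau)
        speech_score, hypothesis = score_speech(cleaned)
        cm = speech_score - score_filler(cleaned)
        # Keep the first threshold that attains the highest score so far.
        if best is None or cm > best[1]:
            best = (tau, cm, hypothesis)
    if best is None:
        raise ValueError("no thresholds supplied")
    return best
```

Every threshold in the predetermined range is evaluated once, so the cost grows linearly with the number of candidate thresholds.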

13. A microphone-array-based speech recognition method implemented by a computer system, said method comprising following acts executed by said computer system:
- executing noise cancelling of input speech signals from an array of microphones according to at least an inputted threshold, and transmitting a noise-cancelled speech signal to at least a speech model and at least a filler model respectively;
  
  computing a corresponding confidence measure score based on score information for each of said at least a speech model and a score for said at least a filler model;
  
  from each of said at least an inputted threshold, finding a threshold corresponding to a maximum confidence measure score among all computed confidence measure scores, and generating speech recognition result.
- View Dependent Claims (14, 15, 16, 17, 18, 19, 20)
- - 14. The speech recognition method as claimed in claim 13, said method finds the threshold corresponding to said maximum confidence measure score by applying an expectation-maximization algorithm.
  - 15. The speech recognition method as claimed in claim 13, said method finds the threshold corresponding to said maximum confidence measure score by applying a linear search scheme.
  - 16. The speech recognition method as claimed in claim 13, said method subtracts said score for said at least a filler model from each score function value for said at least a speech model for each of said at least an inputted threshold, and takes the obtained difference as each corresponding confidence measure score.
  - 17. The speech recognition method as claimed in claim 14, said method combines said at least a speech model into a combined model to increase robustness.
  - 18. The speech recognition method as claimed in claim 14, wherein said at least a speech model includes N speech models, and said method takes M highest scores of M speech models among said N speech models and then gives each of said M speech models different weights to find the threshold corresponding to said maximum confidence measure score, wherein N and M are all positive integers and M≤N.
  - 19. The speech recognition method as claimed in claim 14, wherein said at least a speech model comprises a plurality of speech models, and said method takes the maximum confidence measure score for each of said a plurality of speech models as a score function value for said at least a speech model.
  - 20. The speech recognition method as claimed in claim 15, said method combines said at least a speech model into a combined model to increase robustness.

Current Assignee
Industrial Technology Research Institute
Original Assignee
Industrial Technology Research Institute
Inventors
Liao, Hsien-Cheng

Granted Patent

US 8,744,849 B2
Field of Search
US Class Current

704/233
CPC Class Codes

G10L 15/142   Hidden Markov Models [HMMs]

G10L 15/20   Speech recognition techniqu...

G10L 2021/02166   Microphone arrays; Beamforming

G10L 21/0208   Noise filtering

H04R 1/406   microphones

H04R 3/005   for combining the signals o...
