Two stage utterance verification device and method thereof in speech recognition system

US 7,529,665 B2
Filed: 04/01/2005
Issued: 05/05/2009
Est. Priority Date: 12/21/2004
Status: Expired due to Fees

Abstract

A two stage utterance verification device and a method thereof are provided. The two stage utterance verification method includes performing a first utterance verification function based on an SVM pattern classification method by using feature data inputted from a search block of a speech recognizer, and performing a second utterance verification function based on a CART pattern classification method by using heterogeneity feature data including meta data extracted from a preprocessing module, intermediate results from function blocks of the speech recognizer, and the result of the first utterance verification function. Therefore, the two stage utterance verification device and the method thereof provide a high quality speech recognition service to a user.
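
The control flow described in the abstract and in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the function and field names (`two_stage_verify`, `Verdict`, `svm_score`, `cart_accepts`, `threshold`) are hypothetical, the two classifiers are passed in as plain callables rather than trained models, and the toy stand-ins at the end use made-up rules and values purely for demonstration.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, Sequence


@dataclass
class Verdict:
    accepted: bool
    rejected_by: Optional[str]  # "stage1", "stage2", or None when accepted
    confidence: float           # first-stage confidence score


def two_stage_verify(
    search_features: Sequence[float],
    meta_data: Mapping[str, object],
    intermediate: Mapping[str, object],
    svm_score: Callable[[Sequence[float]], float],
    cart_accepts: Callable[[Mapping[str, object]], bool],
    threshold: float = 0.0,  # assumed rejection level, not given in the source
) -> Verdict:
    # Steps a) and b): score the search-block features and compare the
    # confidence score against the rejection level.
    confidence = svm_score(search_features)
    if confidence < threshold:
        return Verdict(False, "stage1", confidence)

    # Step c): the second stage sees the preprocessing meta data, the
    # recognizer's intermediate results, and the first-stage result.
    heterogeneous = {**meta_data, **intermediate, "svm_confidence": confidence}

    # Step d): only a result accepted by both stages is passed on.
    if not cart_accepts(heterogeneous):
        return Verdict(False, "stage2", confidence)
    return Verdict(True, None, confidence)


# Toy stand-ins for the two classifiers (illustrative only).
def toy_svm_score(features: Sequence[float]) -> float:
    return sum(features) - 1.0


def toy_cart_accepts(features: Mapping[str, object]) -> bool:
    return features["snr_db"] >= 10.0 and features["svm_confidence"] > 0.2


if __name__ == "__main__":
    print(two_stage_verify([0.9, 0.6], {"snr_db": 20.0}, {"search_time_ms": 40},
                           toy_svm_score, toy_cart_accepts))
```

In this sketch the second stage is never evaluated for a result the first stage rejects, which mirrors the "returning when the speech recognition result is rejected by the first utterance verification function" wording of step c).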

6 Claims

1. A method for two stage utterance verification method, comprising the steps of:

a) performing a first utterance verification function based on a support vector machine (SVM) pattern classification method by using feature data inputted from a search block of a speech recognizer;

b) determining whether a confidence score, which is a result value of the first utterance verification function, is a misrecognition level for deciding rejection of a speech recognition result;

c) performing a second utterance verification function based on a classification and regression tree (CART) pattern classification method by using heterogeneity feature data including meta data extracted from a preprocessing module, intermediate results from function blocks of the speech recognizer and the result of the first utterance verification function when the speech recognition result is accepted by the first utterance verification function, and returning when the speech recognition result is rejected by the first utterance verification function; and

d) determining whether the speech recognition result is misrecognition based on a result of the second utterance verification function, transferring the speech recognition result to a system response module when the speech recognition result is accepted by the second utterance verification, and returning when the speech recognition result is rejected by the second utterance verification.

2. The method of claim 1, wherein the heterogeneity feature data of the step c) includes a SNR, an energy, a gender, an age, a phonetic structure, a dialect, the number of syllables in a word, the number of phonemes in a word, the number of frames in a word, a speaking rate, an average pitch, an utterance duration, a speech absent probability, a speech/non-speech likelihood, a Kalman shrinking factor, a Wiener shrinking factor, a N-best LLR score, an anti-model LLR score, a filter bank SNR, a LLR driven score, a SVM confidence score, a beam width during searching, a search time, a EPD time, a time for using system and a domain.

3. A two stage utterance verification device in a speech recognition system, the device comprising:

a speech input/output unit for inputting/outputting speech;

a preprocessing module for receiving the speech from the speech input/output unit and extracting meta data from the received speech;

a speech recognizer for performing speech recognition after receiving the meta data from the preprocessing module; and

a utterance verification unit for performing a first utterance verification function based on feature data inputted from a search block of the speech recognizer by using a support vector machine (SVM) pattern classification method, performing a second utterance verification function based on a classification and regression tree (CART) pattern classification method by using heterogeneity feature data including the meta data extracted from the preprocessing module, intermediate result values from functional blocks of the speech recognizer and the result of the first utterance verification when a confidence score, which is a result of the first utterance verification function, is accepted as a correct recognition, and transferring the speech recognition result to a system response module when a result of the second utterance verification function is accepted as the correct recognition.

4. The two stage utterance verification device of claim 3, wherein the heterogeneity feature data includes a SNR, an energy, a gender, an age, a phonetic structure, a dialect, the number of syllables in a word, the number of phonemes in a word, the number of frames in a word, a speaking rate, an average pitch, an utterance duration, a speech absent probability, a speech/non-speech likelihood, a Kalman shrinking factor, a Wiener shrinking factor, a N-best LLR score, an anti-model LLR score, a filter bank SNR, a LLR driven score, a SVM confidence score, a beam width during searching, a search time, a EPD time, a time for using system and a domain.

5. A method for two stage utterance verification method, comprising the steps of:

a) performing a first utterance verification function based on a support vector machine (SVM) pattern classification method by using feature data inputted from a search block of a speech recognizer;

b) determining whether a confidence score, which is a result value of the first utterance verification function, is a misrecognition level for deciding rejection of a speech recognition result;

c) performing a second utterance verification function based on a classification and regression tree (CART) pattern classification method by using heterogeneity feature data including meta data extracted from a preprocessing module, intermediate results from function blocks of the speech recognizer and the result of the first utterance verification function when the speech recognition result is accepted by the first utterance verification function, and returning when the speech recognition result is rejected by the first utterance verification function; and

d) determining whether the speech recognition result is misrecognition based on a result of the second utterance verification function, transferring the speech recognition result to a system response module when the speech recognition result is accepted by the second utterance verification, and returning when the speech recognition result is rejected by the second utterance verification;

wherein the heterogeneity feature data of step c) includes a SNR, an energy, a gender, an age, a phonetic structure, a dialect, the number of syllables in a word, the number of phonemes in a word, the number of frames in a word, a speaking rate, an average pitch, an utterance duration, a speech absent probability, a speech/non-speech likelihood, a Kalman shrinking factor, a Wiener shrinking factor, a N-best LLR score, an anti-model LLR score, a filter bank SNR, a LLR driven score, a SVM confidence score, a beam width during searching, a search time, a EPD time, a time for using system and a domain.

6. A two stage utterance verification device in a speech recognition system, the device comprising:

a speech input/output unit for inputting/outputting speech;

a preprocessing module for receiving the speech from the speech input/output unit and extracting meta data from the received speech;

a speech recognizer for performing speech recognition after receiving the meta data from the preprocessing module; and

a utterance verification unit for performing a first utterance verification function based on feature data inputted from a search block of the speech recognizer by using a support vector machine (SVM) pattern classification method, performing a second utterance verification function based on a classification and regression tree (CART) pattern classification method by using heterogeneity feature data including the meta data extracted from the preprocessing module, intermediate result values from functional blocks of the speech recognizer and the result of the first utterance verification when a confidence score, which is a result of the first utterance verification function, is accepted as a correct recognition, and transferring the speech recognition result to a system response module when a result of the second utterance verification function is accepted as the correct recognition;

wherein the heterogeneity feature data includes a SNR, an energy, a gender, an age, a phonetic structure, a dialect, the number of syllables in a word, the number of phonemes in a word, the number of frames in a word, a speaking rate, an average pitch, an utterance duration, a speech absent probability, a speech/non-speech likelihood, a Kalman shrinking factor, a Wiener shrinking factor, a N-best LLR score, an anti-model LLR score, a filter bank SNR, a LLR driven score, a SVM confidence score, a beam width during searching, a search time, a EPD time, a time for using system and a domain.
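
The heterogeneity feature data recited in the claims mixes numeric quantities (SNR, speaking rate, SVM confidence score) with categorical ones (gender, dialect, domain). A decision tree can split on both kinds directly, which the following sketch illustrates with a small hand-built tree. Everything here is an assumption made for illustration: the class names (`Leaf`, `NumericSplit`, `CategorySplit`), the feature keys (`snr_db`, `domain`, `svm_confidence`), and the thresholds are hypothetical, and the tree is written by hand rather than learned with the CART training procedure.

```python
from dataclasses import dataclass
from typing import FrozenSet, Mapping, Union


@dataclass(frozen=True)
class Leaf:
    accept: bool  # True means the recognition result is accepted


@dataclass(frozen=True)
class NumericSplit:
    feature: str
    threshold: float
    below: "Tree"        # followed when value < threshold
    at_or_above: "Tree"  # followed when value >= threshold


@dataclass(frozen=True)
class CategorySplit:
    feature: str
    categories: FrozenSet[str]
    inside: "Tree"   # followed when value is in `categories`
    outside: "Tree"  # followed otherwise


Tree = Union[Leaf, NumericSplit, CategorySplit]


def classify(tree: Tree, features: Mapping[str, object]) -> bool:
    """Walk from the root to a leaf, testing one feature per internal node."""
    node = tree
    while not isinstance(node, Leaf):
        value = features[node.feature]
        if isinstance(node, NumericSplit):
            node = node.below if value < node.threshold else node.at_or_above
        else:
            node = node.inside if value in node.categories else node.outside
    return node.accept


# Hand-built toy tree with made-up thresholds (illustrative only).
TOY_TREE: Tree = NumericSplit(
    "snr_db", 10.0,
    below=Leaf(False),
    at_or_above=CategorySplit(
        "domain", frozenset({"weather", "news"}),
        inside=NumericSplit("svm_confidence", 0.2,
                            below=Leaf(False), at_or_above=Leaf(True)),
        outside=Leaf(False),
    ),
)

if __name__ == "__main__":
    sample = {"snr_db": 20.0, "domain": "weather", "svm_confidence": 0.5}
    print(classify(TOY_TREE, sample))
```

Note that the first-stage SVM confidence score appears here as just another feature of the second stage, consistent with its inclusion in the claimed feature list.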

Current Assignee: Electronics and Telecommunications Research Institute
Original Assignee: Electronics and Telecommunications Research Institute
Inventors: Kim, Sanghun, Lee, YoungJik
Primary Examiner(s): Hudspeth; David R
Assistant Examiner(s): Rider; Justin W
Application Number: US11/095,555
Publication Number: US 20060136207A1
Time in Patent Office: 1,495 Days
Field of Search: 704/236
US Class Current: 704/236
CPC Class Codes: G10L 15/08 Speech classification or se...
