Use of periodicity and jitter for automatic speech recognition

US 6,055,499 A
Filed: 05/01/1998
Issued: 04/25/2000
Est. Priority Date: 05/01/1998
Status: Expired due to Fees

Abstract

A class of features related to voicing parameters that indicate whether the vocal cords are vibrating. Features describing voicing characteristics of speech signals are integrated with an existing 38-dimensional feature vector consisting of first and second order time derivatives of the frame energy and of the cepstral coefficients with their first and second derivatives. Hidden Markov Model (HMM)-based connected digit recognition experiments comparing the traditional and extended feature sets show that voicing features and spectral information are complementary and that improved speech recognition performance is obtained by combining the two sources of information.
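
The integration step described in the abstract can be illustrated with a minimal Python sketch. It appends one voicing feature and its first-order time derivative to each frame's standard vector. The function names `time_derivative` and `extend_features`, the central-difference derivative, and the example values are assumptions made for illustration; the listing does not say how the patent computes its derivatives or which voicing features make up the extended set.

```python
from typing import List, Sequence


def time_derivative(track: Sequence[float]) -> List[float]:
    """First-order time derivative of a per-frame feature track.

    Assumption: a central difference, one-sided at the two edges.
    """
    n = len(track)
    if n < 2:
        return [0.0] * n
    out = []
    for t in range(n):
        lo, hi = max(t - 1, 0), min(t + 1, n - 1)
        out.append((track[hi] - track[lo]) / (hi - lo))
    return out


def extend_features(standard: List[List[float]],
                    voicing: Sequence[float]) -> List[List[float]]:
    """Append a voicing feature and its time derivative to every frame."""
    if len(standard) != len(voicing):
        raise ValueError("one voicing value per frame is required")
    delta = time_derivative(voicing)
    return [list(vec) + [v, d] for vec, v, d in zip(standard, voicing, delta)]


# Hypothetical data: four frames of a 38-dimensional standard vector
# and one voicing value (for example a periodicity score) per frame.
frames = [[0.0] * 38 for _ in range(4)]
periodicity_track = [0.1, 0.5, 0.9, 0.9]
extended = extend_features(frames, periodicity_track)
print(len(extended[0]))
```

Each extended frame here has 38 + 2 = 40 entries. Claims 6 and 7 describe a larger configuration of 38 standard features, one energy feature, and five voicing features, for a total of 44.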

18 Claims

1. A method of speech recognition comprising the step of:
- a. starting with a standard feature vector;
- b. including at least one voicing feature and at least one time derivative of at least one voicing feature with said standard feature vector; and
- c. using said standard feature vector with said included features to recognize speech.
- Dependent claims (2 to 12):
  - 2. The method of claim 1, wherein said standard feature vector includes time derivatives of energy.
  - 3. The method of claim 1, wherein said standard feature vector includes spectral features.
  - 4. The method of claim 3, wherein said spectral features include cepstral coefficients.
  - 5. The method of claim 1, after step b and before step c further comprising the step of including an energy feature with said standard feature vector and said voicing features.
  - 6. The method of claim 5, wherein said standard feature vector has 38 features, said energy feature has one feature and said voicing features has five features.
  - 7. The method of claim 1, wherein a sum of the features of said standard feature vector and said voicing features is 44.
  - 8. The method of claim 1, wherein said voicing features are from a group of features including periodicity features and jitter features.
  - 9. The method of claim 1, wherein said voicing features includes periodicity features.
  - 10. The method of claim 1, wherein said voicing features includes jitter features.
  - 11. The method of claim 1, wherein said voicing features includes periodicity features and jitter features.
  - 12. The method of claim 1, wherein step b further includes the steps of:
    - normalizing an autocorrelation function with a peak at m=0 so that a ratio lies in a normalized range between 0 and 1; and
    - choosing a largest peak in the normalized autocorrelation function as the estimate of a pitch period and a corresponding frequency value of said pitch period is a measure of a periodicity feature.
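
The two steps of claim 12 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the function names, the lag search range (`min_lag`, `max_lag`), the clamping of negative values to 0, and the choice to return the normalized peak height as the periodicity measure are all hypothetical. The claim itself only specifies normalizing by the peak at m=0 and picking the largest peak as the pitch-period estimate.

```python
import math
from typing import List, Sequence, Tuple


def normalized_autocorrelation(frame: Sequence[float], max_lag: int) -> List[float]:
    """Autocorrelation r[m] for m = 0..max_lag, divided by r[0]."""
    n = len(frame)
    r0 = sum(x * x for x in frame)
    if r0 == 0.0:
        return [0.0] * (max_lag + 1)  # silent frame: no periodicity
    return [
        sum(frame[i] * frame[i + m] for i in range(n - m)) / r0
        for m in range(max_lag + 1)
    ]


def periodicity(frame: Sequence[float], min_lag: int, max_lag: int) -> Tuple[int, float]:
    """Return (pitch period in samples, normalized peak height in [0, 1])."""
    r = normalized_autocorrelation(frame, max_lag)
    # Search away from m=0, where the normalized function is trivially 1.
    best = max(range(min_lag, max_lag + 1), key=lambda m: r[m])
    return best, max(0.0, r[best])  # assumption: clamp negatives to 0


# Hypothetical test signal: a sine wave with a period of 40 samples.
signal = [math.sin(2.0 * math.pi * i / 40.0) for i in range(400)]
lag, strength = periodicity(signal, min_lag=20, max_lag=100)
print(lag, round(strength, 3))
```

A strongly voiced frame yields a peak close to 1, while an unvoiced or noisy frame yields a peak near 0. Dividing a sampling rate by the returned lag would give the corresponding pitch frequency.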

13. An apparatus for speech recognition, comprising:
- means for determining a standard feature vector;
- means for storing a standard feature vector after it has been determined;
- means for including at least one voicing feature and at least one time derivative of at least one voicing feature with said stored standard feature vector; and
- means for using said stored standard feature vector and said included voicing features to recognize speech;
- wherein an error rate for speech recognition is reduced because of a robustness resulting from including the at least one voicing feature and the at least one time derivative of at least one voicing feature.
- Dependent claims (14 to 18):
  - 14. The apparatus for speech recognition according to claim 13, wherein said speech recognition is less sensitive to differences in transmission conditions because of including said at least one voicing feature and said at least one time derivative of at least one voicing feature.
  - 15. The apparatus for speech recognition according to claim 14, wherein said at least one voicing feature includes periodicity features and jitter features.
  - 16. The apparatus for speech recognition according to claim 13, wherein said at least one voicing feature includes periodicity features and jitter features.
  - 17. The apparatus for speech recognition according to claim 13, wherein said apparatus has been trained to recognize a string of words using a minimum string error training.
  - 18. The apparatus for speech recognition according to claim 13, wherein said apparatus has been trained to recognize at least one word using maximum likelihood training.

Current Assignee: Lucent Technologies, Inc. (Nokia Corporation)
Original Assignee: Lucent Technologies, Inc. (Nokia Corporation)
Inventors: Chengalvarayan, Rathinavelu; Thomson, David Lynn
Primary Examiner(s): Dorvil, Richemond
Application Number: US09/071,214
Time in Patent Office: 725 Days
Field of Search: 704/250, 704/205, 704/206, 704/207, 704/208, 704/217, 704/216, 704/256, 704/255
US Class Current: 704/250
CPC Class Codes: G10L 15/02 Feature extraction for spee...
