Tone based speech recognition

US 6,553,342 B1
Filed: 02/02/2000
Issued: 04/22/2003
Est. Priority Date: 02/02/2000
Status: Expired due to Term

First Claim

1. A system for speech recognition comprising:

an input terminal for receiving a segment of speech;

a speech classifier having an input coupled to the input terminal and an output to provide an indication of whether the speech segment comprises voiced or unvoiced speech;

a speech feature detector having a first input coupled to the input terminal, a second input coupled to the output of the speech classifier, and an output to provide a speech feature vector having a plurality of feature values indicating features of the speech segment, the speech feature vector including at least a tonal feature value indicating a tonal feature of the speech segment when the speech segment comprises voiced speech; and

a speech recogniser having an input coupled to the output of the speech feature detector and an output to provide an indication of which of a predetermined plurality of speech models is a good match to the speech segment.

Abstract

A method and apparatus for speech recognition involves classifying (38) a digitized speech segment according to whether the speech segment comprises voiced or unvoiced speech and utilizing that classification to generate tonal feature vectors (41) of the speech segment when the speech is voiced. The tonal feature vectors are then combined (42) with other non-tonal feature vectors (40) to provide speech feature vectors. The speech feature vectors are compared (35) with previously stored models of speech feature vectors (37) for different segments of speech to determine which previously stored model is a most likely match for the segment to be recognized.
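
The flow described in the abstract (classify, gate the tonal feature on voicing, combine, compare with stored models) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the energy and zero-crossing thresholds, the zero-crossing pitch stand-in, the Euclidean nearest-model match, and every function name are hypothetical choices that do not appear in the text above.

```python
import math

def energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    flips = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return flips / (len(frame) - 1)

def is_voiced(frame, energy_min=0.01, zcr_max=0.2):
    """Assumed classifier: voiced speech is loud and crosses zero rarely."""
    return energy(frame) >= energy_min and zero_crossing_rate(frame) <= zcr_max

def speech_feature_vector(frame, sample_rate):
    """Non-tonal values always; a tonal value only when the frame is voiced."""
    zcr = zero_crossing_rate(frame)
    non_tonal = [math.log(energy(frame) + 1e-12), zcr]
    # Crude pitch stand-in (assumption): a periodic tone crosses zero
    # twice per cycle, so frequency is roughly zcr * sample_rate / 2.
    tonal = [zcr * sample_rate / 2.0] if is_voiced(frame) else [0.0]
    return non_tonal + tonal

def most_likely_model(vector, models):
    """Name of the stored model vector closest to `vector` (Euclidean)."""
    return min(models, key=lambda name: math.dist(vector, models[name]))

# Usage with a synthetic 200 Hz tone and two made-up stored models.
sample_rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(400)]
models = {"model_a": [-2.0, 0.05, 190.0], "model_b": [-2.0, 0.03, 110.0]}
print(most_likely_model(speech_feature_vector(tone, sample_rate), models))
```

A practical recogniser would likely replace the nearest-vector match with statistical models trained on sequences of feature vectors; the sketch only shows how the voicing decision gates the tonal value before the comparison step.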

17 Claims

1. A system for speech recognition comprising:
- an input terminal for receiving a segment of speech;
  
  a speech classifier having an input coupled to the input terminal and an output to provide an indication of whether the speech segment comprises voiced or unvoiced speech;
  
  a speech feature detector having a first input coupled to the input terminal, a second input coupled to the output of the speech classifier, and an output to provide a speech feature vector having a plurality of feature values indicating features of the speech segment, the speech feature vector including at least a tonal feature value indicating a tonal feature of the speech segment when the speech segment comprises voiced speech; and
  
  a speech recogniser having an input coupled to the output of the speech feature detector and an output to provide an indication of which of a predetermined plurality of speech models is a good match to the speech segment.
  - 2. A system for speech recognition according to claim 1, further comprising an Analog-to-Digital (A/D) converter having an input coupled to the input terminal and an output coupled to the inputs of the speech classifier and the speech feature detector to provide a digitised speech segment.
  - 3. A system for speech recognition according to claim 1, wherein the output of the speech recogniser provides an indication of which one of the predetermined plurality of speech models is a best match to the speech segment.
  - 4. A system for speech recognition according to claim 1, further comprising a memory coupled to the speech recogniser for storing the predetermined plurality of speech models, and a speech model trainer having an input selectively coupled to the output of the speech feature detector and an output coupled to the memory to store in the memory the predetermined plurality of speech models after the predetermined plurality of speech models have been trained using the speech feature vector.
  - 5. A system for speech recognition according to claim 1, wherein the speech feature detector comprises a non-tonal feature detector having an input coupled to the input of the speech feature detector and an output to provide at least one non-tonal feature value for the speech segment, a tonal feature detector having a first input coupled to the input of the speech feature detector, a second input coupled to the output of the speech classifier and an output to provide at least one tonal feature value for the speech segment when the speech classifier determines that the speech segment comprises voiced speech, and a speech feature vector generator having a first input coupled to the output of the non-tonal feature detector, a second input coupled to the output of the tonal feature detector, and an output coupled to the output of the speech feature detector to provide the speech feature vector.
  - 6. A system for speech recognition according to claim 5, wherein the non-tonal feature detector comprises a non-tonal speech transformation circuit having an input coupled to the input of the non-tonal feature detector and an output to provide a transformed non-tonal signal, and a non-tonal feature generator having an input coupled to the output of the non-tonal speech transformation circuit and an output coupled to the output of the non-tonal feature detector to provide the at least one non-tonal feature value for the speech segment.
  - 7. A system for speech recognition according to claim 5, wherein the tonal feature detector comprises a tonal speech transformation circuit having first and second inputs coupled to the first and second inputs of the tonal feature detector and an output to provide a transformed tonal signal, and a tonal feature generator having an input coupled to the output of the tonal speech transformation circuit and an output coupled to the output of the tonal feature detector to provide the at least one tonal feature value for the speech segment.
  - 8. A system for speech recognition according to claim 7, wherein the tonal speech transformation circuit comprises:
9. A system for speech recognition according to claim 8, wherein the tone generator has a second input coupled to the second input of the tonal speech transformation circuit.

10. A method of speech recognition comprising the steps of:
- receiving a segment of speech;
  
  classifying the speech segment according to whether the speech segment comprises voiced or unvoiced speech;
  
  detecting a plurality of speech features of the segment of speech;
  
  generating a speech feature vector having a plurality of feature values indicating the detected plurality of features of the speech segment, wherein the speech feature vector includes at least a tonal feature value indicating a tonal feature of the speech segment when the speech segment comprises voiced speech; and
  
  utilising the speech vector to determine which of a predetermined plurality of speech models is a good match to the speech segment.
  - 11. A method of speech recognition according to claim 10, further comprising the step of digitising the segment of speech to provide a digitised speech segment.
  - 12. A method of speech recognition according to claim 10, wherein the step of utilising the speech vector determines which of the predetermined plurality of speech models is a best match to the speech segment.
  - 13. A method of speech recognition according to claim 10, further comprising the steps of:
14. A method of speech recognition according to claim 10, wherein the step of detecting a plurality of speech features comprises the steps of:
- generating at least one non-tonal feature value for the speech segment;
  
  generating at least one tonal feature value for the speech segment when the speech classifier determines that the speech segment comprises voiced speech; and
  
  combining the at least one non-tonal feature value and the at least one tonal feature value to provide the speech feature vector.
15. A method of speech recognition according to claim 14, wherein the step of detecting at least one non-tonal feature value comprises the steps of:
- transforming the speech segment using at least a first transformation to provide a transformed non-tonal signal; and
  
  generating the at least one non-tonal feature value from the transformed non-tonal signal.
16. A method of speech recognition according to claim 14, wherein the step of detecting at least one tonal feature value comprises the steps of:
- transforming the speech segment using at least a second transformation to provide a transformed tonal signal; and
  
  generating the at least one tonal feature value from the transformed tonal signal.
17. A method of speech recognition according to claim 16, wherein the step of transforming the speech segment comprises the steps of:
- extracting pitch information from the speech segment; and
  
  generating the transformed tonal signal from the extracted pitch information.
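
The steps of claims 14 to 17 (extract pitch, derive a transformed tonal signal, generate tonal feature values, combine them with non-tonal values) might look like the sketch below. The claims do not name a pitch extraction method or a specific tonal feature, so the autocorrelation estimator, the log-pitch contour, the mean and slope features, the 60 to 400 Hz search range, and all function names are assumptions made purely for illustration.

```python
import math

def estimate_pitch(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Pitch in Hz from the strongest autocorrelation lag in [f_min, f_max]."""
    lag_lo = int(sample_rate / f_max)
    lag_hi = min(int(sample_rate / f_min), len(frame) - 1)

    def autocorr(lag):
        return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))

    best_lag = max(range(lag_lo, lag_hi + 1), key=autocorr)
    return sample_rate / best_lag

def tonal_signal(frames, voiced_flags, sample_rate):
    """Assumed transformed tonal signal: log-pitch of each voiced frame."""
    return [math.log(estimate_pitch(frame, sample_rate))
            for frame, voiced in zip(frames, voiced_flags) if voiced]

def tonal_features(contour):
    """Mean level and per-frame slope of the contour; zeros if nothing voiced."""
    if not contour:
        return [0.0, 0.0]
    mean = sum(contour) / len(contour)
    slope = 0.0
    if len(contour) > 1:
        slope = (contour[-1] - contour[0]) / (len(contour) - 1)
    return [mean, slope]

def combine(non_tonal, tonal):
    """Concatenate non-tonal and tonal values into one speech feature vector."""
    return list(non_tonal) + list(tonal)

# Usage: two voiced frames with rising pitch, then one frame flagged unvoiced.
sample_rate = 8000

def tone(freq):
    return [0.5 * math.sin(2 * math.pi * freq * n / sample_rate) for n in range(400)]

contour = tonal_signal([tone(150), tone(200), tone(100)], [True, True, False], sample_rate)
vector = combine([0.0], tonal_features(contour))  # [0.0] is a placeholder non-tonal value
print(vector)
```

The voiced flags are passed in rather than computed here because the claims assign that decision to the separate classifying step; a rising contour yields a positive slope, which is the kind of cue that could distinguish tones.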

Current Assignee: Google Technology Holdings LLC (Alphabet Inc.)
Original Assignee: Motorola, Inc. (Motorola Solutions, Inc.)
Inventors: Song, Jianming; Madievski, Anton; Zhang, Yaxin
Primary Examiner(s): Knepper, David D.
Application Number: US09/496,868
Time in Patent Office: 1,175 Days
Field of Search: 704/200, 704/207, 704/231, 704/236, 704/251, 704/255, 704/257, 704/277
US Class Current: 704/255
CPC Class Codes:
G10L 15/02 Feature extraction for spee...
G10L 25/15 the extracted parameters be...
