System for continuous speech recognition through transition networks

US 4,888,823 A
Filed: 09/28/1987
Issued: 12/19/1989
Est. Priority Date: 09/29/1986
Status: Expired due to Fees

Abstract

Phoneme feature parameters are extracted from input digital speech signals by LPC analysis. From these feature parameters, phonetic segments having phonetic meanings are obtained, together with their similarities to prescribed basic phonetic segments, and are passed through the nodes of transition networks provided for each word. As the nodes are passed, the similarities Sj of predetermined phonetic segments are scored selectively, and the accumulated scores are used to recognize continuous word speech.
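
The selective scoring described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function `match_word`, the network layout, the labels `K`, `KA`, `A`, `T`, `O`, and all similarity values are hypothetical. Each frame carries a short list of candidate segments with similarities; only segments in `scored_labels` add to the accumulated score, while a transitional segment such as `KA` may advance the network without contributing.

```python
def match_word(network, start, accepting, ranked, scored_labels):
    """Pass ranked segment candidates through one word network.

    network: {state: {label: next_state}} (hypothetical layout)
    ranked:  one list of (label, similarity) candidates per frame
    Returns the best accumulated score, or None if the word does not match.
    """
    best = {start: 0.0}  # best accumulated score for each reachable state
    for candidates in ranked:
        nxt = {}
        for state, total in best.items():
            for label, sim in candidates:
                target = network.get(state, {}).get(label)
                if target is None:
                    continue  # no arc for this label from this state
                # Selective scoring: only predetermined segments add their similarity.
                gain = sim if label in scored_labels else 0.0
                if total + gain > nxt.get(target, float("-inf")):
                    nxt[target] = total + gain
        best = nxt
        if not best:
            return None  # every path died on this frame
    finals = [score for state, score in best.items() if state in accepting]
    return max(finals) if finals else None


# Hypothetical network for a word pronounced "ka": K, transitional KA, then A.
WORD_KA = {0: {"K": 1}, 1: {"K": 1, "KA": 2}, 2: {"A": 3}, 3: {"A": 3}}
RANKED = [
    [("K", 0.75), ("T", 0.5)],
    [("KA", 0.5), ("K", 0.25)],
    [("A", 0.875), ("O", 0.25)],
    [("A", 0.75), ("O", 0.5)],
]
SCORED = {"K", "A"}  # KA is treated as transitional and is not scored

score = match_word(WORD_KA, 0, {3}, RANKED, SCORED)
```

Keeping only the best score per state is one simple design choice; the source does not say how competing paths through a network are resolved.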

12 Claims

1. A speech recognition method comprising the steps of:
- extracting prescribed feature parameters from input signals for continuous speech sounds;
  
  obtaining similarities by continuously matching the extracted feature parameters with a voice dictionary composed of phonetic segment units having prescribed phonetic meanings;
  
  extracting sequences of phonetic segments to a prescribed placing of order based on the obtained similarities;
  
  for each of standard words passing the phonetic segment sequence through presupplied transition networks; and
  
  evaluating the word sequence passed through transition networks in accordance with similarities and obtaining recognition outputs.
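
The matching and ranking steps of claim 1 can be illustrated with a short sketch. It assumes cosine similarity and a dictionary of one template vector per phonetic segment; the claim fixes neither the similarity measure nor the dictionary layout, and the names `cosine`, `top_segments`, `DICTIONARY`, and `FRAMES` are hypothetical. For each frame, the sketch keeps the `n` best-matching segments, which corresponds to extracting segments "to a prescribed placing of order".

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_segments(frames, dictionary, n=2):
    """For each frame, return the n (label, similarity) pairs with highest similarity."""
    ranked = []
    for frame in frames:
        sims = [(label, cosine(frame, template)) for label, template in dictionary.items()]
        sims.sort(key=lambda pair: pair[1], reverse=True)
        ranked.append(sims[:n])
    return ranked


# Hypothetical two-dimensional templates and feature frames, for illustration only.
DICTIONARY = {"A": [1.0, 0.0], "K": [0.0, 1.0], "Q": [-1.0, 0.0]}
FRAMES = [[0.9, 0.1], [0.1, 0.9]]

RANKED = top_segments(FRAMES, DICTIONARY, n=2)
```

The resulting per-frame candidate lists are the kind of input a word transition network would then consume.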

2. A system for speech recognition comprising:
- a means for extracting prescribed feature parameters including a series of labels of phonetic segments of each word included in an input speech from input signals for continuous input speeches;
  
  a means for continuous matching of the extracted feature parameters with a voice dictionary compiled of phonetic segment units having prescribed phonetic meanings and for obtaining similarities on the phonetic segment units;
  
  a means for extracting a sequence of a plurality of phonetic segment likelihoods up to a prescribed placing of order based on the similarities;
  
  a plurality of transition networks formed for each word by use of standard phonetic segment sequence;
  
  a means of passing the extracted phonetic segment likelihood sequence through the transition networks to perform word matchings; and
  
  a means for continuously combining the results of word matching to obtain recognition outputs;
  
  wherein said word matching means includes a means that for each word likelihood the normalized standard values for the similarities corresponding to phonetic segments in the transition networks and obtains an accumulated score by means of selective scoring of values of said labels.

3. A system for speech recognition comprising:
- means for extracting prescribed feature parameters including phonetic segment units and a label sequence composed of a series of labels each having its corresponding value and representing prescribed features of phonetic segment units from input continuous speech signals;
  
  means for obtaining similarities on the phonetic segment units extracted from input continuous speech signals by executing continuous matching of the extracted phonetic segment units with a voice dictionary compiled of the phonetic segment units having prescribed phonetic meanings so that a sequence of a plurality of similarities is obtained;
  
  means for converting the similarity of phonetic segment obtained by the similarity obtaining means into a normalized similarity having a normalized standard value;
  
  means for extracting a sequence of a plurality of phonetic segment likelihoods up to a prescribed placing of order based on the normalized standard values of the normalized similarities;
  
  means for selectively scoring the values of the labels with respect to predetermined phonetic segments except for transitional phonetic segments obtained in said feature parameter extracting means by accumulating the values of the labels;
  
  plurality of transition networks formed for each word included in the input speech by use of standard phonetic segment sequence;
  
  means for passing the extracted phonetic segment likelihood sequence through said transition networks by referring to a selectively scored value obtained in said scoring means for performing word-by-word matching; and
  
  means for continuously combining results of the word-by-word matching to obtain recognition outputs of the input speech.
- Dependent claims (4, 5, 6, 7, 8, 9, 10):
  - 4. A system according to claim 3, wherein the predetermined phonetic segments include continuant segments having a vowel steady part and a fricative consonant, consonant segments having transient parts to vowels, boundary segments expressing the boundary between a vowel and a semi-vowel, and devoiced vowel parts.
  - 5. A system according to claim 3, wherein said predetermined phonetic segments is prepared by excluding a transient part between first syllable and second syllable including the silent part.
  - 6. A system according to claim 3 inclusive, wherein the means for obtaining recognition outputs includes a means that for all the words adds up accumulated scores obtained for each word passed through the transition networks to obtain a total value, and a means of comparing the total value with the standard value.
  - 7. A system according to claim 3, wherein said extracting means includes an analyzing means employing mel-cepstrum as an analytic parameter in linear predictive coding analysis.
  - 8. A system according to claim 3, wherein the voice dictionary includes phonetic segments expressed in prescribed labels, and said similarity obtaining means includes means for matching the prescribed label sequence with labels stored in said voice dictionary.
  - 9. A system according to claim 3, wherein said converting means includes normalization constant tables.
  - 10. A system according to claim 9, wherein a plurality of normalization constant tables are provided for each phonetic segment.
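
Claims 6, 9, and 10 mention normalization constant tables per phonetic segment and a comparison of a summed score with a standard value, without giving formulas. The sketch below is one possible reading under loudly stated assumptions: the offset-and-scale form of the table, the `>=` comparison, and the names `NORMALIZATION_TABLE`, `normalize`, and `accept_utterance` are all hypothetical and do not come from the source.

```python
# Hypothetical table: segment label -> (offset, scale). The patent text only says
# that normalization constant tables exist for each phonetic segment.
NORMALIZATION_TABLE = {"A": (0.5, 0.25), "K": (0.25, 0.125)}


def normalize(label, similarity, table=NORMALIZATION_TABLE):
    """Map a raw similarity to a normalized value using per-segment constants."""
    offset, scale = table[label]
    return (similarity - offset) / scale


def accept_utterance(word_scores, standard_value):
    """Add up per-word accumulated scores and compare the total with a standard value.

    The direction of the comparison (total >= standard_value) is an assumption.
    """
    total = sum(word_scores)
    return total, total >= standard_value


normalized_a = normalize("A", 0.75)
normalized_k = normalize("K", 0.5)
total, accepted = accept_utterance([2.0, 1.5], standard_value=3.0)
```

A per-segment table lets segments whose raw similarities naturally fall in different ranges be compared on a common scale before ranking.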

11. A system for speech recognition comprising:
- means for extracting prescribed feature parameters including phonetic segment units and a label sequence composed of a series of labels, each having its corresponding value and representing prescribed features of phonetic segment units from input continuous speech signals;
  
  means for obtaining similarities on the phonetic segment units extracted from input continuous speech signals by executing continuous matching of the extracted phonetic segment units with a voice dictionary compiled of the phonetic segment units having prescribed phonetic meanings so that a sequence of a plurality of similarities is obtained;
  
  means for converting the similarity of phonetic segment obtained by the similarity obtaining means into a normalized similarity having a normalized standard value;
  
  means for extracting a sequence of a plurality of phonetic segment likelihoods up to a prescribed placing of order based on the normalized standard values of the normalized similarities;
  
  means for scoring the values of the labels with respect to phonetic segments except for transitional phonetic segments obtained in said feature parameter extracting means by accumulating the values of the labels;
  
  plurality of transition networks formed for each word included in the input speech by use of standard phonetic segment sequence;
  
  means for passing the extracted phonetic segment likelihood sequence through said transition networks by referring to a scored value obtained in said scoring means for performing word-by-word matching; and
  
  means for continuously combining results of the word-by-word matching to obtain recognition outputs of the input speech.

12. A system for speech recognition comprising:
- means for extracting prescribed feature parameters including phonetic segment units and a label sequence composed of a series of labels each having its corresponding value and representing prescribed features of phonetic segment units from input continuous speech signals;
  
  means for obtaining similarities of the phonetic segment units extracted from input continuous speech signals by executing continuous matching of the extracted phonetic segment units with a voice dictionary compiled of the phonetic segment units having prescribed phonetic meanings so that a sequence of a plurality of similarities is obtained;
  
  means for extracting a sequence of a plurality of phonetic segment likelihoods up to a prescribed placing of order based on the similarities;
  
  means for selectively scoring the values of the labels with respect to predetermined phonetic segments except for transitional phonetic segments obtained in said feature parameter extracting means by accumulating the values of the labels;
  
  plurality of transition networks formed for each word included in the input speech by use of standard phonetic segment sequence;
  
  means for passing the extracted phonetic segment likelihood sequence through said transition networks by referring to a selectively scored value obtained in said scoring means for performing word-by-word matching; and
  
  means for continuously combining results of the word-by-word matching to obtain recognition outputs of the input speech.

Current Assignee: Kabushiki Kaisha Toshiba (Toshiba Corporation)
Original Assignee: Kabushiki Kaisha Toshiba (Toshiba Corporation)
Inventors: Uehara, Kensuke; Nitta, Tsuneo; Watanabe, Sadakazu
Primary Examiner(s): Shoop, Jr., William M.
Assistant Examiner(s): Hoff, Marc S.
Application Number: US07/101,789
Time in Patent Office: 813 Days
Field of Search: 381/41, 381/43, 381/45, 364/513.5
US Class Current: 704/249
CPC Class Codes:
G10L 15/142   Hidden Markov Models [HMMs]
G10L 15/187   Phonemic context, e.g. pron...
G10L 2015/025   Phonemes, fenemes or fenone...
