Method and apparatus for generating models of spoken words based on a small number of utterances

US 5,293,451 A
Filed: 10/23/1990
Issued: 03/08/1994
Est. Priority Date: 10/23/1990
Status: Expired due to Fees

Abstract

A method and apparatus for modeling words based on match scores representing (a) the closeness of a match between probabilistic word models and the acoustic features of at least two utterances, and (b) the closeness of a match between word models and the spelling of the word. A match score is calculated for a selection set of one or more probabilistic word models. A match score is also calculated for an expansion set comprising the probabilistic word models in the selection set and one probabilistic word model from a candidate set. If the expansion set match score improves on the selection set match score by a selected nonzero threshold value, the word is modeled with the word models in the expansion set. If the expansion set match score does not improve on the selection set match score by the selected nonzero threshold value, the word is modeled with the word models in the selection set.
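
The decision rule in the abstract can be sketched in a few lines of Python. This is only an illustration, not the patented implementation: the function name `choose_models`, the convention that higher scores mean closer matches (as with log-likelihoods), and the reading of "improves by a threshold" as "improves by at least the threshold" are all assumptions.

```python
def choose_models(selection, candidate, selection_score, expansion_score, threshold):
    """Return the list of word models used to represent the word.

    Assumptions (not stated in the source): higher scores are better, and
    `threshold` is a positive number.
    """
    if threshold <= 0:
        raise ValueError("this sketch expects a positive, nonzero threshold")
    # The expansion set is the selection set plus one candidate model.
    expansion = selection + [candidate]
    # Keep the extra model only if it improves the score by the threshold.
    if expansion_score - selection_score >= threshold:
        return expansion
    return selection


# Hypothetical scores: adding "m2" raises the score from -10.0 to -8.0.
print(choose_models(["m1"], "m2", -10.0, -8.0, threshold=1.0))  # ['m1', 'm2']
print(choose_models(["m1"], "m2", -10.0, -9.5, threshold=1.0))  # ['m1']
```

The nonzero threshold means a candidate model is added only when it yields a clear gain, so marginal improvements do not enlarge the model set.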

16 Claims

1. A method of modeling a word uttered at least two times, each utterance having at least one acoustic feature having a value, said method comprising the steps of:
- measuring the value of the acoustic feature of each utterance;
  
  storing a selection set of one or more probabilistic word model signals, each probabilistic word model signal in the selection set representing a probabilistic model of the word;
  
  calculating, for the selection set, a match score representing the closeness of a match between the probabilistic word models in the selection set and the value of the acoustic feature of each utterance;
  
  storing a candidate set of one or more probabilistic word model signals, each probabilistic word model signal in the candidate set representing a probabilistic model of the word, each probabilistic word model in the candidate set being different from each probabilistic word model in the selection set;
  
  storing an expansion set comprising the probabilistic word model signals in the selection set and one probabilistic word model signal from the candidate set;
  
  calculating, for the expansion set, a match score representing the closeness of a match between the probabilistic word models in the expansion set and the value of the acoustic feature of each utterance; and
  
  modeling the word with the word models in the expansion set if the expansion set match score surpasses the selection set match score by a nonzero threshold value.
- Dependent claims (2, 3, 4, 5, 6):
  - 2. A method as claimed in claim 1, further comprising the step of modeling the word with the word models in the selection set if the expansion set match score does not surpass the selection set match score by the nonzero threshold value.
  - 3. A method as claimed in claim 1, characterized in that the word has a spelling, the method further comprises the step of storing a spelling signal representing the spelling of the word, and each set match score represents a weighted combination of:
    - the closeness of a match between the probabilistic word models in the set of models and the values of the acoustic feature of the utterances; and
      
      the closeness of a match between the probabilistic word models in the set of models and the spelling of the word.
  - 4. A method as claimed in claim 3, characterized in that each set match score is calculated by the steps of:
    - calculating, for each probabilistic word model in the set and for each utterance, a match score representing a weighted combination of (a) the closeness of a match between the probabilistic word model and the value of the acoustic feature of each utterance, and (b) the closeness of a match between the probabilistic word model and the spelling of the word;
      
      identifying, for each utterance, a best-of-set match score representing the best match score between the utterance and the probabilistic word models in the set;
      
      calculating a set match score representing the average best-of-set match score for the probabilistic word models and all the utterances.
  - 5. A method as claimed in claim 4, further comprising the steps of:
    - calculating, for each probabilistic word model in the candidate set, a joint match score representing a weighted combination of (a) the closeness of a match between a joint set of the candidate probabilistic word model and the probabilistic word models in the selection set and the value of the acoustic feature of each utterance, and (b) the closeness of a match between the joint set of probabilistic word models and the spelling of the word; and
      
      choosing as the expansion set the joint set having the best joint match score.
  - 6. A method as claimed in claim 1, characterized in that the selection set consists of one probabilistic word model having a set match score better than the match score of any one probabilistic word model in the candidate set.
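
The scoring steps of claims 3 to 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the claims say only "weighted combination", so the linear mix in `combined_score`, the `weight` parameter, the higher-is-better score convention, and all function and variable names are hypothetical.

```python
def combined_score(acoustic, spelling, weight):
    """Weighted combination of acoustic and spelling match scores.

    A linear mix is an assumption; the claims do not fix the formula.
    """
    return weight * acoustic + (1.0 - weight) * spelling


def set_match_score(models, acoustic, spelling, weight=0.5):
    """Average, over utterances, of the best combined score within the set.

    acoustic[m][u] is the score of model m against utterance u;
    spelling[m] is the score of model m against the word's spelling.
    """
    n_utterances = len(acoustic[models[0]])
    best_per_utterance = [
        max(combined_score(acoustic[m][u], spelling[m], weight) for m in models)
        for u in range(n_utterances)
    ]
    return sum(best_per_utterance) / n_utterances


def best_expansion(selection, candidates, acoustic, spelling, weight=0.5):
    """Form one joint set per candidate and keep the best-scoring one."""
    joint_sets = [selection + [c] for c in candidates]
    best = max(joint_sets, key=lambda s: set_match_score(s, acoustic, spelling, weight))
    return best, set_match_score(best, acoustic, spelling, weight)


# Hypothetical scores for three models and two utterances.
acoustic = {"a": [-4.0, -9.0], "b": [-8.0, -5.0], "c": [-6.0, -6.0]}
spelling = {"a": -2.0, "b": -2.0, "c": -4.0}

print(set_match_score(["a"], acoustic, spelling))               # -4.25
print(best_expansion(["a"], ["b", "c"], acoustic, spelling))    # (['a', 'b'], -3.25)
```

In this example model "b" complements model "a": each fits a different utterance well, so the joint set scores higher than either model alone.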

7. A method of modeling words, said method comprising the steps of:
- measuring the value of at least one feature of a first utterance of a word during each of a series of successive time intervals to produce a first series of feature vector signals representing the feature values of the first utterance;
  
  measuring the value of at least one feature of a second utterance of the same word during each of a series of successive time intervals to produce a second series of feature vector signals representing the feature values of the second utterance;
  
  storing two or more probabilistic word model signals, each probabilistic word model signal representing a probabilistic model of the word;
  
  calculating, for each probabilistic word model and for each utterance, a match score representing the closeness of a match between the probabilistic word model and the series of feature vector signals produced by the utterance;
  
  calculating, for each probabilistic word model, an average-model match score representing the average match score for the word model and all utterances;
  
  selecting a first probabilistic word model having the best average-model match score;
  
  selecting a second probabilistic word model;
  
  identifying, for each utterance, a best-of-set match score representing the best match score between the utterance and the first and second probabilistic word models;
  
  calculating a set-average match score representing the average best-of-set match score for the first and second probabilistic word models and all utterances; and
  
  modeling the word with both the first and second probabilistic word models if the set-average match score surpasses the best average-model match score by a nonzero threshold value.
- Dependent claims (8, 9):
  - 8. A method as claimed in claim 7, further comprising the step of modeling the word with the first probabilistic word model but not with the second probabilistic word model if the set-average match score does not surpass the best average-model match score by the nonzero threshold value.
  - 9. A method as claimed in claim 8, characterized in that the word has a spelling, and each match score represents a weighted combination of:
    - the closeness of a match between a probabilistic word model and the value of the acoustic feature of the utterances; and
      
      the closeness of a match between the probabilistic word model and the spelling of the word.
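
The procedure of claims 7 and 8 can be sketched end to end, starting from a table of per-utterance match scores. This is an illustration under assumptions: claim 7 does not say how the second model is chosen, so picking the one that maximizes the set-average is a guess, and the name `model_word`, the higher-is-better convention, and the "at least the threshold" comparison are likewise hypothetical.

```python
from statistics import mean


def model_word(scores, threshold):
    """Pick one or two word models from per-utterance match scores.

    scores maps each model name to a list of match scores, one per
    utterance. At least two models are expected.
    """
    # Average-model match score for every model.
    averages = {m: mean(s) for m, s in scores.items()}
    first = max(averages, key=averages.get)

    def set_average(other):
        # Best-of-set score per utterance, averaged over all utterances.
        return mean(max(a, b) for a, b in zip(scores[first], scores[other]))

    # Assumption: choose the second model that helps the first the most.
    second = max((m for m in scores if m != first), key=set_average)

    if set_average(second) - averages[first] >= threshold:
        return [first, second]
    return [first]


# Hypothetical scores for three models and two utterances.
scores = {"a": [-3.0, -5.0], "b": [-6.0, -3.0], "c": [-5.0, -5.0]}
print(model_word(scores, threshold=0.5))  # ['a', 'b']
print(model_word(scores, threshold=2.0))  # ['a']
```

Here model "a" has the best average (-4.0), and pairing it with "b" lifts the set-average to -3.0, a gain of 1.0 that passes the first threshold but not the second.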

10. An apparatus for modeling words, said apparatus comprising:
- means for measuring the value of at least one acoustic feature of each of at least two utterances of a word;
  
  means for storing a selection set of one or more probabilistic word model signals, each probabilistic word model signal in the selection set representing a probabilistic model of the word;
  
  means for calculating, for the selection set, a match score representing the closeness of a match between the probabilistic word models in the selection set and the value of the acoustic feature of each utterance;
  
  means for storing a candidate set of one or more probabilistic word model signals, each probabilistic word model signal in the candidate set representing a probabilistic model of the word, each probabilistic word model in the candidate set being different from each probabilistic word model in the selection set;
  
  means for storing an expansion set comprising the probabilistic word model signals in the selection set and one probabilistic word model signal from the candidate set;
  
  means for calculating, for the expansion set, a match score representing the closeness of a match between the probabilistic word models in the expansion set and the value of the acoustic feature of each utterance; and
  
  means for modeling the word with the word models in the expansion set if the expansion set match score surpasses the selection set match score by a nonzero threshold value.
- Dependent claims (11, 12, 13, 14, 15, 16):
  - 11. An apparatus as claimed in claim 10, further comprising means for modeling the word with the word models in the selection set if the expansion set match score does not surpass the selection set match score by the nonzero threshold value.
  - 12. An apparatus as claimed in claim 11, characterized in that the word has a spelling, the apparatus further comprises means for storing a spelling signal representing the spelling of the word, and each set match score represents a weighted combination of:
    - the closeness of a match between the probabilistic word models in the set of models and the values of the acoustic feature of the utterances; and
      
      the closeness of a match between the probabilistic word models in the set of models and the spelling of the word.
  - 13. An apparatus as claimed in claim 12, characterized in that the means for calculating each set match score comprises:
    - means for calculating, for each probabilistic word model in the set and for each utterance, a match score representing a weighted combination of (a) the closeness of a match between the probabilistic word model and the value of the acoustic feature of each utterance, and (b) the closeness of a match between the probabilistic word model and the spelling of the word;
      
      means for identifying, for each utterance, a best-of-set match score representing the best match score between the utterance and the probabilistic word models in the set;
      
      means for calculating a set match score representing the average best-of-set match score for the probabilistic word models and all the utterances.
  - 14. An apparatus as claimed in claim 13, further comprising:
    - means for calculating, for each probabilistic word model in the candidate set, a joint match score representing a weighted combination of (a) the closeness of a match between a joint set of the candidate probabilistic word model and the probabilistic word models in the selection set and the value of the acoustic feature of each utterance, and (b) the closeness of a match between the joint set of probabilistic word models and the spelling of the word; and
      
      means for selecting as the expansion set the joint set having the best joint match score.
  - 15. An apparatus as claimed in claim 10, characterized in that the selection set consists of one probabilistic word model having a match score better than the match score of any one probabilistic word model in the candidate set.
  - 16. An apparatus as claimed in claim 10, characterized in that the measuring means comprises a microphone for converting the utterances of the word into analog electrical signals.

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: De Gennaro, Steven V.; Desouza, Peter V.; Epstein, Mark E.; Brown, Peter F.
Primary Examiner(s): Fleming, Michael R.
Assistant Examiner(s): Doerrler, Michelle
Application Number: US07/602,020
Time in Patent Office: 1,232 Days
Field of Search: 395/2.54, 395/2, 381/29-53
US Class Current: 704/245
CPC Class Codes: G10L 15/063 (Training); G10L 15/144 (Training of HMMs)
