Speech recognition apparatus having a speech coder outputting acoustic prototype ranks

US 5,222,146 A
Filed: 10/23/1991
Issued: 06/22/1993
Est. Priority Date: 10/23/1991
Status: Expired due to Term

Abstract

A speech coding and speech recognition apparatus. The value of at least one feature of an utterance is measured over each of a series of successive time intervals to produce a series of feature vector signals. The closeness of the feature value of each feature vector signal to the parameter value of each of a set of prototype vector signals is determined to obtain prototype match scores for each vector signal and each prototype vector signal. For each feature vector signal, first-rank and second-rank scores are associated with the prototype vector signals having the best and second best prototype match scores, respectively. For each feature vector signal, at least the identification value and the rank score of the first-ranked and second-ranked prototype vector signals are output as a coded utterance representation signal of the feature vector signal, to produce a series of coded utterance representation signals. For each of a plurality of speech units, a probabilistic model has a plurality of model outputs, and output probabilities for each model output. Each model output comprises the identification value of a prototype vector and a rank score. For each speech unit, a match score comprises an estimate of the probability that the probabilistic model of the speech unit would output a series of model outputs matching a reference series comprising the identification value and rank score of at least one prototype vector from each coded utterance representation signal in the series of coded utterance representation signals.
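The coding step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes Euclidean distance as the closeness measure (smaller distance means a better prototype match score), and the function name `code_utterance`, the prototype identifiers, and the sample values are all hypothetical.

```python
import math

def code_utterance(feature_vectors, prototypes):
    """Code each feature vector as [(prototype_id, 1), (prototype_id, 2)].

    prototypes maps an identification value to a parameter vector.
    Assumption of this sketch: closeness is Euclidean distance.
    """
    coded = []
    for fv in feature_vectors:
        # Order prototypes from closest to farthest for this feature vector.
        ordered = sorted(prototypes.items(), key=lambda item: math.dist(fv, item[1]))
        # Keep only the first-ranked and second-ranked prototypes.
        coded.append([(proto_id, rank)
                      for rank, (proto_id, _) in enumerate(ordered[:2], start=1)])
    return coded

# Hypothetical two-dimensional prototypes and feature vectors.
prototypes = {"P1": (0.0, 0.0), "P2": (1.0, 1.0), "P3": (4.0, 4.0)}
features = [(0.1, 0.2), (3.0, 3.5)]
coded = code_utterance(features, prototypes)
print(coded)  # [[('P1', 1), ('P2', 2)], [('P3', 1), ('P2', 2)]]
```

Each inner list plays the role of one coded utterance representation signal: the identification value and rank score of the two best-matching prototypes for that time interval.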

276 Citations

27 Claims

1. A speech coding apparatus comprising:
- means for measuring the value of at least one feature of an utterance over each of a series of successive time intervals to produce a series of feature vector signals representing the feature values;
  
  means for storing a plurality of prototype vector signals, each prototype vector signal having at least one parameter value and having a unique identification value;
  
  means for comparing the closeness of the feature value of a first feature vector signal to the parameter values of the prototype vector signals to obtain prototype match scores for the first feature vector signal and each prototype vector signal;
  
  ranking means for associating a first-rank score with the prototype vector signal having the best prototype match score, and for associating a second-rank score with the prototype vector signal having the second best prototype match score; and
  
  means for outputting at least the identification value and the rank score of the prototype vector signal having the first-rank score, and the identification value and the rank score of the prototype vector signal having the second-rank score, as a coded utterance representation signal of the first feature vector signal.
  - 2. A speech coding apparatus as claimed in claim 1, characterized in that:
    - the ranking means comprises means for ranking all of the prototype match scores for the first feature vector signal from highest to lowest and for associating a rank score with each prototype match score, each rank score representing the estimated closeness of the associated prototype vector signal to the first feature vector signal relative to the estimated closeness of all other prototype vector signals to the first feature vector signal; and
      
      the outputting means comprises means for outputting the identification value of each prototype vector signal and the rank score of each prototype vector signal as a coded utterance representation signal of the first feature vector signal.
  - 3. A speech coding apparatus as claimed in claim 2, further comprising means for storing the coded utterance representation signal of the feature vector signal.
  - 4. A speech coding apparatus as claimed in claim 3, characterized in that the rank score for a selected prototype vector signal for a given feature vector signal is monotonically related to the number of other prototype vector signals having prototype match scores better than the prototype match score of the selected prototype vector signal for the given feature vector signal.
  - 5. A speech coding apparatus as claimed in claim 4, characterized in that the means for storing prototype vector signals comprises electronic read/write memory.
  - 6. A speech coding apparatus as claimed in claim 5, characterized in that the measuring means comprises a microphone.
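Claims 2 and 4 describe ranking every prototype, with a rank score that is monotonically related to the number of prototypes that match better. One simple rank score with that property is sketched below. The choice of "1 plus the number of strictly better prototypes" is an assumption of this example (the claims allow any monotonic relation), as are the name `rank_scores` and the sample distances, where a lower value is taken to mean a better match.

```python
def rank_scores(match_scores):
    """Map each prototype id to 1 + the number of prototypes with a better score.

    match_scores maps prototype id -> distance (lower is better, by assumption).
    Prototypes with equal scores receive the same rank.
    """
    return {
        proto_id: 1 + sum(1 for other in match_scores.values() if other < score)
        for proto_id, score in match_scores.items()
    }

# Hypothetical match scores for one feature vector.
ranks = rank_scores({"P1": 0.9, "P2": 0.2, "P3": 0.9, "P4": 1.7})
print(ranks)  # {'P1': 2, 'P2': 1, 'P3': 2, 'P4': 4}
```

Because the rank depends only on how many prototypes score better, it is insensitive to the absolute scale of the match scores.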

7. A speech coding method comprising:
- measuring the value of at least one feature of an utterance over each of a series of successive time intervals to produce a series of feature vector signals representing the feature values;
  
  storing a plurality of prototype vector signals, each prototype vector signal having at least one parameter value and having a unique identification value;
  
  comparing the closeness of the feature value of a first feature vector signal to the parameter values of the prototype vector signals to obtain prototype match scores for the first feature vector signal and each prototype vector signal;
  
  ranking the prototype vector signal having the best prototype match score with a first-rank score, and ranking the prototype vector signal having the second best prototype match score with a second-rank score; and
  
  outputting at least the identification value and the rank score of the prototype vector signal having the first-rank score, and the identification value and the rank score of the prototype vector signal having the second-rank score, as a coded utterance representation signal of the first feature vector signal.
  - 8. A speech coding method as claimed in claim 7, characterized in that:
    - the step of ranking comprises ranking all of the prototype match scores for the first feature vector signal from highest to lowest and for associating a rank score with each prototype match score, each rank score representing the estimated closeness of the associated prototype vector signal to the first feature vector signal relative to the estimated closeness of all other prototype vector signals to the first feature vector signal; and
      
      the step of outputting comprises outputting the identification value of each prototype vector signal and the rank score of each prototype vector signal as a coded utterance representation signal of the first feature vector signal.
  - 9. A speech coding method as claimed in claim 8, further comprising the step of storing the coded utterance representation signals of all of the feature vector signals.
  - 10. A speech coding method as claimed in claim 9, characterized in that the rank score for a selected prototype vector signal for a given feature vector signal is monotonically related to the number of other prototype vector signals having prototype match scores better than the prototype match score of the selected prototype vector signal for the given feature vector signal.

11. A speech recognition apparatus comprising:
- means for measuring the value of at least one feature of an utterance over each of a series of successive time intervals to produce a series of feature vector signals representing the feature values;
  
  means for storing a plurality of prototype vector signals, each prototype vector signal having at least one parameter value and having a unique identification value;
  
  means for comparing the closeness of the feature value of each feature vector signal to the parameter values of the prototype vector signals to obtain prototype match scores for each feature vector signal and each prototype vector signal;
  
  ranking means for associating, for each feature vector signal, a first-rank score with the prototype vector signal having the best prototype match score, and a second-rank score with the prototype vector signal having the second best prototype match score;
  
  means for outputting, for each feature vector signal, at least the identification value and the rank score of the prototype vector signal having the first-rank score, and the identification value and the rank score of the prototype vector signal having the second-rank score, as a coded utterance representation signal of the feature vector signal, to produce a series of coded utterance representation signals;
  
  means for storing probabilistic models for a plurality of speech units, at least a first model for a first speech unit having (a) at least two states, (b) at least one transition extending from a state to the same or another state, (c) a transition probability for each transition, (d) a plurality of model outputs for at least one prototype vector at a transition, each model output comprising the identification value of the prototype vector and a rank score, and (e) output probabilities at a transition for each model output;
  
  means for generating a match score for each of a plurality of speech units, each match score comprising an estimate of the probability that the probabilistic model of the speech unit would output a series of model outputs matching a reference series comprising the identification value and rank score of at least one prototype vector from each coded utterance representation signal in the series of coded utterance representation signals;
  
  means for identifying one or more best candidate speech units having the best match scores; and
  
  means for outputting at least one speech subunit of one or more of the best candidate speech units.
  - 12. A speech recognition apparatus as claimed in claim 11, characterized in that:
    - the ranking means comprises means for associating a rank score with all prototype vector signals for each feature vector signal, each rank score representing the estimated closeness of the associated prototype vector signal to the feature vector signal relative to the estimated closeness of all other prototype vector signals to the feature vector signal; and
      
      the outputting means comprises means for outputting for each feature vector signal the identification values and the rank scores of the prototype vector signals as a coded utterance representation signal of the feature vector signal, to produce a series of coded utterance representation signals.
  - 13. A speech recognition apparatus as claimed in claim 12, characterized in that the rank score for a selected prototype vector signal for a given feature vector signal is monotonically related to the number of other prototype vector signals having prototype match scores better than the prototype match score of the selected prototype vector signal for the given feature vector signal.
  - 14. A speech recognition apparatus as claimed in claim 11, characterized in that each match score further comprises an estimate of the probability of occurrence of the speech unit.
  - 15. A speech recognition apparatus as claimed in claim 14, characterized in that the means for storing prototype vector signals comprises electronic read/write memory.
  - 16. A speech recognition apparatus as claimed in claim 15, characterized in that the measuring means comprises a microphone.
  - 17. A speech recognition apparatus as claimed in claim 16, characterized in that the speech subunit output means comprises a video display.
  - 18. A speech recognition apparatus as claimed in claim 17, characterized in that the video display comprises a cathode ray tube.
  - 19. A speech recognition apparatus as claimed in claim 17, characterized in that the video display comprises a liquid crystal display.
  - 20. A speech recognition apparatus as claimed in claim 17, characterized in that the video display comprises a printer.
  - 21. A speech recognition apparatus as claimed in claim 16, characterized in that the speech subunit output means comprises an audio generator.
  - 22. A speech recognition apparatus as claimed in claim 21, characterized in that the audio generator comprises a loudspeaker.
  - 23. A speech recognition apparatus as claimed in claim 21, characterized in that the audio generator comprises a headphone.
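The match score of claim 11 is an estimate of the probability that a speech unit's model outputs a series of (identification value, rank score) pairs. A standard way to compute such a probability for a Markov model with outputs on transitions is the forward algorithm, sketched below. The claim does not name a specific algorithm, so this is an assumption, as are the tuple layout of `transitions`, the convention that state 0 is initial and the last state is final, and all probabilities in the example.

```python
def match_score(transitions, n_states, observations):
    """Probability that the model emits `observations` and ends in the last state.

    transitions: list of (src, dst, transition_prob, output_probs), where
    output_probs maps a (prototype_id, rank) pair to its output probability
    at that transition. State 0 is initial; state n_states - 1 is final.
    """
    alpha = [0.0] * n_states
    alpha[0] = 1.0
    for obs in observations:
        nxt = [0.0] * n_states
        for src, dst, p_trans, p_out in transitions:
            # Unseen (id, rank) pairs are given probability 0 in this sketch.
            nxt[dst] += alpha[src] * p_trans * p_out.get(obs, 0.0)
        alpha = nxt
    return alpha[-1]

# Hypothetical two-state model: a self-loop on state 0 and a transition 0 -> 1.
transitions = [
    (0, 0, 0.5, {("P1", 1): 0.8, ("P2", 1): 0.2}),
    (0, 1, 0.5, {("P1", 1): 0.3, ("P2", 1): 0.7}),
]
# Reference series: the first-ranked prototype from each coded signal.
score = match_score(transitions, 2, [("P1", 1), ("P2", 1)])
print(score)  # approximately 0.14
```

Only one (id, rank) pair per frame is used here; the claim language ("at least one prototype vector from each coded utterance representation signal") would also permit using several pairs per frame.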

24. A speech recognition method comprising:
- measuring the value of at least one feature of an utterance over each of a series of successive time intervals to produce a series of feature vector signals representing the feature values;
  
  storing a plurality of prototype vector signals, each prototype vector signal having at least one parameter value and having a unique identification value;
  
  comparing the closeness of the feature value of each feature vector signal to the parameter values of the prototype vector signals to obtain prototype match scores for each feature vector signal and each prototype vector signal;
  
  ranking, for each feature vector signal, the prototype vector signal having the best prototype match score with a first-rank score, and the prototype vector signal having the second best prototype match score with a second-rank score;
  
  outputting, for each feature vector signal, at least the identification value and the rank score of the prototype vector signal having the first-rank score, and the identification value and the rank score of the prototype vector signal having the second-rank score, as a coded utterance representation signal of the feature vector signal, to produce a series of coded utterance representation signals;
  
  storing probabilistic models for a plurality of speech units, at least a first model for a first speech unit having (a) at least two states, (b) at least one transition extending from a state to the same or another state, (c) a transition probability for each transition, (d) a plurality of model outputs for at least one prototype vector at a transition, each model output comprising the identification value of the prototype vector and a rank score, (e) output probabilities at a transition for each model output;
  
  generating a match score for each of a plurality of speech units, each match score comprising an estimate of the probability that the probabilistic model of the speech unit would output a series of model outputs matching a reference series comprising the identification value and rank score of at least one prototype vector from each coded utterance representation signal in the series of coded utterance representation signals;
  
  identifying one or more best candidate speech units having the best match scores; and
  
  outputting at least one speech subunit of one or more of the best candidate speech units.
  - 25. A speech recognition method as claimed in claim 24, characterized in that:
    - the step of ranking comprises associating a rank score with all prototype vector signals for each feature vector signal, each rank score representing the estimated closeness of the associated prototype vector signal to the feature vector signal relative to the estimated closeness of all other prototype vector signals to the feature vector signal; and
      
      the step of outputting comprises outputting for each feature vector signal the identification values and the rank scores of the prototype vector signals as a coded utterance representation signal of the feature vector signal, to produce a series of coded utterance representation signals.
  - 26. A speech recognition method as claimed in claim 25, characterized in that the rank score for a selected prototype vector signal for a given feature vector signal is monotonically related to the number of other prototype vector signals having prototype match scores better than the prototype match score of the selected prototype vector signal for the given feature vector signal.
  - 27. A speech recognition method as claimed in claim 24, characterized in that each match score further comprises an estimate of the probability of occurrence of the speech unit.
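Claims 14 and 27 add an estimate of the probability of occurrence of the speech unit to each match score, and claims 11 and 24 then identify the best candidates. A minimal sketch of that selection step follows, assuming the two estimates are combined by multiplication. The function name `best_candidates`, the candidate words, and every number are hypothetical.

```python
def best_candidates(acoustic_scores, priors, n_best=1):
    """Return the n_best speech units ordered by acoustic score times prior.

    acoustic_scores: unit -> probability of the coded utterance given the unit.
    priors: unit -> estimated probability of occurrence of the unit.
    Assumption of this sketch: the combined match score is the plain product.
    """
    combined = {unit: acoustic_scores[unit] * priors[unit] for unit in acoustic_scores}
    return sorted(combined, key=combined.get, reverse=True)[:n_best]

# Hypothetical candidate words with made-up probabilities.
acoustic = {"will": 0.14, "well": 0.10, "wheel": 0.02}
prior = {"will": 0.2, "well": 0.5, "wheel": 0.3}
best = best_candidates(acoustic, prior, n_best=2)
print(best)  # ['well', 'will']
```

In this made-up example the prior reverses the acoustic ordering of the top two words. With long utterances, sums of log probabilities are usually preferred over products to avoid numerical underflow.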

Current Assignee: Nuance Communications, Inc. (Microsoft Corporation)
Original Assignee: International Business Machines Corporation
Inventors: Gopalakrishnan, Ponani S.; De Souza, Peter V.; Bahl, Lalit R.; Picheny, Michael A.
Primary Examiner(s): Shaw, Dale M.
Assistant Examiner(s): Tung, Kee M.
Application Number: US07/781,440
Time in Patent Office: 608 Days
Field of Search: 381/30, 381/31, 381/36, 381/41, 381/43, 381/44, 381/52
US Class Current: 704/243
CPC Class Codes:
- G10L 15/02 Feature extraction for spee...
- G10L 19/0018 Speech coding using phoneti...
