Speech recognition system with improved rejection of words and sounds not in the system vocabulary

US 5,465,317 A
Filed: 05/18/1993
Issued: 11/07/1995
Est. Priority Date: 05/18/1993
Status: Expired due to Fees


Abstract

A speech recognizer that selects a command model for a current sound if the best match score for the current sound exceeds its corresponding threshold score. The threshold score is assigned a confidence score based on the best match score and recognition threshold of a prior sound. When the best match score for the current sound exceeds a "poor" confidence score but is less than a "good" confidence score: (a) the word corresponding to the acoustic model having the best match score is accepted as highly likely to correspond to the measured sound if the previously recognized word was accepted as having a high likelihood of corresponding to the previous sound; (b) the word corresponding to the acoustic model having the best match score is rejected as highly unlikely to correspond to the measured sound if the previously recognized word was rejected as having a low likelihood of corresponding to the previous sound; or (c) if there is sufficient intervening silence between a previously rejected word and the current word, then the current word is also accepted as having a high likelihood of corresponding to the measured current sound.
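
The three cases in the abstract can be read as a single threshold choice. The following is a minimal sketch of that reading, not the patented implementation: the function name `decide`, the `Decision` type, and the numeric values of the "poor" and "good" confidence scores are hypothetical, and it assumes that a larger match score means a closer match.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    accepted: bool    # True if the best-matching word is output
    threshold: float  # threshold that was applied to the current sound


def decide(best_score: float,
           prev_accepted: bool,
           silence_between: bool,
           poor: float = 0.4,   # hypothetical "poor" confidence score
           good: float = 0.7    # hypothetical "good" confidence score
           ) -> Decision:
    """Accept or reject the current sound based on the previous outcome."""
    # Cases (a) and (c): the lenient "poor" threshold applies if the previous
    # word was accepted, or if enough silence separates a rejected word from
    # the current one.
    lenient = prev_accepted or silence_between
    # Case (b): after a rejection with no sufficient silence, the stricter
    # "good" threshold applies.
    threshold = poor if lenient else good
    return Decision(accepted=best_score > threshold, threshold=threshold)


# A score between "poor" and "good" depends on the previous outcome.
print(decide(0.5, prev_accepted=True, silence_between=False).accepted)
print(decide(0.5, prev_accepted=False, silence_between=False).accepted)
print(decide(0.5, prev_accepted=False, silence_between=True).accepted)
```

Scores above the "good" value are accepted and scores below the "poor" value are rejected in every case, so the previous outcome only matters in the band between the two.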

101 Citations

20 Claims

1. A speech recognition apparatus comprising:
- an acoustic processor for measuring the value of at least one feature of each of a sequence of at least two sounds, said acoustic processor measuring the value of the feature of each sound during each of a series of successive time intervals to produce a series of feature signals representing the feature values of the sound;
  
  means for storing a set of acoustic command models, each acoustic command model representing one or more series of acoustic feature values representing an utterance of a command associated with the acoustic command model;
  
  a match score processor for generating a match score for each sound and each of one or more acoustic command models from the set of acoustic command models, each match score comprising an estimate of the closeness of a match between the acoustic command model and a series of feature signals corresponding to the sound; and
  
  means for outputting a recognition signal corresponding to the acoustic command model having a best match score for a current sound if the best match score for the current sound is greater than a recognition threshold score for the current sound, the recognition threshold score for the current sound is equal to (a) a first confidence score if the best match score for a prior sound was greater than a recognition threshold for the prior sound, or (b) a second confidence score greater than the first confidence score if the best match score for the prior sound was less than the recognition threshold for the prior sound.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
- - 2. A speech recognition apparatus as claimed in claim 1, characterized in that the prior sound is contiguous with the current sound.
  - 3. A speech recognition apparatus as claimed in claim 2, characterized in that:
    - the apparatus further comprises means for storing at least one acoustic silence model representing one or more series of acoustic feature values representing the absence of a spoken utterance;
      
      the match score processor generates a match score for each sound and the acoustic silence model, each match score comprising an estimate of the closeness of a match between the acoustic silence model and a series of feature signals corresponding to the sound; and
      
      the recognition threshold score for the current sound is equal to the first confidence score (a1) if the match score for the prior sound and the acoustic silence model is greater than a silence match threshold, and if the prior sound has a duration exceeding a silence duration threshold, or (a2) if the match score for the prior sound and the acoustic silence model is greater than the silence match threshold, and if the prior sound has a duration less than the silence duration threshold, and if the best match score for a second prior sound and an acoustic command model was greater than a recognition threshold for the second prior sound, or (a3) if the match score for the prior sound and the acoustic silence model is less than the silence match threshold, and if the best match score for the prior sound and an acoustic command model was greater than a recognition threshold for the prior sound;
      
      or the recognition threshold for the current sound is equal to the second confidence score better than the first confidence score (b1) if the match score for the prior sound and the acoustic silence model is greater than the silence match threshold, and if the prior sound has a duration less than the silence duration threshold, and if the best match score for the second prior sound and an acoustic command model was less than the recognition threshold for the second prior sound, or (b2) if the match score for the prior sound and the acoustic silence model is less than the silence match threshold, and if the best match score for the prior sound and an acoustic command model was less than the recognition threshold for the prior sound.
  - 4. A speech recognition apparatus as claimed in claim 3, characterized in that the recognition signal comprises a command signal for calling a program associated with the command.
  - 5. A speech recognition apparatus as claimed in claim 4, characterized in that:
    - the output means comprises a display; and
      
      the output means displays one or more words corresponding to the command model having the best match score for a current sound if the best match score for the current sound is better than the recognition threshold score for the current sound.
  - 6. A speech recognition apparatus as claimed in claim 5, characterized in that the output means outputs an unrecognizable-sound indication signal if the best match score for the current sound is worse than the recognition threshold score for the current sound.
  - 7. A speech recognition apparatus as claimed in claim 6, characterized in that the output means displays an unrecognizable-sound indicator if the best match score for the current sound is worse than the recognition threshold score for the current sound.
  - 8. A speech recognition apparatus as claimed in claim 7, characterized in that the unrecognizable-sound indicator comprises one or more question marks.
  - 9. A speech recognition apparatus as claimed in claim 1, characterized in that the acoustic processor comprises a microphone.
  - 10. A speech recognition apparatus as claimed in claim 1, characterized in that:
    - each sound comprises a vocal sound; and
      
      each command model comprises at least one word.
  - 11. A speech recognition apparatus as claimed in claim 1, characterized in that the acoustic processor is adapted to measure the value of at least one feature of each of a sequence of at least three sounds, wherein the first prior sound is contiguous with the current sound.
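
The threshold selection of claim 3, conditions (a1) through (b2), can be sketched as a small decision function. This is an illustrative reading under stated assumptions rather than the apparatus itself: the names `PriorSound` and `recognition_threshold` are hypothetical, durations are taken to be in seconds, and ties (a score exactly equal to a threshold), which the claim does not address, fall into the "less than" branch here.

```python
from dataclasses import dataclass


@dataclass
class PriorSound:
    silence_score: float  # match score against the acoustic silence model
    duration: float       # duration of the sound (seconds, an assumption)
    accepted: bool        # best command match score exceeded its threshold


def recognition_threshold(prior: PriorSound,
                          second_prior: PriorSound,
                          first_conf: float,
                          second_conf: float,
                          silence_match_threshold: float,
                          silence_duration_threshold: float) -> float:
    """Return the recognition threshold for the current sound."""
    if prior.silence_score > silence_match_threshold:
        # The prior sound is treated as silence.
        if prior.duration > silence_duration_threshold:
            return first_conf                    # (a1) long silence
        # Short silence: look past it to the second prior sound.
        if second_prior.accepted:
            return first_conf                    # (a2)
        return second_conf                       # (b1)
    # The prior sound is not silence: use its own recognition outcome.
    if prior.accepted:
        return first_conf                        # (a3)
    return second_conf                           # (b2)


# Hypothetical numbers: a short pause following a rejected word keeps the
# stricter second confidence score in force.
pause = PriorSound(silence_score=0.9, duration=0.1, accepted=False)
rejected = PriorSound(silence_score=0.1, duration=0.5, accepted=False)
print(recognition_threshold(pause, rejected, 0.4, 0.7, 0.5, 0.3))
```

Passing the prior and second prior sounds as explicit records keeps the five claim conditions visible as five return statements.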

12. A speech recognition method comprising the steps of:
- measuring the value of at least one feature of each of a sequence of at least two sounds, the value of the feature of each sound being measured during each of a series of successive time intervals to produce a series of feature signals representing the feature values of the sound;
  
  storing a set of acoustic command models, each acoustic command model representing one or more series of acoustic feature values representing an utterance of a command associated with the acoustic command model;
  
  generating a match score for each sound and each of one or more acoustic command models from the set of acoustic command models, each match score comprising an estimate of the closeness of a match between the acoustic command model and a series of feature signals corresponding to the sound; and
  
  outputting a recognition signal corresponding to the acoustic command model having a best match score for a current sound if the best match score for the current sound is greater than a recognition threshold score for the current sound, the recognition threshold score for the current sound is equal to (a) a first confidence score if the best match score for a prior sound was greater than a recognition threshold for the prior sound, or (b) a second confidence score greater than the first confidence score if the best match score for the prior sound was less than the recognition threshold for the prior sound.
- Dependent Claims (13, 14, 15, 16, 17, 18, 19, 20)
- - 13. A speech recognition method as claimed in claim 12, characterized in that the prior sound is contiguous with the current sound.
  - 14. A speech recognition method as claimed in claim 13, further comprising the steps of:
    - storing at least one acoustic silence model representing one or more series of acoustic feature values representing the absence of a spoken utterance;
      
      generating a match score for each sound and the acoustic silence model, each match score comprising an estimate of the closeness of a match between the acoustic silence model and a series of feature signals corresponding to the sound; and
      
      characterized in that the recognition threshold score for the current sound is equal to the first confidence score (a1) if the match score for the prior sound and the acoustic silence model is greater than a silence match threshold, and if the prior sound has a duration exceeding a silence duration threshold, or (a2) if the match score for the prior sound and the acoustic silence model is greater than the silence match threshold, and if the prior sound has a duration less than the silence duration threshold, and if the best match score for a second prior sound and an acoustic command model was greater than a recognition threshold for the second prior sound, or (a3) if the match score for the prior sound and the acoustic silence model is less than the silence match threshold, and if the best match score for the prior sound and an acoustic command model was greater than a recognition threshold for the prior sound;
      
      or the recognition threshold for the current sound is equal to the second confidence score better than the first confidence score (b1) if the match score for the prior sound and the acoustic silence model is greater than the silence match threshold, and if the prior sound has a duration less than the silence duration threshold, and if the best match score for the second prior sound and an acoustic command model was less than the recognition threshold for the second prior sound, or (b2) if the match score for the prior sound and the acoustic silence model is less than the silence match threshold, and if the best match score for the first prior sound and an acoustic command model was less than the recognition threshold for the prior sound.
  - 15. A speech recognition method as claimed in claim 14, characterized in that the recognition signal comprises a command signal for calling a program associated with the command.
  - 16. A speech recognition method as claimed in claim 15, further comprising the step of displaying one or more words corresponding to the command model having the best match score for a current sound if the best match score for the current sound is better than the recognition threshold score for the current sound.
  - 17. A speech recognition method as claimed in claim 16, further comprising the step of outputting an unrecognizable-sound indication signal if the best match score for the current sound is worse than the recognition threshold score for the current sound.
  - 18. A speech recognition method as claimed in claim 17, further comprising the step of displaying an unrecognizable-sound indicator if the best match score for the current sound is worse than the recognition threshold score for the current sound.
  - 19. A speech recognition method as claimed in claim 18, characterized in that the unrecognizable-sound indicator comprises one or more question marks.
  - 20. A speech recognition method as claimed in claim 12, characterized in that:
    - each sound comprises a vocal sound; and
      
      each command model comprises at least one word.
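
The steps of method claim 12, together with the question-mark indicator of claims 18 and 19, can be sketched end to end over a sequence of sounds. Everything concrete below is an assumption made for illustration: the toy `match_score` (an inverse mean absolute difference, not the scoring used in the patent), the `recognize` function, the two confidence values, the one-number-per-interval feature series, and the choice to judge the first sound against the first confidence score.

```python
def match_score(features, model):
    """Toy closeness estimate in (0, 1]; larger means a closer match."""
    diffs = [abs(f - m) for f, m in zip(features, model)]
    return 1.0 / (1.0 + sum(diffs) / len(diffs))


def recognize(sounds, models, first_conf=0.5, second_conf=0.8):
    """Return one output per sound: a command word or "???" if rejected."""
    outputs = []
    prev_accepted = True  # assumption: the first sound gets the first score
    for features in sounds:
        # Generate a match score for the sound and each command model.
        scores = {word: match_score(features, m) for word, m in models.items()}
        best_word = max(scores, key=scores.get)
        # The threshold depends on whether the prior sound was recognized.
        threshold = first_conf if prev_accepted else second_conf
        prev_accepted = scores[best_word] > threshold
        outputs.append(best_word if prev_accepted else "???")
    return outputs


# Hypothetical command models: one feature value per time interval.
models = {"open": [1.0, 2.0, 3.0], "close": [3.0, 2.0, 1.0]}
sounds = [
    [1.0, 2.0, 3.0],  # exact "open"
    [1.5, 2.5, 3.5],  # noisy "open", follows an accepted sound
    [9.0, 9.0, 0.0],  # out-of-vocabulary sound
    [1.5, 2.5, 3.5],  # same noisy "open", now follows a rejection
    [1.0, 2.0, 3.0],  # exact "open" clears the stricter threshold
]
print(recognize(sounds, models))
# → ['open', 'open', '???', '???', 'open']
```

The second and fourth sounds are identical, yet only the second is accepted, which is the context dependence the claims describe.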

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Epstein, Edward A.
Primary Examiner(s): MacDonald, Allen R.
Assistant Examiner(s): Onka, Thomas
Application Number: US08/062,972
Time in Patent Office: 903 Days
Field of Search: 395/2.42, 395/2.45, 395/2.47, 395/2.6, 395/2.65
US Class Current: 704/236
CPC Class Codes: G10L 25/78 Detection of presence or ab...
