Method for recognizing non-standard and standard speech by speaker independent and speaker dependent word models

US 6,487,530 B1
Filed: 03/30/1999
Issued: 11/26/2002
Est. Priority Date: 03/30/1999
Status: Expired due to Fees

Abstract

A system and method for speech recognition includes a speaker-independent set of stored word representations derived from speech of many users deemed to be typical speakers and for use by all users, and may further include speaker-dependent sets of stored word representations specific to each user. Utterances from a user which match stored words in either set according to the ordering rules are reported as words.
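As a rough illustration of the two-tier lookup the abstract describes, the sketch below keeps one shared speaker-independent set and one speaker-dependent set per user. The abstract does not spell out the "ordering rules", so consulting the user's own set first is an assumption here, as are the names `recognize` and `score`, the threshold value, and the toy 2-D feature vectors standing in for real acoustic word models.

```python
from math import dist

# Toy "word models": each word is a point in a 2-D feature space.
# Real systems would use acoustic models; this stand-in is an assumption.
SPEAKER_INDEPENDENT = {"yes": (1.0, 0.0), "no": (0.0, 1.0)}
SPEAKER_DEPENDENT = {"alice": {"yes": (0.6, 0.5)}}


def score(features, model):
    """Map distance to a similarity in (0, 1]; 1.0 means identical."""
    return 1.0 / (1.0 + dist(features, model))


def recognize(user, features, threshold=0.8):
    """Return the first word whose model scores at or above the threshold.

    Assumed ordering rule: the user's own set is tried before the shared set.
    Returns None when neither set yields a good enough match.
    """
    for models in (SPEAKER_DEPENDENT.get(user, {}), SPEAKER_INDEPENDENT):
        best = max(models, key=lambda w: score(features, models[w]), default=None)
        if best is not None and score(features, models[best]) >= threshold:
            return best
    return None
```

With these toy values, an utterance at `(0.6, 0.5)` is reported as "yes" for `alice`, who has a matching speaker-dependent model, but is not reported as any word for a user who has only the shared set.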

10 Claims

1. In a speech recognition system comprising:
- an incoming audio signal representing utterances of a user;
  
  a stored set of first word models derived from utterances of a plurality of speakers; and
  
  means for identifying a word in the utterances of a user upon matching portions of said audio signal with one of said stored first word models, a method of enhancing recognition of speech of said user comprising;
  
  ascertaining a current context of the utterances of the user;
  
  providing for said user a stored set of second word models, said set of second word models derived from words spoken by said user, said first word models and said second word models differing from each other;
  
  attempting to identify words in the utterances of said user to find a match in the current context by comparing portions of said audio signal with one of a word model among said first word models and a word model among said second word models associated with said user, the attempting including determining a probability of whether the match exceeds a threshold; and
  
  if the probability of the match fails to exceed the threshold, informing that the words fail to match any of the words acceptable in the current context and thereafter modifying, based on the words in the utterances of said user, the word model among the second word models associated with the user and without modifying the stored set of first word models.
2. The method according to claim 1 further including:
3. The method according to claim 2 wherein said set of second word models is stored in a separate memory location from said set of first word models.
4. The method according to claim 1 further including:
- inviting a user to speak training utterances of a word upon a predetermined number of failures to identify the word among said first word models when no model for the word is present in said second models;
  
  deriving a word model from said training utterances; and
  
  storing the derived word model in said set of second word models.
5. The method according to claim 4 wherein said set of second word models is stored in a separate memory location from said set of first word models.
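The flow recited in claims 1 and 4 can be sketched roughly as follows: match within the current context, compare the score with a threshold, report a failure, adjust only the user's own model, and invite training after repeated failures when no personal model exists. This is a minimal sketch, not the patented implementation. The class name `Recognizer`, the toy distance-based score, the threshold of 0.8, the failure limit of 3, the averaging update, and the `intended_word` argument (which stands in for however the system learns what the user meant) are all assumptions.

```python
from math import dist


class Recognizer:
    def __init__(self, shared_models, threshold=0.8, max_failures=3):
        self.shared = dict(shared_models)  # first word models; never modified below
        self.personal = {}                 # second word models for one user
        self.failures = {}                 # failure count per intended word
        self.threshold = threshold
        self.max_failures = max_failures

    @staticmethod
    def _score(features, model):
        # Toy similarity in (0, 1]; a real system would use acoustic likelihoods.
        return 1.0 / (1.0 + dist(features, model))

    def attempt(self, features, context, intended_word):
        """Match against words allowed in `context`; return (word, message)."""
        candidates = [
            (self._score(features, models[word]), word)
            for word in context
            for models in (self.shared, self.personal)
            if word in models
        ]
        if candidates:
            best_score, best_word = max(candidates)
            if best_score > self.threshold:
                return best_word, "matched"
        # Failure: inform the caller, then touch only the personal models.
        if intended_word in self.personal:
            old = self.personal[intended_word]
            # Assumed update rule: move the model halfway toward the utterance.
            self.personal[intended_word] = tuple(
                (a + b) / 2 for a, b in zip(old, features)
            )
            return None, "no match in this context; personal model adjusted"
        self.failures[intended_word] = self.failures.get(intended_word, 0) + 1
        if self.failures[intended_word] >= self.max_failures:
            return None, "no match in this context; please train this word"
        return None, "no match in this context"

    def train(self, word, utterances):
        """Derive a model as the mean of the training utterances and store it."""
        n = len(utterances)
        self.personal[word] = tuple(sum(axis) / n for axis in zip(*utterances))
        self.failures.pop(word, None)
```

In this sketch a caller would respond to the "please train this word" message by collecting a few utterances and passing them to `train`, after which later attempts can match against the new personal model while the shared models stay as they were.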

6. A method of enhancing speech recognition comprising:
- providing a set of user-independent word models derived from utterances of a plurality of speakers;
  
  providing a set of user-dependent word models for ones of a plurality of users each of the user-dependent word models being derived from utterances of an associated one of said users, said user-independent word models and said user-dependent word models differing from each other;
  
  ascertaining a current context of the utterances of the user;
  
  attempting to match an utterance from one of said users to one of said user-independent word models to find a possible match in the current context; and
  
  attempting to match another utterance from said one of said users to one of said user-dependent word models to find a further match in the current context, determining probabilities of whether the possible match and the further match exceed a threshold; and
  
  if the probabilities of the possible match and the further match fail to exceed the threshold, informing that the words fail to match any of the words acceptable in the current context and thereafter modifying, based on the words in the utterances of said user, the user-dependent word models and without modifying the provided set of user-independent word models.
7. The method according to claim 6 further including:
8. The method according to claim 7 wherein said user-dependent word models are stored in a separate memory location from said user-independent word models.
9. The method according to claim 6 further including:
- inviting a new user to speak training utterances of a word upon a predetermined number of failures to identify the word among said user-independent word models when no model for the word is present in said user-dependent models;
  
  deriving a word model from said training utterances; and
  
  storing the derived word model in said set of user-dependent word models.
10. The method according to claim 9 wherein said user-dependent word models are stored in a separate memory location from said user-independent word models.
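Claims 6, 8, and 10 turn on two storage properties: each user has an associated set of user-dependent models, and those sets are kept apart from the user-independent models, which are never modified during adaptation. One way to picture this, purely as an illustration, is a store that exposes the shared set through a read-only view and keeps per-user sets in a separate container. The class `ModelStore` and its method names are hypothetical, and separate Python containers are only a loose analogy for the "separate memory location" language in the claims.

```python
from types import MappingProxyType


class ModelStore:
    def __init__(self, shared_models):
        # Read-only view: assigning through it raises TypeError.
        self.user_independent = MappingProxyType(dict(shared_models))
        # Held in a separate container, keyed by user.
        self._user_dependent = {}

    def models_for(self, user):
        """Return the mutable user-dependent model set for one user."""
        return self._user_dependent.setdefault(user, {})

    def adapt(self, user, word, model):
        """Store or replace one user's model; the shared set is untouched."""
        self.models_for(user)[word] = model

    def lookup(self, user, word):
        """Return (user-independent model, user-dependent model); either may be None."""
        return self.user_independent.get(word), self.models_for(user).get(word)
```

Under this layout, adapting a model for one user changes neither the shared set nor any other user's set.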

Current Assignee: Avaya Incorporated
Original Assignee: Nortel Networks Limited (Nortel Networks Corporation)
Inventors: Lin, Lin; Lin, Ping
Primary Examiner(s): Banks-Harold, Marsha D.
Assistant Examiner(s): Storm, Donald L.
Application Number: US09/281,078
Time in Patent Office: 1,337 Days
Field of Search: 704/244, 704/251, 379/88.01, 379/88.03
US Class Current: 704/244
CPC Class Codes: G10L 15/065 (Adaptation)
