Speech recognition system and method permitting user customization
Abstract
A system and method for speech recognition includes a speaker-independent set of stored word representations derived from speech of many users deemed to be typical speakers and for use by all users, and may further include speaker-dependent sets of stored word representations specific to each user. The speaker-dependent sets may be used to store custom commands, so that a user may replace default commands to customize and simplify use of the system. Utterances from a user which match stored words in either set according to the ordering rules are reported as words.
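As an illustration only, the two-set lookup described in the abstract might be sketched as follows. The abstract does not define its ordering rules, so the order used here (the user's own set first, then the shared set) is an assumption. The feature vectors, the distance threshold, and every name in the block are likewise hypothetical.

```python
from math import dist

# Hypothetical word models: each word maps to a small feature vector.
SPEAKER_INDEPENDENT = {"call": (1.0, 0.0), "hangup": (0.0, 1.0)}
SPEAKER_DEPENDENT = {"alice": {"ring": (0.9, 0.4)}}

MATCH_THRESHOLD = 0.3  # assumed maximum distance that still counts as a match


def best_match(features, models):
    """Return (word, distance) for the closest model, or (None, inf) if there are none."""
    best_word, best_distance = None, float("inf")
    for word, model in models.items():
        d = dist(features, model)
        if d < best_distance:
            best_word, best_distance = word, d
    return best_word, best_distance


def recognize(user, features):
    """Try the user's own models first, then the shared set (assumed ordering)."""
    for models in (SPEAKER_DEPENDENT.get(user, {}), SPEAKER_INDEPENDENT):
        word, distance = best_match(features, models)
        if word is not None and distance <= MATCH_THRESHOLD:
            return word  # reported as a recognized word
    return None  # no stored word in either set matched
```

A user without a speaker-dependent set simply falls through to the shared set, which mirrors the abstract's statement that the speaker-independent set is for use by all users.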
9 Claims
1. A speech recognition system comprising computer memory storing:
a first set of speaker-independent word models used to match a word in an utterance of a user with a word model in said first set, wherein said first set of word models includes models for each of a plurality of words;
a second set of speaker dependent word models derived from speech of a particular user and used to match a word in an utterance of said particular user, wherein said second set of word models includes models for at least some of said plurality of words; and
a program portion used to identify words in utterances of said particular user by attempting to match portions of an audio signal with:
word models among said first set; and
word models among said second set, wherein said identified words in the utterances of said particular user include user-selected words for invoking commands.
2. A method of operating a speech recognition system comprising:
storing a first set of speaker-independent word models used to match a word in an utterance of any user with a word model in said first set, said first set of word models including models for each of a plurality of words;
storing a second set of speaker dependent word models derived from speech of a particular user, said second set of word models including models for at least some of said plurality of words and at least one model of said second set chosen by said particular user to initiate performance of at least one of a plurality of system commands;
recognizing words in utterances of said particular user by attempting to match portions of an audio signal with:
word models among said first set; and
word models among said second set; and
performing at least one system command in response to a recognized word within said utterances of said particular user.
4. A method of operating a speech recognition system comprising:
storing a first set of speaker-independent word models used to match a word in an utterance of any user with a word model in said first set;
storing a second set of speaker dependent word models derived from speech of a particular user by:
inviting said particular user upon first use of said speech recognition system to speak training words for deriving said second set;
deriving said second set from said training words; and
storing said second set;
associating at least one stored word model with a command token also associated with a default command word model; and
recognizing words in utterances of said particular user by attempting to match portions of an audio signal with:
word models among said first set; and
word models among said second set.
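The training and association steps of claim 4 can be pictured with a minimal sketch. The claim does not say how a model is derived from training words, so averaging repeated feature vectors is an assumption made here for illustration. `UserProfile`, `derive_model`, `train`, `associate`, and the token name `CMD_DELETE` are all hypothetical.

```python
from statistics import fmean

# Default (speaker-independent) command word -> command token. Names are hypothetical.
DEFAULT_COMMANDS = {"delete": "CMD_DELETE"}


class UserProfile:
    """Holds one user's speaker-dependent models and custom command bindings."""

    def __init__(self):
        self.models = {}  # word -> derived model (the second set)
        self.tokens = {}  # custom word -> command token

    @property
    def is_new(self):
        # A profile with no stored models is treated as a first use.
        return not self.models


def derive_model(samples):
    """Average repeated feature vectors of one word into one model (assumed method)."""
    return tuple(fmean(column) for column in zip(*samples))


def train(profile, training_samples):
    """Derive and store a model for each training word the user spoke."""
    for word, samples in training_samples.items():
        profile.models[word] = derive_model(samples)


def associate(profile, custom_word, default_word):
    """Bind a trained word to the token already used by a default command word."""
    if custom_word not in profile.models:
        raise KeyError(f"no stored model for {custom_word!r}")
    profile.tokens[custom_word] = DEFAULT_COMMANDS[default_word]
```

In this sketch a system would check `profile.is_new` to decide whether to invite the user to speak training words. After `associate`, the default word and the custom word resolve to the same command token.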
5. A method of operating a speech recognition system comprising:
storing a first set of speaker-independent word models used to match a word in an utterance of any user with a word model in said first set;
storing a second set of speaker dependent word models derived from speech of a particular user by:
determining a likelihood of recognizing a spoken word using said first set;
deriving a word model from a spoken word marginally recognized using said first set;
storing said word model in said second set; and
recognizing words in utterances of said particular user by attempting to match portions of an audio signal with:
word models among said first set; and
word models among said second set.
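One possible reading of the adaptation step in claim 5 is sketched below. The claim does not define "marginally recognized", so this sketch assumes a likelihood score between 0 and 1 and treats scores in a middle band as marginal. Both thresholds and all function names are hypothetical.

```python
ACCEPT = 0.8    # assumed: scores at or above this are confident matches
MARGINAL = 0.5  # assumed: scores in [MARGINAL, ACCEPT) count as marginal


def classify(score):
    """Label a likelihood score obtained from the speaker-independent set."""
    if score >= ACCEPT:
        return "confident"
    if score >= MARGINAL:
        return "marginal"
    return "rejected"


def adapt(second_set, word, features, score):
    """Store the spoken features as a speaker-dependent model if the match was marginal.

    Returns True when a model was stored, False otherwise.
    """
    if classify(score) == "marginal":
        second_set[word] = tuple(features)
        return True
    return False
```

Confident matches need no user-specific model and rejected ones give no reliable word label, which is why only the middle band is stored in this sketch.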
6. A method of enhancing speech recognition comprising:
providing a set of user-independent word models derived from utterances of a plurality of speakers, said set of user-independent word models including models for each of a plurality of words;
providing a set of user-dependent word models for ones of a plurality of users each derived from utterances of one of said users, said set of user-dependent word models including models for at least some of said plurality of words;
matching an utterance from one of said users to one of said user-independent word models;
matching an other utterance from said one of said users to one of said user-dependent word models; and
responsive to matching either said utterance to said one of said user-independent word models or said other utterance to said one of said user-dependent word models, initiating a command associated with both said one of said user-independent word models and said one of said user-dependent word models.
7. A method of enhancing speech recognition comprising:
providing a set of user-independent word models derived from utterances of a plurality of speakers, at least one user-independent word model representing a first word and associated with a command token;
providing a set of user-dependent word models for ones of a plurality of users each derived from utterances of one of said users, at least one user-dependent word model representing a second word different than the first word and associated with said command token, said user-dependent word models each derived by:
inviting a new user to speak training words for deriving a set of user-dependent word models;
deriving said set of user-dependent models from said training words;
storing said set of user-dependent word models; and
associating a user-dependent word model with a command token designated by said one of said users;
matching an utterance from one of said users to one of said user-independent word models; and
matching an other utterance from said one of said users to one of said user-dependent word models.
Specification