Automatic speech recognition for disfluent speech
Abstract
A system and method of processing disfluent speech at an automatic speech recognition (ASR) system includes: receiving speech from a speaker via a microphone; determining the received speech includes disfluent speech; accessing a disfluent speech grammar or acoustic model in response to the determination; and processing the received speech using the disfluent speech grammar.
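The flow described in the abstract (detect disfluent speech, access a matching grammar, then process with it) can be sketched in Python. This is an illustrative sketch only, not the patented implementation: the token-level detector, the two grammar functions, and all names (`looks_disfluent`, `standard_grammar`, `disfluent_grammar`, `recognize`) are assumptions, and a real ASR system would operate on acoustic features rather than word tokens.

```python
from typing import Callable, List, Sequence

# A "grammar" here is any function that turns raw tokens into a hypothesis.
Grammar = Callable[[Sequence[str]], List[str]]


def looks_disfluent(tokens: Sequence[str]) -> bool:
    """Toy detector (assumption): flags an immediately repeated word."""
    return any(a == b for a, b in zip(tokens, tokens[1:]))


def standard_grammar(tokens: Sequence[str]) -> List[str]:
    """Stand-in for the default grammar: tokens pass through unchanged."""
    return list(tokens)


def disfluent_grammar(tokens: Sequence[str]) -> List[str]:
    """Stand-in for a disfluent speech grammar: collapses immediate repeats."""
    hypothesis: List[str] = []
    for token in tokens:
        if not hypothesis or hypothesis[-1] != token:
            hypothesis.append(token)
    return hypothesis


def recognize(
    tokens: Sequence[str],
    detector: Callable[[Sequence[str]], bool] = looks_disfluent,
) -> List[str]:
    """Determine disfluency, access the matching grammar, then process."""
    grammar: Grammar = disfluent_grammar if detector(tokens) else standard_grammar
    return grammar(tokens)


if __name__ == "__main__":
    print(recognize(["call", "call", "call", "home"]))
    print(recognize(["call", "home"]))
```

The detector is passed in as a parameter so that a trained classifier could replace the toy rule without changing the control flow.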
12 Claims
1. A method of processing disfluent speech at an automatic speech recognition (ASR) system having an electronic processor and a database, comprising the steps of:
(a) receiving speech from a speaker via a microphone;
(b) determining the received speech includes disfluent speech and determining the classification of received speech using a Hidden Markov Model (HMM), wherein the HMM is trained using speakers that have a speech disorder or speakers that speak as though they have a speech disorder;
(c) accessing a disfluent speech grammar or acoustic model in response to step (b); and
(d) after steps (b) and (c), processing the received speech using the disfluent speech grammar, wherein the processing of the received speech includes using the disfluent speech grammar to generate a more accurate hypothesis of speech content, and wherein one or more of steps (b), (c), and (d) are performed using the electronic processor, and at least some data relating to the received speech, the HMM, the disfluent speech grammar, or the acoustic model is stored in the database.
Dependent claims: 2, 3, 4, 5
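One common way to classify an utterance with HMMs, shown here as a hedged sketch of step (b), is to score the observation sequence against one model per class with the forward algorithm and pick the class with the highest likelihood. Everything specific below is an assumption for illustration: the two-symbol observation alphabet (0 = ordinary frame, 1 = block or repetition frame), the class names, and the hand-set probabilities, which are not trained on any speakers as the claim requires.

```python
import math
from typing import Dict, List, Sequence


class DiscreteHMM:
    """Discrete-observation HMM with a scaled forward algorithm."""

    def __init__(
        self,
        start: List[float],        # start[i]   = P(first state is i)
        trans: List[List[float]],  # trans[i][j] = P(next state j | state i)
        emit: List[List[float]],   # emit[i][k]  = P(symbol k | state i)
    ) -> None:
        self.start, self.trans, self.emit = start, trans, emit

    def log_likelihood(self, obs: Sequence[int]) -> float:
        """Return log P(obs | model), rescaling each step to avoid underflow."""
        if not obs:
            raise ValueError("observation sequence must be non-empty")
        n = len(self.start)
        alpha = [self.start[i] * self.emit[i][obs[0]] for i in range(n)]
        log_prob = 0.0
        for t in range(len(obs)):
            if t > 0:
                alpha = [
                    sum(alpha[i] * self.trans[i][j] for i in range(n))
                    * self.emit[j][obs[t]]
                    for j in range(n)
                ]
            scale = sum(alpha)
            if scale == 0.0:
                return float("-inf")
            log_prob += math.log(scale)
            alpha = [a / scale for a in alpha]
        return log_prob


# Hand-set toy parameters (assumption); a real system would train these.
MODELS: Dict[str, DiscreteHMM] = {
    "fluent": DiscreteHMM(
        start=[0.9, 0.1],
        trans=[[0.9, 0.1], [0.8, 0.2]],
        emit=[[0.95, 0.05], [0.3, 0.7]],
    ),
    "disfluent": DiscreteHMM(
        start=[0.5, 0.5],
        trans=[[0.5, 0.5], [0.3, 0.7]],
        emit=[[0.9, 0.1], [0.1, 0.9]],
    ),
}


def classify(obs: Sequence[int]) -> str:
    """Pick the class whose HMM assigns the sequence the highest likelihood."""
    return max(MODELS, key=lambda name: MODELS[name].log_likelihood(obs))
```

With these toy parameters, a sequence dominated by symbol 1 scores higher under the `disfluent` model, while an all-zero sequence scores higher under `fluent`. Adding further models to `MODELS` would extend the same comparison to more classes.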
6. A method of processing disfluent speech at an automatic speech recognition (ASR) system having an electronic processor and a database, comprising the steps of:
(a) receiving speech from a speaker via a microphone;
(b) classifying the received speech according to one of a plurality of different disfluent speech types, wherein the plurality of different disfluent speech types includes stammer, excessive breath, or nasality;
(c) modifying one or more ASR variables that compensate for the classified disfluent speech type; and
(d) after steps (b) and (c), processing the received speech using the modified ASR variables, wherein the processing of the received speech includes using the modified ASR variables to generate a more accurate hypothesis of speech content, and wherein one or more of steps (b), (c), and (d) are performed using the electronic processor, and at least some data relating to the received speech or the modified ASR variables is stored in the database.
Dependent claims: 7, 8, 9, 10, 11
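Step (c) can be pictured as a lookup from the classified disfluent speech type to a set of variable overrides. The sketch below is hypothetical: the field names, default values, and the override table are assumptions. The response time and repeated-command variables are borrowed from claim 12, while the `acoustic_model` field and its `"nasal_adapted"` value are invented purely for illustration.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Dict


class DisfluencyType(Enum):
    STAMMER = "stammer"
    EXCESSIVE_BREATH = "excessive breath"
    NASALITY = "nasality"


@dataclass(frozen=True)
class AsrVariables:
    """Hypothetical tunable ASR settings; names and defaults are assumptions."""
    prompt_timeout_s: float = 5.0          # response time for audible prompts
    allow_repeated_commands: bool = False  # tolerate "call call home"
    acoustic_model: str = "default"


# Illustrative overrides per disfluent speech type (values are made up).
COMPENSATION: Dict[DisfluencyType, Dict[str, object]] = {
    DisfluencyType.STAMMER: {
        "prompt_timeout_s": 10.0,
        "allow_repeated_commands": True,
    },
    DisfluencyType.EXCESSIVE_BREATH: {"prompt_timeout_s": 8.0},
    DisfluencyType.NASALITY: {"acoustic_model": "nasal_adapted"},
}


def compensate(base: AsrVariables, kind: DisfluencyType) -> AsrVariables:
    """Return a modified copy of the variables for the classified type."""
    return replace(base, **COMPENSATION[kind])


if __name__ == "__main__":
    print(compensate(AsrVariables(), DisfluencyType.STAMMER))
```

Using a frozen dataclass keeps the baseline settings unchanged, so each utterance can start from the same defaults before the modified variables are applied in step (d).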
12. A method of processing disfluent speech at an automatic speech recognition (ASR) system having an electronic processor and a database, comprising the steps of:
(a) receiving speech from a speaker via a microphone;
(b) classifying the received speech according to one of a plurality of different disfluent speech types, wherein the classification of received speech uses a Hidden Markov Model (HMM), wherein the HMM is trained using speakers and the HMM is trained to identify a within word disfluency resulting from a stammer, excessive breath, slow rate, or nasality;
(c) modifying one or more ASR variables that compensate for the stammer, excessive breath, slow rate, or nasality of the classified disfluent speech type, wherein the one or more ASR variables includes increasing response time for audible prompts, allowing repetitive command words, or both; and
(d) processing the received speech using the modified ASR variables by increasing response time for audible prompts, allowing repetitive command words, or both, wherein one or more of steps (b), (c), and (d) are performed using the electronic processor, and at least some data relating to the received speech, the HMM, or the modified ASR variables is stored in the database.
Specification