Automatic speech recognition for disfluent speech

US 10,255,913 B2
Filed: 02/17/2016
Issued: 04/09/2019
Est. Priority Date: 02/17/2016
Status: Active Grant

Abstract

A system and method of processing disfluent speech at an automatic speech recognition (ASR) system includes: receiving speech from a speaker via a microphone; determining the received speech includes disfluent speech; accessing a disfluent speech grammar or acoustic model in response to the determination; and processing the received speech using the disfluent speech grammar.

24 Citations

12 Claims

1. A method of processing disfluent speech at an automatic speech recognition (ASR) system having an electronic processor and a database, comprising the steps of:
- (a) receiving speech from a speaker via a microphone;
  
  (b) determining the received speech includes disfluent speech and determining the classification of received speech using a Hidden Markov Model (HMM), wherein the HMM is trained using speakers that have a speech disorder or speakers that speak as though they have a speech disorder;
  
  (c) accessing a disfluent speech grammar or acoustic model in response to step (b); and
  
  (d) after steps (b) and (c), processing the received speech using the disfluent speech grammar, wherein the processing of the received speech includes using the disfluent speech grammar to generate a more accurate hypothesis of speech content, and wherein one or more of steps (b), (c), and (d) are performed using the electronic processor, and at least some data relating to the received speech, the HMM, the disfluent speech grammar, or the acoustic model is stored in the database.
- Dependent claims (2, 3, 4, 5):
  - 2. The method of claim 1, wherein the speaker is a vehicle occupant.
  - 3. The method of claim 1, further comprising the step of classifying the received speech according to one or more types of disfluent speech.
  - 4. The method of claim 1, wherein the disfluent speech grammar or acoustic model is stored at a vehicle.
  - 5. The method of claim 1, wherein the received speech is determined to include disfluent speech based on a speech hypothesis that falls below a predetermined threshold.
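
The threshold test of claim 5, followed by the grammar switch of steps (c) and (d), can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Hypothesis` type, the `Recognizer` callable signature, the confidence scale of 0 to 1, and the default threshold of 0.6 do not come from the patent, and the HMM-based determination of step (b) is left out.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Hypothesis:
    text: str
    confidence: float  # assumed to lie in [0, 1]


# Hypothetical recognizer interface: raw audio in, best hypothesis out.
Recognizer = Callable[[bytes], Hypothesis]


def recognize_with_fallback(
    audio: bytes,
    standard: Recognizer,
    disfluent: Recognizer,
    threshold: float = 0.6,  # illustrative value, not from the patent
) -> Tuple[Hypothesis, bool]:
    """Return (hypothesis, used_disfluent_grammar).

    The first pass uses the standard grammar. A confidence below the
    threshold is taken as a sign of disfluent speech, and the audio is
    processed again with the disfluent speech grammar.
    """
    first = standard(audio)
    if first.confidence >= threshold:
        return first, False
    return disfluent(audio), True
```

With stub recognizers, a low-confidence first pass such as `Hypothesis("c-c-call home", 0.3)` would trigger the second pass, and the caller receives the hypothesis produced with the disfluent grammar together with a flag saying so.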

6. A method of processing disfluent speech at an automatic speech recognition (ASR) system having an electronic processor and a database, comprising the steps of:
- (a) receiving speech from a speaker via a microphone;
  
  (b) classifying the received speech according to one of a plurality of different disfluent speech types, wherein the plurality of different disfluent speech types includes stammer, excessive breath, or nasality;
  
  (c) modifying one or more ASR variables that compensate for the classified disfluent speech type; and
  
  (d) after steps (b) and (c), processing the received speech using the modified ASR variables, wherein the processing of the received speech includes using the modified ASR variables to generate a more accurate hypothesis of speech content, and wherein one or more of steps (b), (c), and (d) are performed using the electronic processor, and at least some data relating to the received speech or the modified ASR variables is stored in the database.
- Dependent claims (7, 8, 9, 10, 11):
  - 7. The method of claim 6, wherein the speaker is a vehicle occupant.
  - 8. The method of claim 6, wherein step (b) further comprises determining the classification of received speech using a Hidden Markov Model (HMM).
  - 9. The method of claim 8, wherein the HMM is trained using speakers providing disfluent speech.
  - 10. The method of claim 6, wherein the disfluent speech grammar or acoustic model is stored at a vehicle.
  - 11. The method of claim 6, wherein the received speech is determined to include disfluent speech based on a speech hypothesis that falls below a predetermined threshold.
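
One common way to classify with HMMs, as in claims 8 and 9, is to keep one model per class and pick the class whose model gives the observation sequence the highest likelihood. The sketch below does this with the scaled forward algorithm. It is only an illustration under loud assumptions: the two-symbol alphabet (0 = new syllable, 1 = repeat of the previous syllable), the "fluent" baseline class, and all probabilities are hand-picked toy values, not trained parameters and not taken from the patent.

```python
import math
from typing import Dict, List, Sequence


class DiscreteHMM:
    """Discrete-observation HMM with start, transition, and emission tables."""

    def __init__(self, start: List[float], trans: List[List[float]],
                 emit: List[List[float]]) -> None:
        self.start, self.trans, self.emit = start, trans, emit

    def log_likelihood(self, obs: Sequence[int]) -> float:
        """Scaled forward algorithm: log P(obs | model), obs non-empty."""
        n = len(self.start)
        alpha = [self.start[s] * self.emit[s][obs[0]] for s in range(n)]
        log_p = 0.0
        for t in range(len(obs)):
            if t > 0:
                alpha = [
                    sum(alpha[r] * self.trans[r][s] for r in range(n))
                    * self.emit[s][obs[t]]
                    for s in range(n)
                ]
            scale = sum(alpha)
            if scale == 0.0:
                return float("-inf")  # sequence impossible under this model
            log_p += math.log(scale)
            alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        return log_p


def classify(obs: Sequence[int], models: Dict[str, DiscreteHMM]) -> str:
    """Return the name of the model with the highest log-likelihood."""
    return max(models, key=lambda name: models[name].log_likelihood(obs))


# Toy models (assumed values). State 1 of the stammer model favors repeats.
MODELS = {
    "fluent": DiscreteHMM(
        start=[0.9, 0.1],
        trans=[[0.9, 0.1], [0.8, 0.2]],
        emit=[[0.95, 0.05], [0.6, 0.4]],
    ),
    "stammer": DiscreteHMM(
        start=[0.5, 0.5],
        trans=[[0.5, 0.5], [0.3, 0.7]],
        emit=[[0.8, 0.2], [0.2, 0.8]],
    ),
}
```

Calling `classify([0, 1, 1, 1, 0, 1, 1], MODELS)` favors the stammer model because the sequence is dominated by repeats, while a sequence of all zeros favors the fluent baseline. A real system would use acoustic features and trained parameters in place of these symbols and tables.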

12. A method of processing disfluent speech at an automatic speech recognition (ASR) system having an electronic processor and a database, comprising the steps of:
- (a) receiving speech from a speaker via a microphone;
  
  (b) classifying the received speech according to one of a plurality of different disfluent speech types, wherein the classification of received speech uses a Hidden Markov Model (HMM), wherein the HMM is trained using speakers and the HMM is trained to identify a within word disfluence resulting from a stammer, excessive breath, slow rate, or nasality;
  
  (c) modifying one or more ASR variables that compensate for the stammer, excessive breath, slow rate, or nasality of the classified disfluent speech type, wherein the one or more ASR variables includes increasing response time for audible prompts, allowing repetitive command words, or both; and
  
  (d) processing the received speech using the modified ASR variables by increasing response time for audible prompts, allowing repetitive command words, or both, wherein one or more of steps (b), (c), and (d) are performed using the electronic processor, and at least some data relating to the received speech, the HMM, or the modified ASR variables is stored in the database.
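
The two ASR variables named in claim 12, a longer response time for audible prompts and tolerance for repeated command words, can be sketched as a small settings object. The names `AsrVariables`, `compensate`, and `normalize_commands`, the default timeout of 5 seconds, and the doubling factor are all hypothetical. Treating "allowing repetitive command words" as collapsing consecutive duplicates is one possible reading, not the patent's stated implementation.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence

# Disfluent speech types listed in claim 12 (identifier spellings assumed).
DISFLUENT_TYPES = {"stammer", "excessive_breath", "slow_rate", "nasality"}


@dataclass(frozen=True)
class AsrVariables:
    prompt_timeout_s: float = 5.0          # assumed default response time
    allow_repeated_commands: bool = False


def compensate(base: AsrVariables, disfluency_type: str) -> AsrVariables:
    """Return modified variables for a classified disfluent speech type."""
    if disfluency_type in DISFLUENT_TYPES:
        return replace(
            base,
            prompt_timeout_s=base.prompt_timeout_s * 2,  # illustrative factor
            allow_repeated_commands=True,
        )
    return base


def normalize_commands(words: Sequence[str],
                       variables: AsrVariables) -> List[str]:
    """Collapse consecutive repeats when repeated commands are allowed."""
    if not variables.allow_repeated_commands:
        return list(words)
    collapsed: List[str] = []
    for word in words:
        if not collapsed or collapsed[-1] != word:
            collapsed.append(word)
    return collapsed
```

Under these assumptions, `compensate(AsrVariables(), "stammer")` doubles the prompt timeout and enables repeat handling, so a word sequence such as `["call", "call", "call", "home"]` is reduced to `["call", "home"]` before command matching.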

Current Assignee: GM Global Technology Operations LLC (General Motors Company)
Original Assignee: GM Global Technology Operations LLC (General Motors Company)
Inventors: Zhao, Xufang; Talwar, Gaurav
Primary Examiner(s): Colucci, Michael C
Application Number: US15/046,303
Publication Number: US 20170236511A1
Time in Patent Office: 1,147 Days
Field of Search: 704500, 704240, 704251, 704257, 704235, 704254
US Class Current
CPC Class Codes
- G10L 15/01: Assessment or evaluation of...
- G10L 15/10: using distance or distortio...
- G10L 15/142: Hidden Markov Models [HMMs]
- G10L 15/144: Training of HMMs
- G10L 15/19: Grammatical context, e.g. d...
- G10L 15/22: Procedures used during a sp...
- G10L 2015/223: Execution procedure of a sp...
- G10L 25/60: for measuring the quality o...
