User-specific confidence thresholds for speech recognition
Abstract
A method of automatic speech recognition includes receiving an utterance from a user via a microphone that converts the utterance into a speech signal, pre-processing the speech signal using a processor to extract acoustic data from the received speech signal, and identifying at least one user-specific characteristic in response to the extracted acoustic data. The method also includes determining a user-specific confidence threshold responsive to the at least one user-specific characteristic, and using the user-specific confidence threshold to recognize the utterance received from the user and/or to assess confusability of the utterance with stored vocabulary.
20 Claims
1. A method of automatic speech recognition, comprising the steps of:

(a) receiving an utterance from a user via a microphone that converts the utterance into a speech signal;

(b) pre-processing the speech signal using a processor to extract acoustic data from the received speech signal;

(c) identifying at least one user-specific characteristic in response to the extracted acoustic data, wherein the at least one user-specific characteristic comprises a plurality of confidence scores associated with failed attempts of the user to store a nametag; and

(d) determining a user-specific confidence threshold responsive to the at least one user-specific characteristic, wherein the determination is carried out by calculating an average of the plurality of confidence scores and setting the user-specific confidence threshold to a value greater than or equal to the calculated average.

Dependent claims: 2-10.
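Step (d) of claim 1 describes a concrete computation: average the confidence scores from the user's failed nametag-storage attempts, then set the threshold at or above that average. A minimal sketch of that calculation, assuming scores are numeric values and using an illustrative `margin` parameter (not named in the claim) to keep the result greater than or equal to the average:

```python
def user_specific_threshold(failed_scores, margin=0.0):
    """Sketch of claim 1, step (d): set the user-specific confidence
    threshold to the average of the confidence scores from the user's
    failed nametag-storage attempts, optionally raised by a margin so it
    stays greater than or equal to that average. The margin parameter is
    an illustrative assumption, not part of the claim language."""
    if not failed_scores:
        raise ValueError("at least one failed-attempt score is required")
    average = sum(failed_scores) / len(failed_scores)
    return average + margin


# Example: two failed attempts scored 0.4 and 0.6 give an average of 0.5.
threshold = user_specific_threshold([0.4, 0.6])
```

With `margin=0.0` the threshold equals the average exactly, which satisfies the claim's "greater than or equal to" condition; a positive margin would make recognition stricter for this user.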
11. A method of automatic speech recognition, comprising the steps of:

(a) receiving an utterance from a user via a microphone that converts the utterance into a speech signal;

(b) pre-processing the speech signal using a processor to extract acoustic data from the received speech signal;

(c) identifying at least one user-specific characteristic including pitch and at least one formant in response to the extracted acoustic data;

(d) determining a user-specific confidence threshold responsive to the identified at least one user-specific characteristic, wherein the determination comprises using a multiple regression calculation including the identified user-specific pitch and at least one formant and a pitch coefficient and at least one formant coefficient developed from a plurality of development speakers; and

(e) decoding the acoustic data based on the user-specific confidence threshold to produce a plurality of hypotheses for the received utterance, including calculating confidence scores for the hypotheses.

Dependent claims: 12-16.
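Step (d) of claim 11 amounts to evaluating a linear (multiple regression) model: the user's measured pitch and formant frequencies are multiplied by coefficients fit offline on a pool of development speakers, and the weighted sum gives the threshold. A sketch of the evaluation step, assuming frequencies in Hz and purely illustrative coefficient values; the intercept term is an assumption, as the claim does not specify the exact regression form:

```python
def regression_threshold(pitch, formants, intercept, pitch_coef, formant_coefs):
    """Sketch of claim 11, step (d): a multiple-regression model maps the
    user's pitch and formant frequencies to a user-specific confidence
    threshold. The coefficients are assumed to have been fit beforehand on
    development speakers; all names and values here are illustrative."""
    if len(formants) != len(formant_coefs):
        raise ValueError("one coefficient is required per formant")
    return intercept + pitch_coef * pitch + sum(
        coef * freq for coef, freq in zip(formant_coefs, formants)
    )


# Example with made-up coefficients: pitch 200 Hz, formants F1/F2 at
# 700 Hz and 1200 Hz.
threshold = regression_threshold(
    pitch=200.0,
    formants=[700.0, 1200.0],
    intercept=0.1,
    pitch_coef=0.001,
    formant_coefs=[1e-4, 1e-4],
)
```

Fitting the coefficients themselves would be an ordinary least-squares problem over the development speakers' (pitch, formant, ideal-threshold) data, which the claim leaves to the development phase.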
17. A method of automatic speech recognition, comprising the steps of:

(a) receiving an utterance from a user via a microphone that converts the utterance into a speech signal;

(b) pre-processing the speech signal using a processor to extract acoustic data from the received speech signal;

(c) identifying at least one user-specific characteristic including a plurality of confidence scores associated with failed attempts of the user to store a nametag;

(d) determining a user-specific confidence threshold responsive to the at least one user-specific characteristic;

(e) decoding the acoustic data to produce a plurality of hypotheses for the received utterance, including calculating confidence scores for the hypotheses; and

(f) post-processing the plurality of hypotheses, including using the user-specific confidence threshold to assess confusability of the utterance with stored vocabulary, wherein the user-specific confidence threshold is a confusability confidence threshold.

Dependent claims: 18-20.
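Step (f) of claim 17 uses the threshold not to accept a recognition result but to flag confusability with stored vocabulary. The claim does not spell out the decision rule, so the following is only one plausible reading, stated as an assumption: if more than one hypothesis drawn from the stored vocabulary scores at or above the confusability threshold, the utterance is considered too confusable (for example, too similar to an existing nametag) to store:

```python
def is_confusable(hypothesis_scores, confusability_threshold):
    """Sketch of one possible reading of claim 17, step (f): after
    decoding, each hypothesis's confidence score is compared with the
    user-specific confusability threshold. If more than one stored-
    vocabulary hypothesis clears the threshold, the utterance is deemed
    confusable. This decision rule is an assumption; the claim only
    requires that the threshold be used to assess confusability."""
    above = [s for s in hypothesis_scores if s >= confusability_threshold]
    return len(above) > 1


# Example: two stored-vocabulary hypotheses score above 0.7, so the new
# utterance would be flagged as confusable under this reading.
flagged = is_confusable([0.9, 0.8, 0.2], confusability_threshold=0.7)
```

Because the threshold is user-specific (per steps (c) and (d)), a user whose failed attempts tend to score high would get a higher bar before two hypotheses are treated as dangerously similar.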
Specification