Realtime assessment of TTS quality using single ended audio quality measurement

US 9,865,249 B2
Filed: 03/22/2016
Issued: 01/09/2018
Est. Priority Date: 03/22/2016
Status: Active Grant

Abstract

A system and method of regulating speech output by a text-to-speech (TTS) system includes: evaluating speech that has been converted from text using an initial speech quality test before presentation to a user; applying a classification test to the evaluated speech if the evaluated speech falls below a threshold based on the initial speech quality test; generating an abnormal speech classification for the evaluated speech; and applying a corrective action to the evaluated speech based on the abnormal speech classification.

13 Claims

1. A method of regulating speech output by a text-to-speech (TTS) system having an electronic processor and a database, comprising the steps of:
- (a) evaluating speech that has been converted from text using an initial speech quality test before presentation to a user;
  
  (b) applying a classification test to the evaluated speech when the evaluated speech falls below a threshold based on the initial speech quality test;
  
  (c) generating an abnormal speech classification for the evaluated speech; and
  
  (d) applying a corrective action to the evaluated speech based on the abnormal speech classification in step (c), wherein one or more of steps (a), (b), (c), and (d) are performed using the electronic processor, and at least some data relating to the initial speech quality test, the classification test, or the corrective action is stored in the database.
- View Dependent Claims (2, 3, 4, 5, 6, 7)
- - 2. The method of claim 1, wherein the initial speech quality test is a non-intrusive speech quality assessment.
  - 3. The method of claim 1, wherein the initial speech quality test is defined by the International Telecommunication Union (ITU) P.563 algorithm.
  - 4. The method of claim 1, wherein the classification test comprises one or more Hidden Markov Models (HMMs) that are each trained using training speech including one abnormal speech type.
  - 5. The method of claim 1, wherein the classification of evaluated speech includes an improper pause classification, an abnormal speaking rate classification, a poor enunciation classification, or an abnormal intonation classification.
  - 6. The method of claim 1, further comprising the step of audibly presenting the corrected speech to the user.
  - 7. The method of claim 1, further comprising the step of evaluating speech that has been converted from text using a speech model built from a user's voice.
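
Claim 1 and its dependents describe a gated pipeline: score the synthesized speech first, classify it only when the score is low, then pick a correction. The control flow can be sketched as below. This is an illustration under stated assumptions, not the patented implementation: `quality_test` and `classify` are hypothetical callables standing in for a non-intrusive scorer such as ITU P.563 (claim 3) and an HMM-based classifier (claim 4), and the threshold value, the action names, and the in-memory `log` list standing in for the database are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

# Assumed value on a MOS-like 1-5 scale; the claims do not state a threshold.
QUALITY_THRESHOLD = 3.0

# Classification labels follow claim 5; the action names are hypothetical.
CORRECTIVE_ACTIONS = {
    "improper_pause": "adjust_pause_placement",
    "abnormal_speaking_rate": "adjust_speaking_rate",
    "poor_enunciation": "resynthesize_segment",
    "abnormal_intonation": "adjust_prosody",
}


@dataclass
class Assessment:
    score: float
    classification: Optional[str] = None
    action: Optional[str] = None


def regulate(audio: Sequence[float],
             quality_test: Callable[[Sequence[float]], float],
             classify: Callable[[Sequence[float]], str],
             threshold: float = QUALITY_THRESHOLD,
             log: Optional[List[Assessment]] = None) -> Assessment:
    """Gate classification and correction on an initial quality score."""
    score = quality_test(audio)                # step (a): initial quality test
    result = Assessment(score)
    if score < threshold:                      # step (b): only low-scoring speech
        label = classify(audio)                # step (c): abnormal speech class
        result.classification = label
        # step (d): choose a corrective action for the class (assumed fallback)
        result.action = CORRECTIVE_ACTIONS.get(label, "resynthesize_segment")
    if log is not None:
        log.append(result)                     # stand-in for the claimed database
    return result
```

Speech that scores at or above the threshold passes through with no classification, so the more expensive classifier runs only on output that already looks degraded.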

8. A method of regulating the quality of speech output by a text-to-speech (TTS) system having an electronic processor and a database, comprising the steps of:
- (a) applying a plurality of Hidden Markov Models (HMMs) to speech converted from text before presentation to a user, wherein the HMMs have each been trained using training speech that includes a different type of speech deficiency;
  
  (b) determining a confidence value for the speech using each of the plurality of HMMs;
  
  (c) generating a reference confidence value for the speech converted from text using an HMM trained using live reference speech;
  
  (d) determining whether any of the confidence values determined in step (b) indicate an abnormal speech classification;
  
  (e) calculating a distance between the reference confidence value determined in step (c) and the confidence values determined in step (b) using the HMMs trained on training speech that includes classified impairments;
  
  (f) correlating the calculated distance with output from a speech quality test that is defined by the International Telecommunication Union (ITU) P.563 algorithm; and
  
  (g) applying a corrective action to the evaluated speech when the abnormal speech classification is present, wherein one or more of steps (a), (b), (c), (d), (e), (f), and (g) are performed using the electronic processor, and at least some data relating to the HMMs, the speech quality test, or the corrective action is stored in the database.
- View Dependent Claims (9)
- - 9. The method of claim 8, wherein the abnormal speech classification includes an improper pause classification, an abnormal speaking rate classification, a poor enunciation classification, or an abnormal intonation classification.
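
Steps (a) through (e) of claim 8 can be illustrated with a toy discrete HMM scorer. The sketch below is an assumption-heavy illustration rather than the patented method: the "confidence value" is taken to be the log-likelihood from the forward algorithm, speech is reduced to two frame symbols (0 = voiced, 1 = silent), all model parameters are invented for the example, only one impairment model is included although the claim recites a plurality, and the rule "an impairment model that scores higher than the reference indicates abnormal speech" is a guess at step (d). Step (f), the correlation with P.563 output, is not shown.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm for a discrete HMM: returns log P(obs | model).

    Assumes the observation sequence has non-zero probability under the model.
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    total = sum(alpha)
    loglik = math.log(total)
    alpha = [a / total for a in alpha]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][symbol]
            for j in range(n)
        ]
        total = sum(alpha)          # rescale each step to avoid underflow
        loglik += math.log(total)
        alpha = [a / total for a in alpha]
    return loglik


# Invented toy parameters: state 0 mostly emits voiced frames, state 1 silence.
EMIT = [[0.95, 0.05], [0.10, 0.90]]
# Reference model: silence is short-lived.
REFERENCE = ([0.9, 0.1], [[0.9, 0.1], [0.5, 0.5]], EMIT)
# Impairment model: once silence starts, it tends to persist.
IMPAIRMENT_MODELS = {
    "improper_pause": ([0.5, 0.5], [[0.7, 0.3], [0.1, 0.9]], EMIT),
}


def assess(obs):
    """Return (reference confidence, impairment confidences, flags, distances)."""
    ref_conf = log_likelihood(obs, *REFERENCE)                    # step (c)
    confs = {name: log_likelihood(obs, *model)                    # steps (a), (b)
             for name, model in IMPAIRMENT_MODELS.items()}
    abnormal = [n for n, c in confs.items() if c > ref_conf]      # step (d), assumed rule
    distances = {n: abs(ref_conf - c) for n, c in confs.items()}  # step (e)
    return ref_conf, confs, abnormal, distances
```

With these toy parameters, a sequence containing a long run of silent frames, such as `[0, 0, 1, 1, 1, 1, 1, 1, 0, 0]`, is scored higher by the `improper_pause` model than by the reference model, while a sequence with a single silent frame is not. A flagged class would then drive the corrective action of step (g).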

10. A method of regulating the quality of speech output by a text-to-speech (TTS) system having an electronic processor and a database, comprising the steps of:
- (a) training a Hidden Markov Model (HMM) on speech converted from text before presentation to a user;
  
  (b) comparing the HMM to a reference HMM that has been trained on human speech;
  
  (c) determining the distance between the HMM trained on speech converted from text and the reference HMM;
  
  (d) comparing the distance to a threshold;
  
  (e) correlating the distance between the HMM trained on speech converted from text and the reference HMM with an output from an initial speech quality test; and
  
  (f) applying a corrective action to the speech converted from text when the distance exceeds the threshold, wherein one or more of steps (a), (b), (c), (d), (e), and (f) are performed using the electronic processor, and at least some data relating to the initial speech quality test, the classification test, or the corrective action is stored in the database.
- View Dependent Claims (11, 12, 13)
- - 11. The method of claim 10, wherein the reference HMM is trained on text-independent speech.
  - 12. The method of claim 10, wherein the reference HMM is trained on gender-independent speech.
  - 13. The method of claim 10, wherein the initial speech quality test is defined by the International Telecommunication Union (ITU) P.563 algorithm.
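
Claim 10 compares two models directly instead of scoring audio against fixed models. The claims do not say how the distance between HMMs is computed, so the sketch below swaps in a deliberately simple stand-in: the Euclidean distance between the flattened transition and emission matrices of two already-trained models of the same shape. The threshold value, the function names, and the use of Pearson correlation for step (e) are likewise assumptions made for illustration. Training the HMMs (step (a)) is out of scope here.

```python
import math

# Hypothetical threshold; the claims do not give a value.
DISTANCE_THRESHOLD = 0.5


def parameter_distance(model_a, model_b):
    """Euclidean distance between two models given as (trans, emit) matrices.

    A simple stand-in for the unspecified HMM distance of steps (b) and (c).
    """
    flat_a = [p for matrix in model_a for row in matrix for p in row]
    flat_b = [p for matrix in model_b for row in matrix for p in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("models must have the same shape")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(flat_a, flat_b)))


def needs_correction(tts_model, reference_model, threshold=DISTANCE_THRESHOLD):
    """Steps (d) and (f): correct when the distance exceeds the threshold."""
    return parameter_distance(tts_model, reference_model) > threshold


def pearson(xs, ys):
    """Pearson correlation, used here for step (e).

    Undefined (division by zero) when either input is constant.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

In use, `pearson` would be fed per-utterance distances alongside the corresponding scores from the initial speech quality test. A strongly negative coefficient would suggest that the model distance tracks perceived quality, which is one plausible reading of why the claim correlates the two.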

Current Assignee: GM Global Technology Operations LLC (General Motors Company)
Original Assignee: GM Global Technology Operations LLC (General Motors Company)
Inventors: Talwar, Gaurav; Pennock, Scott M.; Grost, Timothy J.
Primary Examiner(s): Baker, Charlotte M
Application Number: US15/077,163
Publication Number: US 20170278506A1
Time in Patent Office: 658 Days
Field of Search: 704235, 704244, 704251, 704254, 704260, 704E15002, 704E15005, 379 102, 370352, 714712, 714776
CPC Class Codes:
G10L 13/04   Details of speech synthesis...

G10L 15/144   Training of HMMs

G10L 25/27   characterised by the analys...

G10L 25/69   for evaluating synthetic or...
