Realtime assessment of TTS quality using single ended audio quality measurement
Abstract
A system and method of regulating speech output by a text-to-speech (TTS) system includes: evaluating speech that has been converted from text using an initial speech quality test before presentation to a user; applying a classification test to the evaluated speech if the evaluated speech falls below a threshold based on the initial speech quality test; generating an abnormal speech classification for the evaluated speech; and applying a corrective action to the evaluated speech based on the abnormal speech classification.
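The four steps in the abstract can be read as a gate placed in front of playback. The sketch below is a minimal illustration under stated assumptions, not the patented implementation: `quality_test`, `classify`, and the entries of `actions` are hypothetical callables supplied by the caller, the score is assumed to be MOS-like (higher is better), and a plain list stands in for the database.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Audio = Sequence[float]  # hypothetical: PCM samples of the synthesized speech


def regulate_tts(
    audio: Audio,
    quality_test: Callable[[Audio], float],
    classify: Callable[[Audio], str],
    actions: Dict[str, Callable[[Audio], Audio]],
    threshold: float,
    log: List[Tuple[float, str]],
) -> Audio:
    """Gate synthesized speech before it is presented to the user."""
    score = quality_test(audio)          # initial speech quality test
    if score >= threshold:               # not below the threshold: pass through
        log.append((score, "ok"))
        return audio
    label = classify(audio)              # abnormal speech classification
    log.append((score, label))           # stand-in for the database record
    fix = actions.get(label)
    return fix(audio) if fix else audio  # corrective action, if one is known


# Usage with stub tests: a low score triggers classification and a clamp.
log: List[Tuple[float, str]] = []
out = regulate_tts(
    [0.0, 2.0, -2.0],
    quality_test=lambda a: 2.1,
    classify=lambda a: "clipping",
    actions={"clipping": lambda a: [max(-1.0, min(1.0, s)) for s in a]},
    threshold=3.0,
    log=log,
)
```

Keeping the classifier behind the threshold check means the more expensive classification test only runs on speech that the initial test has already flagged.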
13 Claims
1. A method of regulating speech output by a text-to-speech (TTS) system having an electronic processor and a database, comprising the steps of:
(a) evaluating speech that has been converted from text using an initial speech quality test before presentation to a user;
(b) applying a classification test to the evaluated speech when the evaluated speech falls below a threshold based on the initial speech quality test;
(c) generating an abnormal speech classification for the evaluated speech; and
(d) applying a corrective action to the evaluated speech based on the abnormal speech classification in step (c), wherein one or more of steps (a), (b), (c), and (d) are performed using the electronic processor, and at least some data relating to the initial speech quality test, the classification test, or the corrective action is stored in the database.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A method of regulating the quality of speech output by a text-to-speech (TTS) system having an electronic processor and a database, comprising the steps of:
(a) applying a plurality of Hidden Markov Models (HMMs) to speech converted from text before presentation to a user, wherein the HMMs have each been trained using training speech that includes a different type of speech deficiency;
(b) determining a confidence value for the speech using each of the plurality of HMMs;
(c) generating a reference confidence value for the speech converted from text using an HMM trained using live reference speech;
(d) determining whether any of the confidence values determined in step (b) indicate an abnormal speech classification;
(e) calculating a distance between the reference confidence value determined in step (c) and the confidence values determined in step (b) using the HMMs trained on training speech that includes classified impairments;
(f) correlating the calculated distance with output from a speech quality test that is defined by the International Telecommunication Union (ITU) P.563 algorithm; and
(g) applying a corrective action to the evaluated speech when the abnormal speech classification is present, wherein one or more of steps (a), (b), (c), (d), (e), (f), and (g) are performed using the electronic processor, and at least some data relating to the HMMs, the speech quality test, or the corrective action is stored in the database.
Dependent claims: 9
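Steps (d) through (f) of claim 8 operate on numbers that the HMMs have already produced, so they can be sketched without any acoustic modeling. The sketch assumes each confidence value is a log-likelihood (higher means a better match), flags a deficiency when its model scores at least as well as the live-speech reference (an assumed decision rule that the claim does not spell out), and uses Pearson correlation against P.563 scores as one possible reading of "correlating". All function names are hypothetical, and the P.563 scores are treated as given inputs.

```python
import math
from typing import Dict, List, Sequence


def abnormal_labels(confidences: Dict[str, float], reference: float) -> List[str]:
    """Deficiency types whose model matches at least as well as the reference."""
    return [label for label, c in confidences.items() if c >= reference]


def confidence_distances(confidences: Dict[str, float], reference: float) -> Dict[str, float]:
    """Absolute gap between the reference confidence and each deficiency confidence."""
    return {label: abs(reference - c) for label, c in confidences.items()}


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; assumes equal lengths and non-constant inputs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

In use, `confidence_distances` would be evaluated per utterance, and `pearson` would then compare one distance series against the matching series of P.563 scores collected over the same utterances.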
10. A method of regulating the quality of speech output by a text-to-speech (TTS) system having an electronic processor and a database, comprising the steps of:
(a) training a Hidden Markov Model (HMM) on speech converted from text before presentation to a user;
(b) comparing the HMM to a reference HMM that has been trained on human speech;
(c) determining the distance between the HMM trained on speech converted from text and the reference HMM;
(d) comparing the distance to a threshold;
(e) correlating the distance between the HMM trained on speech converted from text and the reference HMM with an output from an initial speech quality test; and
(f) applying a corrective action to the speech converted from text when the distance exceeds the threshold, wherein one or more of steps (a), (b), (c), (d), (e), and (f) are performed using the electronic processor, and at least some data relating to the initial speech quality test, the classification test, or the corrective action is stored in the database.
Dependent claims: 11, 12, 13
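Steps (c), (d), and (f) of claim 10 reduce to computing a distance between two models and comparing it with a threshold. The claim does not say which distance is used; the sketch below assumes a per-frame log-likelihood gap on a shared observation sequence, computed with the scaled forward algorithm for discrete-observation HMMs. Training in step (a) is omitted, and the model tuple layout, `model_distance`, and `needs_correction` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical model layout: (start probabilities, transition matrix, emission matrix)
HMM = Tuple[List[float], List[List[float]], List[List[float]]]


def log_likelihood(obs: Sequence[int], model: HMM) -> float:
    """log P(obs | model) via the scaled forward algorithm."""
    start, trans, emit = model
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    total = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transitions, then weight by emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)  # assumed nonzero: obs must be possible under the model
        total += math.log(scale)
        alpha = [a / scale for a in alpha]
    return total


def model_distance(obs: Sequence[int], tts_model: HMM, ref_model: HMM) -> float:
    """Assumed distance: absolute per-frame log-likelihood gap on the same frames."""
    gap = log_likelihood(obs, tts_model) - log_likelihood(obs, ref_model)
    return abs(gap) / len(obs)


def needs_correction(distance: float, threshold: float) -> bool:
    """Step (f): corrective action applies only when the distance exceeds the threshold."""
    return distance > threshold
```

Dividing by the number of frames keeps the distance comparable across utterances of different length, which matters if a single threshold is applied to all of them.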
Specification