CORRECTING UNINTELLIGIBLE SYNTHESIZED SPEECH

US 20130080173A1
Filed: 09/27/2011
Published: 03/28/2013
Est. Priority Date: 09/27/2011
Status: Active Grant

Abstract

A method and system of speech synthesis. A text input is received in a text-to-speech system and, using a processor of the system, the text input is processed into synthesized speech which is established as unintelligible. The text input is reprocessed into subsequent synthesized speech and output to a user via a loudspeaker to correct the unintelligible synthesized speech. In one embodiment, the synthesized speech can be established as unintelligible by predicting intelligibility of the synthesized speech, and determining that the predicted intelligibility is lower than a minimum threshold. In another embodiment, the synthesized speech can be established as unintelligible by outputting the synthesized speech to the user via the loudspeaker, and receiving an indication from the user that the synthesized speech is not intelligible.

20 Claims

1. A method of speech synthesis, comprising the steps of:
- (a) receiving a text input in a text-to-speech system;
  
  (b) processing the text input into synthesized speech using a processor of the system;
  
  (c) establishing that the synthesized speech is unintelligible;
  
  (d) reprocessing the text input into subsequent synthesized speech to correct the unintelligible synthesized speech; and
  
  (e) outputting the subsequent synthesized speech to a user via a loudspeaker.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
  - 2. The method of claim 1 wherein step (c) includes:
    - (c1) predicting intelligibility of the synthesized speech; and
      
      (c2) determining that the predicted intelligibility from step (c1) is lower than a minimum threshold.
  - 3. The method of claim 2 further comprising, between steps (c) and (d):
    - (f) adapting a model used in conjunction with step (d).
  - 4. The method of claim 3 further comprising, after step (e):
    - (g) predicting intelligibility of the subsequent synthesized speech;
      
      (h) determining whether the predicted intelligibility from step (g) is lower than the minimum threshold;
      
      (i) outputting the subsequent synthesized speech to the user via the loudspeaker if the predicted intelligibility is determined to be not lower than the minimum threshold in step (h); and, otherwise,
      
      (j) repeating steps (f) through (j).
  - 5. The method of claim 1 wherein step (c) includes:
    - (c1) outputting the synthesized speech to the user via the loudspeaker; and
      
      (c2) receiving an indication from the user that the synthesized speech is not intelligible.
  - 6. The method of claim 5 wherein in step (d) the subsequent synthesized speech is simpler than the synthesized speech.
  - 7. The method of claim 5 wherein in step (d) the subsequent synthesized speech is slower than the synthesized speech.
  - 8. The method of claim 5 further comprising identifying a communication ability of the user, wherein in step (d) the subsequent synthesized speech is produced based on the identified communication ability.
  - 9. The method of claim 8 wherein in step (d) the subsequent synthesized speech is slower than the synthesized speech.
  - 10. The method of claim 9 wherein in step (d) the subsequent synthesized speech is simpler than the synthesized speech.

11. A method of speech synthesis, comprising the steps of:
- (a) receiving a text input in a text-to-speech system;
  
  (b) processing the text input into synthesized speech using a processor of the system;
  
  (c) predicting intelligibility of the synthesized speech;
  
  (d) determining whether the predicted intelligibility from step (c) is lower than a minimum threshold;
  
  (e) outputting the synthesized speech to a user via a loudspeaker if the predicted intelligibility is determined to be not lower than the minimum threshold in step (d);
  
  (f) adapting a model used in conjunction with processing the text input if the predicted intelligibility is determined to be lower than the minimum threshold in step (d);
  
  (g) reprocessing the text input into subsequent synthesized speech;
  
  (h) predicting intelligibility of the subsequent synthesized speech;
  
  (i) determining whether the predicted intelligibility from step (h) is lower than the minimum threshold;
  
  (j) outputting the subsequent synthesized speech to the user via the loudspeaker if the predicted intelligibility is determined to be not lower than the minimum threshold in step (i); and, otherwise,
  
  (k) repeating steps (f) through (k).
- Dependent claims (12, 13, 14, 15, 16)
  - 12. The method of claim 11, wherein the model in step (f) is a Hidden Markov Model that is adapted using a Maximum Likelihood Linear Regression algorithm.
  - 13. The method of claim 11 wherein the predicting intelligibility step includes calculating a speech intelligibility score including a sum of weighted prosodic attributes.
  - 14. The method of claim 13 wherein the weighted prosodic attributes include at least two of intonation, speaking rate, spectral energy, pitch, or stress.
  - 15. The method of claim 13 wherein the adapted model is based on at least one of an articulation index, a speech transmission index, or a speech interference level.
  - 16. The method of claim 11 wherein the adapted model is based on at least one of an articulation index, a speech transmission index, or speech interference level.
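
The predict, compare, adapt, and repeat loop of claim 11, together with the weighted-sum score of claims 13 and 14, can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the weight values, the normalization of attributes to the range 0 to 1, the `max_rounds` guard, and the `toy_synthesize` and `toy_adapt` stand-ins are all hypothetical. A real system would adapt an actual synthesis model (claim 12 names a Hidden Markov Model adapted with Maximum Likelihood Linear Regression), which is not attempted here.

```python
from typing import Callable, Dict, Tuple

# Prosodic attribute name -> value, assumed normalized to [0, 1].
Attributes = Dict[str, float]

# Hypothetical weights: the claims name the attributes but not their weights.
WEIGHTS: Attributes = {
    "intonation": 0.2,
    "speaking_rate": 0.3,
    "spectral_energy": 0.2,
    "pitch": 0.1,
    "stress": 0.2,
}


def intelligibility_score(attrs: Attributes, weights: Attributes = WEIGHTS) -> float:
    """Sum of weighted prosodic attributes, in the spirit of claim 13."""
    return sum(w * attrs[name] for name, w in weights.items())


def synthesize_with_correction(
    text: str,
    synthesize: Callable[[str, dict], Attributes],
    adapt: Callable[[dict], dict],
    model: dict,
    threshold: float,
    max_rounds: int = 10,
) -> Tuple[Attributes, float, int]:
    """Follow steps (b) through (k) of claim 11, with an added round limit.

    Speech is represented only by its prosodic attributes. Returns the
    accepted speech, its score, and the number of adaptation rounds.
    """
    speech = synthesize(text, model)            # step (b)
    score = intelligibility_score(speech)       # step (c)
    rounds = 0
    while score < threshold:                    # steps (d) and (i)
        if rounds >= max_rounds:                # guard not present in the claims
            raise RuntimeError("no intelligible output within max_rounds")
        model = adapt(model)                    # step (f)
        speech = synthesize(text, model)        # step (g)
        score = intelligibility_score(speech)   # step (h)
        rounds += 1
    return speech, score, rounds                # caller outputs it: steps (e) and (j)


# Toy stand-ins (hypothetical) so the sketch can be run end to end.
def toy_synthesize(text: str, model: dict) -> Attributes:
    # Every attribute simply equals the model's single "clarity" value.
    return {name: model["clarity"] for name in WEIGHTS}


def toy_adapt(model: dict) -> dict:
    return {"clarity": model["clarity"] + 0.25}


speech, score, rounds = synthesize_with_correction(
    "Turn left ahead", toy_synthesize, toy_adapt, {"clarity": 0.25}, threshold=0.7
)
```

The round limit is a design choice added for safety: as claimed, steps (f) through (k) repeat until the threshold is met, which would never terminate if adaptation stopped improving the score.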

17. A method of speech synthesis, comprising the steps of:
- (a) receiving a text input in a text-to-speech system;
  
  (b) processing the text input into synthesized speech using a processor of the system;
  
  (c1) outputting the synthesized speech to the user via the loudspeaker;
  
  (c2) receiving an indication from the user that the synthesized speech is not intelligible;
  
  (d) reprocessing the text input into subsequent synthesized speech to correct the unintelligible synthesized speech; and
  
  (e) outputting the subsequent synthesized speech to a user via a loudspeaker.
- Dependent claims (18, 19, 20)
  - 18. The method of claim 17 further comprising identifying a communication ability of the user, wherein in step (d) the subsequent synthesized speech is produced based on the identified communication ability.
  - 19. The method of claim 17 wherein in step (d) the subsequent synthesized speech is simpler than the synthesized speech.
  - 20. The method of claim 17 wherein in step (d) the subsequent synthesized speech is slower than the synthesized speech.
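
The user-feedback path of claim 17, with the slower and simpler variants of claims 19 and 20, might look roughly like the sketch below. All names here (`SynthesisSettings`, `corrected_settings`, `speak_with_feedback`, the `slow_factor` value of 0.8) are hypothetical and chosen only for illustration. The `synthesize`, `play`, and `user_reports_unintelligible` callbacks stand in for a real text-to-speech engine, loudspeaker, and user interface. The communication-ability adjustment of claim 18 is not modeled.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class SynthesisSettings:
    rate: float = 1.0         # relative speaking rate; 1.0 is the default pace
    simplified: bool = False  # whether simpler output is requested


def corrected_settings(current: SynthesisSettings, slow_factor: float = 0.8) -> SynthesisSettings:
    """Return settings for slower (claim 20) and simpler (claim 19) speech."""
    return replace(current, rate=current.rate * slow_factor, simplified=True)


def speak_with_feedback(
    text: str,
    synthesize: Callable[[str, SynthesisSettings], str],
    play: Callable[[str], None],
    user_reports_unintelligible: Callable[[], bool],
    settings: SynthesisSettings = SynthesisSettings(),
) -> SynthesisSettings:
    """Follow steps (b) through (e) of claim 17 and return the settings last used."""
    play(synthesize(text, settings))          # steps (b) and (c1)
    if user_reports_unintelligible():         # step (c2)
        settings = corrected_settings(settings)
        play(synthesize(text, settings))      # steps (d) and (e)
    return settings


# Toy stand-ins (hypothetical): "audio" is a string and the loudspeaker is a list.
played: List[str] = []
final = speak_with_feedback(
    "Turn left ahead",
    synthesize=lambda text, s: f"{text} [rate={s.rate}, simplified={s.simplified}]",
    play=played.append,
    user_reports_unintelligible=lambda: True,
)
```

In this run the toy user always reports a problem, so `played` holds two utterances and `final` carries the reduced rate.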

Current Assignee: General Motors LLC (General Motors Company)
Original Assignee: General Motors LLC (General Motors Company)
Inventors: Talwar, Gaurav; Chengalvarayan, Rathinavelu
Granted Patent: US 9,082,414 B2
US Class Current: 704/260
CPC Class Codes:
G10L 13/033 Voice editing, e.g. manipul...
G10L 25/69 for evaluating synthetic or...
