Speech synthesis apparatus and method

US 7,062,439 B2
Filed: 08/11/2003
Issued: 06/13/2006
Est. Priority Date: 06/04/2001
Status: Expired due to Fees

First Claim

1. Speech synthesis apparatus comprising:

a language generator arranged to be responsive to semantic input information indicative of at least the content of a desired speech output, to generate a corresponding text-form utterance;

a text-to-speech converter for converting text-form utterances received from the language generator into speech form; and

an assessment arrangement for assessing overall quality of the speech form produced by the text-to-speech converter from an input text-form utterance whereby to selectively produce an inadequacy indicator in response to the assessment arrangement determining that the current speech form is of inadequate overall quality, the language generator being arranged to respond to the assessment arrangement producing one of said inadequacy indications, to generate from the same said semantic input information, and without corrective input from the assessment arrangement, a new but differently worded version of the text-form utterance concerned.

Abstract

A speech synthesizer has a language generator for generating a text-form utterance from input semantic information and a text-to-speech converter for converting the text-form utterance into speech form. The overall quality of the speech-form utterance produced by the text-to-speech converter is assessed and, if judged inadequate, the language generator is triggered to produce a new version of the text-form utterance. The assessment of the overall quality of the speech-form utterance is preferably effected by a classifier fed with feature values generated during the conversion process operated by the text-to-speech converter.
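
The loop described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the names `generate_text`, `text_to_speech`, `assess` and `synthesize`, the canned wordings, the stub quality score, the threshold, and the `max_attempts` cap are all hypothetical. The regeneration step receives only the same semantic input and an attempt counter, never feedback from the assessment.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical wordings per semantic input; a real language generator
# would derive these from a grammar or templates rather than a table.
WORDINGS = {
    "greet_user": ["A very good day to you.", "Hello there.", "Welcome."],
}


@dataclass
class SpeechForm:
    text: str
    quality: float  # stub score in (0, 1]; assumed to come from the TTS stage


def generate_text(semantic_input: str, attempt: int) -> Optional[str]:
    """Return the attempt-th wording for the same semantic input, if any."""
    options = WORDINGS.get(semantic_input, [])
    return options[attempt] if attempt < len(options) else None


def text_to_speech(text: str) -> SpeechForm:
    """Stub converter: pretends shorter utterances synthesize more cleanly."""
    return SpeechForm(text=text, quality=1.0 / len(text.split()))


def assess(speech: SpeechForm, threshold: float = 0.4) -> bool:
    """Return True (the inadequacy indicator) when quality is below threshold."""
    return speech.quality < threshold


def synthesize(semantic_input: str, max_attempts: int = 3) -> Optional[SpeechForm]:
    """Generate, convert, assess; reword from the same input if inadequate."""
    last = None
    for attempt in range(max_attempts):
        text = generate_text(semantic_input, attempt)
        if text is None:
            break  # no further wordings available
        last = text_to_speech(text)
        if not assess(last):
            return last  # adequate: release this speech form
    # Falling back to the latest attempt is an assumption of this sketch.
    return last
```

Here the first wording scores below the threshold, so the generator is asked again and the second wording is the one released.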

9 Claims

1. Speech synthesis apparatus comprising:
- a language generator arranged to be responsive to semantic input information indicative of at least the content of a desired speech output, to generate a corresponding text-form utterance;
  
  a text-to-speech converter for converting text-form utterances received from the language generator into speech form; and
  
  an assessment arrangement for assessing overall quality of the speech form produced by the text-to-speech converter from an input text-form utterance whereby to selectively produce an inadequacy indicator in response to the assessment arrangement determining that the current speech form is of inadequate overall quality, the language generator being arranged to respond to the assessment arrangement producing one of said inadequacy indications, to generate from the same said semantic input information, and without corrective input from the assessment arrangement, a new but differently worded version of the text-form utterance concerned.
  - 2. Apparatus according to claim 1, wherein the text-to-speech converter is arranged to generate, in the course of converting a text-form utterance into speech form, values of predetermined features that are indicative of the overall quality of the speech form of the utterance, the assessment arrangement comprising:
    - a classifier arranged to be responsive to the feature values generated by the text-to-speech converter to provide a confidence measure of the speech form of the utterance concerned; and
      
      a comparator for comparing confidence measures produced by the classifier against one or more stored threshold values, in order to determine whether to produce said inadequacy indicator.
  - 3. Apparatus according to claim 1, wherein the text-to-speech converter includes a concatenative speech generator which in generating a speech-form utterance, is arranged to produce an accumulated unit selection cost in respect of the speech units used to make up the speech-form utterance, the assessment arrangement comprising a comparator for comparing the selection cost produced by the speech generator against one or more stored threshold values, in order to determine whether to produce said inadequacy indicator.
  - 4. Apparatus according to claim 1, further comprising an output buffer for temporarily storing the latest speech-form utterance generated by the text-to-speech converter, the assessment arrangement releasing this speech-form utterance for output upon determining that a new version is not required.
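
Claims 2 and 3 describe two ways the assessment arrangement might reach its decision. A rough sketch of both follows, under stated assumptions: the logistic scoring, the feature names, the weights, the single threshold per comparator, and the direction of each comparison are hypothetical and are not taken from the claims, which do not specify a classifier type or threshold values.

```python
import math
from typing import Mapping, Sequence

# Hypothetical weights for feature values emitted during TTS conversion.
WEIGHTS = {"unknown_words": -1.5, "pitch_discontinuities": -0.8}
BIAS = 2.0


def confidence(features: Mapping[str, float]) -> float:
    """Toy classifier: logistic score in (0, 1) from TTS feature values."""
    z = BIAS + sum(WEIGHTS.get(name, 0.0) * value
                   for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def inadequate_by_confidence(features: Mapping[str, float],
                             threshold: float = 0.5) -> bool:
    """Claim 2 style comparator: flag when confidence is below the threshold."""
    return confidence(features) < threshold


def accumulated_cost(unit_costs: Sequence[float]) -> float:
    """Sum the per-unit selection costs reported by a concatenative generator."""
    return sum(unit_costs)


def inadequate_by_cost(unit_costs: Sequence[float],
                       threshold: float = 10.0) -> bool:
    """Claim 3 style comparator: flag when accumulated cost exceeds the threshold."""
    return accumulated_cost(unit_costs) > threshold
```

Either function returning `True` plays the role of the inadequacy indicator; neither passes anything back to the language generator beyond that signal.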

5. A method of generating speech output comprising the steps of:
- (a) in response to semantic input information indicative of at least the content of a desired speech output, generating a corresponding text-form utterance;
  
  (b) converting the text-form utterances generated in step (a) into speech form;
  
  (c) assessing overall quality of the speech form produced in step (b) and selectively producing an inadequacy indicator when the current speech form is assessed as of inadequate overall quality; and
  
  (d) upon an inadequacy indicator being produced in step (c), generating from the same said semantic input information, and without corrective input from the assessment in step (c) a new but differently worded version of the text-form utterance that gave rise to the inadequacy indicator.
  - 6. A method according to claim 5, wherein in step (b), in the course of converting a text-form utterance into speech form, values of predetermined features are generated that are indicative of the overall quality of the speech form of the utterance, the assessment carried out in step (c) including:
    - using a classifier responsive to said values of predetermined features to provide a confidence measure of the speech form of the utterance concerned; and
      
      comparing confidence measures produced by the classifier against one or more stored threshold values, in order to determine whether to produce said inadequacy indicator.
  - 7. A method according to claim 5, wherein step (b) is effected using a concatenative speech generator which in generating a speech-form utterance, produces an accumulated unit selection cost in respect of the speech units used to make up the speech-form utterance;
    - step (c) including comparing this selection cost against one or more stored threshold values, in order to determine whether to produce said inadequacy indicator.
  - 8. A method according to claim 5, further including temporarily storing the latest speech-form utterance generated in step (b) and only releasing this speech-form utterance for output upon the assessment of this speech-form utterance in step (c) not resulting in the production of an inadequacy indicator.
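
The buffering described in claims 4 and 8 can be illustrated with a small holding class. This is a sketch only: `OutputBuffer`, `store` and `release_if_adequate` are hypothetical names, speech forms are represented as byte strings for simplicity, and the `released` list merely stands in for an audio output.

```python
from typing import List, Optional


class OutputBuffer:
    """Holds the latest speech form until the assessment clears it."""

    def __init__(self) -> None:
        self._latest: Optional[bytes] = None
        self.released: List[bytes] = []  # stand-in for the audio output

    def store(self, speech_form: bytes) -> None:
        """Keep only the most recent speech form, replacing any earlier one."""
        self._latest = speech_form

    def release_if_adequate(self, inadequacy_indicator: bool) -> bool:
        """Release the buffered speech form only when no indicator was produced."""
        if inadequacy_indicator or self._latest is None:
            return False
        self.released.append(self._latest)
        self._latest = None
        return True
```

A speech form that triggers the indicator is never output; it is simply overwritten when the reworded version arrives.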

9. Speech synthesis apparatus comprising:
- a language generator arranged to generate, from semantic input information indicative of at least the content of a desired speech output, a corresponding text-form utterance;
  
  a text-to-speech converter for converting said text-form utterance into speech form; and
  
  an assessment arrangement for assessing overall quality of said speech form whereby to selectively produce an inadequacy indicator when the current speech form is assessed as being of inadequate overall quality, the language generator being arranged to respond to the production of said inadequacy indication, to generate from the same said semantic input information, and without corrective input from the assessment arrangement, a new but differently worded version of the text-form utterance concerned.

Current Assignee: Hewlett-Packard Development Company, L.P. (HP Inc.)
Original Assignee: Hewlett-Packard Development Company, L.P. (HP Inc.)
Inventors: Tucker, Roger Cecil Ferry; Brittan, Paul St John
Primary Examiner(s): Knepper, David D.
Assistant Examiner(s): Han, Qh
Application Number: US10/638,078
Publication Number: US 20040049375A1
Time in Patent Office: 1,037 Days
Field of Search: 704/260, 704/258, 704/220
US Class Current: 704/260
CPC Class Codes:
G10L 13/027   Concept to speech synthesis...
G10L 13/07   Concatenation rules
G10L 13/08   Text analysis or generation...
