Generating objectively evaluated sufficiently natural synthetic speech from text by using selective paraphrases

  • US 8,015,011 B2
  • Filed: 01/30/2008
  • Issued: 09/06/2011
  • Est. Priority Date: 01/30/2007
  • Status: Active Grant
First Claim

1. A system for generating synthetic speech, comprising:

    a phoneme segment storage section operable to store a plurality of phoneme segment data pieces indicating a plurality of sounds of phonemes which are different from each other; and

    a synthesis section operable to generate voice data representing synthetic speech of text by receiving an inputted text, reading out phoneme segment data pieces that correspond to respective phonemes indicating the pronunciation of the inputted text, and connecting the read-out phoneme segment data pieces to each other;

    a computing section operable to compute a score indicating naturalness of the synthetic speech of the text, on the basis of the voice data;

    a paraphrase storage section operable to store a plurality of notations each comprising a word or phrase, the plurality of notations comprising a plurality of first notations and a plurality of second notations, each second notation being a paraphrase of a respective first notation;

    a replacement section operable to search the text for a notation matching any of the first notations and to replace a matching notation with the second notation corresponding to the first notation; and

    a judgment section operable to receive the score computed by the computing section and determine whether the score indicates the synthetic speech is sufficiently natural, and:

    if the score indicates the synthetic speech is sufficiently natural, output the generated voice data; and

    if the score indicates the synthetic speech is not sufficiently natural, cause the replacement section to generate revised text by replacing at least one other notation in the inputted text matching a first notation with a corresponding second notation, and cause the synthesis section to generate voice data for the revised text.
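The control flow recited in claim 1 (synthesize, score, judge, paraphrase, resynthesize) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the names `SEGMENTS`, `toy_synthesize`, `toy_score`, and `generate_natural_speech` are hypothetical, each "phoneme segment" is a single integer per letter, and the naturalness score simply penalizes large jumps between adjacent samples. The claim also does not say what happens when the paraphrases run out, so the sketch returns the last attempt.

```python
from typing import Callable, Dict, List, Tuple

Voice = List[int]

# Hypothetical phoneme segment store: one single-sample "segment" per letter.
SEGMENTS: Dict[str, Voice] = {ch: [ord(ch)] for ch in "abcdefghijklmnopqrstuvwxyz"}


def toy_synthesize(text: str) -> Voice:
    """Concatenate the stored segment for each letter; other characters are skipped."""
    voice: Voice = []
    for ch in text.lower():
        voice.extend(SEGMENTS.get(ch, []))
    return voice


def toy_score(voice: Voice) -> float:
    """Toy naturalness in [0, 1]: fraction of joins without a large jump."""
    if len(voice) < 2:
        return 1.0
    jumps = sum(1 for a, b in zip(voice, voice[1:]) if abs(a - b) > 15)
    return 1.0 - jumps / (len(voice) - 1)


def generate_natural_speech(
    text: str,
    paraphrases: Dict[str, str],
    synthesize: Callable[[str], Voice],
    score: Callable[[Voice], float],
    threshold: float,
) -> Tuple[str, Voice]:
    """Resynthesize with one more paraphrase each round until the score passes."""
    voice = synthesize(text)
    for first, second in paraphrases.items():
        if score(voice) >= threshold:
            break  # judged sufficiently natural: output this voice data
        if first in text:
            # Replace one matching first notation with its second notation.
            text = text.replace(first, second, 1)
            voice = synthesize(text)
    # Assumption: if no paraphrase helps, the last attempt is returned as-is.
    return text, voice


text, voice = generate_natural_speech(
    "buy now", {"buy": "get"}, toy_synthesize, toy_score, threshold=0.9
)
print(text)  # get now
```

In this toy run, "buy now" scores 0.8 because of the jump between "b" and "u", so the judgment step triggers the replacement of "buy" with "get", and the revised text scores 1.0. A real system would compute the score from acoustic properties of the concatenated voice data rather than from character codes.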
