System and method for synthesizing human speech using multiple speakers and context

  • US 9,368,104 B2
  • Filed: 03/15/2013
  • Issued: 06/14/2016
  • Est. Priority Date: 04/30/2012
  • Status: Active Grant
First Claim

1. A method of synthesizing speech from text, comprising the steps of:

  • receiving text from which speech will be synthesized;

  • selecting, based on the received text, one or more scenario parameters, wherein the one or more scenario parameters are selected from the group consisting of language, dialect, accent, phonetic reduction, domain, context, and speaker number;

  • identifying text metadata within the received text, wherein the text metadata comprises elements other than words within the text;

  • parsing the received text, other than the identified text metadata, into a plurality of corresponding phonetic components;

  • merging said plurality of phonetic components with breathing and non-speech effects to produce a transcript of phoneme segment strings corresponding to the received text;

  • producing prosody contour data from said one or more selected scenario parameters and said transcript of phoneme segment strings;

  • producing stitched filter data from said one or more selected scenario parameters and said transcript of phoneme segment strings;

  • synthesizing speech from said stitched filter data and said prosody contour data; and

  • outputting said synthesized speech from a playback device.
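
The front-end steps of the claim (selecting scenario parameters, identifying text metadata, parsing words into phonetic components, and merging them with breathing and non-speech effects) can be illustrated with a minimal sketch. This is not the patented implementation. Every name here is hypothetical: the toy lexicon, the ARPAbet-style symbols, the `<breath>` and `<pause>` markers, and the heuristics are all assumptions made for illustration. The sketch treats only punctuation as text metadata and stops at the transcript; it does not attempt prosody contour data, stitched filter data, or synthesis.

```python
import re
from dataclasses import dataclass

# Hypothetical toy lexicon; a real system would use a full pronunciation dictionary.
TOY_LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}

# A token is either a run of letters (a word) or one non-space, non-letter character.
TOKEN_RE = re.compile(r"[A-Za-z]+|[^\sA-Za-z]")


@dataclass
class Transcript:
    scenario: dict   # selected scenario parameters
    segments: list   # phoneme symbols interleaved with non-speech markers


def select_scenario(text):
    """Pick scenario parameters from the text (placeholder heuristic)."""
    # Assumption: pure-ASCII input is treated as English, anything else as unknown.
    language = "en" if text.isascii() else "unknown"
    return {"language": language, "speaker_number": 1}


def split_metadata(text):
    """Separate words from text metadata (in this sketch: punctuation only)."""
    words, metadata = [], []
    for position, token in enumerate(TOKEN_RE.findall(text)):
        if token[0].isalpha():
            words.append((position, token.lower()))
        else:
            metadata.append((position, token))
    return words, metadata


def to_phonemes(word):
    """Look a word up in the toy lexicon; otherwise spell it out letter by letter."""
    return TOY_LEXICON.get(word, [ch.upper() for ch in word])


def build_transcript(text):
    """Merge phonetic components with non-speech markers, in original text order."""
    scenario = select_scenario(text)
    words, metadata = split_metadata(text)
    events = [(pos, to_phonemes(word)) for pos, word in words]
    # Assumption: sentence-final marks become a breath, any other mark a short pause.
    events += [
        (pos, ["<breath>"] if mark in ".!?" else ["<pause>"])
        for pos, mark in metadata
    ]
    events.sort(key=lambda event: event[0])
    segments = [symbol for _, symbols in events for symbol in symbols]
    return Transcript(scenario, segments)


if __name__ == "__main__":
    print(build_transcript("Hello, world.").segments)
```

Keeping the token position alongside each word and each metadata element lets the two streams be processed separately and then merged back in their original order, which mirrors the claim's separation of parsing from merging.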
