
Method, apparatus and computer program providing a multi-speaker database for concatenative text-to-speech synthesis

  • US 7,716,052 B2
  • Filed: 04/07/2005
  • Issued: 05/11/2010
  • Est. Priority Date: 04/07/2005
  • Status: Expired due to Fees
First Claim

1. A method comprising:

  • receiving a text word; and

    in response to receiving the text word, concatenating, by a data processor coupled to a memory, pre-recorded speech segments that are derived from a plurality of speakers to form audio data configured to generate an audible speech word that corresponds to the text word,

    wherein concatenating the pre-recorded speech segments comprises selecting speech segments for concatenation based on at least one cost function,

    where the at least one cost function comprises a first cost function where a cost of a speech segment from a particular speaker of the plurality of speakers is based at least in part on a size of a dataset comprising pre-recorded speech segments from the particular speaker as compared to sizes of other datasets each comprising pre-recorded speech segments from other speakers in the plurality of speakers,

    where the first cost function assigns a first cost for a first speech segment from a first speaker of the plurality of speakers that is lower than a second cost for a second speech segment from a second speaker of the plurality of speakers,

    where a first size of pre-recorded speech segments in a first dataset from the first speaker is greater than a second size of pre-recorded speech segments in a second dataset from the second speaker.
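
The selection step recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Segment` type, the function names, and the specific formula (one minus the speaker's share of all stored segments) are assumptions chosen only so that a speaker with a larger dataset receives a lower cost, as the claim requires. The claim speaks of "at least one cost function"; only the dataset-size cost is modeled here, and a practical unit-selection system would typically combine it with other costs.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical pre-recorded speech segment (names are illustrative)."""
    unit: str      # label of the speech unit, e.g. a phone such as "HH"
    speaker: str   # identifier of the speaker who recorded it
    audio: bytes   # placeholder for the recorded samples


def dataset_sizes(database):
    """Count how many pre-recorded segments each speaker contributed."""
    return Counter(seg.speaker for seg in database)


def speaker_size_cost(speaker, sizes):
    """Assumed first cost function: 1 minus the speaker's share of the database.

    A speaker with a larger dataset gets a lower cost than a speaker
    with a smaller one, which is the ordering the claim describes.
    """
    total = sum(sizes.values())
    return 1.0 - sizes[speaker] / total


def select_segments(units, database, sizes):
    """For each required unit, pick the candidate segment with the lowest cost."""
    chosen = []
    for unit in units:
        candidates = [seg for seg in database if seg.unit == unit]
        if not candidates:
            raise LookupError(f"no pre-recorded segment for unit {unit!r}")
        chosen.append(
            min(candidates, key=lambda seg: speaker_size_cost(seg.speaker, sizes))
        )
    return chosen


def concatenate(segments):
    """Join the selected segments into one audio buffer."""
    return b"".join(seg.audio for seg in segments)
```

With a database in which one speaker has three segments and another has two, `select_segments` prefers the first speaker's segment whenever both speakers offer the same unit. Ties between equally sized datasets fall to whichever candidate appears first in the database, a detail the claim does not address.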

  • 9 Assignments