SPEECH SYNTHESIZER, SPEECH SYNTHESIZING METHOD AND PROGRAM PRODUCT

US 20120089402A1
Filed: 10/12/2011
Published: 04/12/2012
Est. Priority Date: 04/15/2009
Status: Active Grant

Abstract

According to one embodiment, a speech synthesizer includes an analyzer, a first estimator, a selector, a generator, a second estimator, and a synthesizer. The analyzer analyzes text and extracts a linguistic feature. The first estimator selects a first prosody model adapted to the linguistic feature and estimates prosody information that maximizes a first likelihood representing probability of the selected first prosody model. The selector selects speech units that minimize a cost function determined in accordance with the prosody information. The generator generates a second prosody model that is a model of the prosody information of the speech units. The second estimator estimates prosody information that maximizes a third likelihood calculated on the basis of the first likelihood and a second likelihood representing probability of the second prosody model. The synthesizer generates synthetic speech by concatenating the speech units on the basis of the prosody information estimated by the second estimator.

6 Claims

1. A speech synthesizer comprising:
- an analyzer that performs a text analysis of an input document and extracts a linguistic feature used for prosody control;
  
  a first estimator that selects a first prosody model adapted to the extracted linguistic feature from predetermined first prosody models that are models of speech prosody information and that estimates prosody information that maximizes a first likelihood representing probability of the selected first prosody model;
  
  a selector that selects, from a speech unit storage storing speech units, a plurality of speech units that minimizes a cost function determined in accordance with the prosody information estimated by the first estimator;
  
  a generator that generates a second prosody model that is a model of prosody information of the selected speech units;
  
  a second estimator that estimates prosody information that maximizes a third likelihood calculated on the basis of the first likelihood and a second likelihood representing probability of the second prosody model; and
  
  a synthesizer that generates synthetic speech by concatenating the selected speech units on the basis of the prosody information estimated by the second estimator.
- 2. The speech synthesizer according to claim 1, wherein the selector newly selects the speech units that minimize the cost function determined in accordance with the prosody information estimated by the second estimator, and the synthesizer generates synthetic speech by concatenating the newly selected speech units on the basis of the prosody information estimated by the second estimator.
- 3. The speech synthesizer according to claim 2, wherein the generator further generates the second prosody model of the newly selected speech units, the second estimator further estimates prosody information that maximizes the third likelihood calculated on the basis of the second likelihood of the second prosody model generated from the newly selected speech units and the first likelihood, and the synthesizer generates synthetic speech by concatenating the selected speech units on the basis of the prosody information estimated by the second estimator when the number of estimations of prosody information performed by the second estimator exceeds a predetermined threshold.
- 4. The speech synthesizer according to claim 1, wherein the third likelihood is calculated by linearly coupling the first likelihood and the second likelihood.
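
As an illustration of claims 2 to 4, the following is a minimal Python sketch and not the patented implementation. It rests on assumptions that the claims do not state: each prosody model is a one-dimensional Gaussian, the likelihoods are log-likelihoods, "linearly coupling" is a weighted sum with a hypothetical weight `w`, and unit selection reduces to picking the stored prosody value nearest the target. The names `combine`, `nearest_unit`, `refine`, and all parameters are hypothetical.

```python
from typing import List, Tuple


def combine(mu1: float, var1: float, mu2: float, var2: float, w: float) -> float:
    """Return x maximizing w*log N(x; mu1, var1) + (1 - w)*log N(x; mu2, var2).

    Assumption: Gaussian models, log-likelihoods, positive variances, 0 <= w <= 1.
    Setting the derivative to zero gives a precision-weighted mean.
    """
    p1 = w / var1            # weighted precision of the first prosody model
    p2 = (1.0 - w) / var2    # weighted precision of the second prosody model
    return (p1 * mu1 + p2 * mu2) / (p1 + p2)


def nearest_unit(target: float, inventory: List[float]) -> float:
    """Toy unit selection: the stored prosody value closest to the target."""
    return min(inventory, key=lambda value: abs(value - target))


def refine(
    mu1: float,
    var1: float,
    inventory: List[float],
    unit_var: float,
    w: float,
    threshold: int,
) -> Tuple[float, float]:
    """Alternate estimation and re-selection until the number of estimations
    exceeds the threshold, then return (prosody estimate, selected unit)."""
    # Initial selection uses the first estimate (the Gaussian mode, mu1).
    unit = nearest_unit(mu1, inventory)
    estimations = 0
    while True:
        # Second prosody model: Gaussian centred on the selected unit (assumption).
        estimate = combine(mu1, var1, unit, unit_var, w)
        estimations += 1
        if estimations > threshold:
            return estimate, unit
        # Re-select units against the refined prosody estimate.
        unit = nearest_unit(estimate, inventory)
```

Under these assumptions, `w = 1.0` reproduces the first estimate unchanged, while smaller values pull the estimate toward the prosody of the units actually selected.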

5. A speech synthesis method comprising:
- performing a text analysis of an input document and extracting a linguistic feature used for prosody control;
  
  selecting a first prosody model adapted to the extracted linguistic feature from predetermined first prosody models that are models of speech prosody information, and first estimating in which prosody information that maximizes a first likelihood representing probability of the selected first prosody model is estimated;
  
  selecting, from a speech unit storage storing speech units, a plurality of speech units that minimizes a cost function determined in accordance with the prosody information estimated in the first estimating;
  
  generating a second prosody model that is a model of prosody information of the selected speech units;
  
  second estimating in which prosody information that maximizes a third likelihood calculated on the basis of the first likelihood and a second likelihood representing probability of the second prosody model is estimated; and
  
  generating synthetic speech by concatenating the selected speech units on the basis of the prosody information estimated in the second estimating.
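
The steps of the method can be sketched end to end as a toy Python pipeline. This is a hedged illustration under stated assumptions, not the method of the specification: prosody is a single log-F0 value per word, the first prosody models are Gaussians looked up by a hypothetical "stressed"/"unstressed" feature, the cost function contains only a squared target cost (no concatenation cost), and "synthesis" returns unit identifiers with their final prosody instead of audio. Every name and constant below is invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Unit:
    unit_id: str
    label: str      # what the stored unit realizes (toy: a lowercase word)
    log_f0: float   # prosody value of the stored unit


# Hypothetical first prosody models: feature -> (mean, variance).
FIRST_MODELS: Dict[str, Tuple[float, float]] = {
    "stressed": (5.3, 0.04),
    "unstressed": (5.0, 0.04),
}
UNIT_VARIANCE = 0.04   # assumed variance of the second prosody model
WEIGHT = 0.5           # assumed weight of the first likelihood


def analyze(text: str) -> List[Tuple[str, str]]:
    """Toy text analysis: uppercase words are treated as stressed."""
    return [(word.lower(), "stressed" if word.isupper() else "unstressed")
            for word in text.split()]


def first_estimate(features: List[Tuple[str, str]]) -> List[float]:
    """The maximum-likelihood value of a Gaussian model is its mean."""
    return [FIRST_MODELS[feature][0] for _, feature in features]


def select_units(features, targets: List[float], storage: List[Unit]) -> List[Unit]:
    """Minimize the summed squared target cost. With no concatenation cost,
    each position can be minimized independently.
    Raises ValueError if no stored unit matches a label."""
    selected = []
    for (label, _), target in zip(features, targets):
        candidates = [unit for unit in storage if unit.label == label]
        selected.append(min(candidates, key=lambda u: (u.log_f0 - target) ** 2))
    return selected


def second_estimate(features, units: List[Unit]) -> List[float]:
    """Maximize the weighted sum of both Gaussian log-likelihoods."""
    estimates = []
    for (_, feature), unit in zip(features, units):
        mean, variance = FIRST_MODELS[feature]
        p1 = WEIGHT / variance
        p2 = (1.0 - WEIGHT) / UNIT_VARIANCE
        estimates.append((p1 * mean + p2 * unit.log_f0) / (p1 + p2))
    return estimates


def synthesize(text: str, storage: List[Unit]) -> List[Tuple[str, float]]:
    features = analyze(text)
    targets = first_estimate(features)
    units = select_units(features, targets, storage)
    prosody = second_estimate(features, units)
    return [(unit.unit_id, value) for unit, value in zip(units, prosody)]
```

Calling `synthesize("hello WORLD", storage)` with a small list of `Unit` objects returns one `(unit_id, log_f0)` pair per word, where each value lies between the model mean and the selected unit's own prosody.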

6. A program product having a computer readable medium including programmed instructions, wherein the instructions, when executed by a computer, cause the computer to perform:
- performing a text analysis of an input document and extracting a linguistic feature used for prosody control;
  
  selecting a first prosody model adapted to the extracted linguistic feature from predetermined first prosody models that are models of speech prosody information, and first estimating in which prosody information that maximizes a first likelihood representing probability of the selected first prosody model is estimated;
  
  selecting, from a speech unit storage storing speech units, a plurality of speech units that minimizes a cost function determined in accordance with the prosody information estimated in the first estimating;
  
  generating a second prosody model that is a model of prosody information of the selected speech units;
  
  second estimating in which prosody information that maximizes a third likelihood calculated on the basis of the first likelihood and a second likelihood representing probability of the second prosody model is estimated; and
  
  generating synthetic speech by concatenating the selected speech units on the basis of the prosody information estimated in the second estimating.

Current Assignee: Kabushiki Kaisha Toshiba (Toshiba Corporation)
Original Assignee: Kabushiki Kaisha Toshiba (Toshiba Corporation)
Inventors: Latorre, Javier; Akamine, Masami

Granted Patent: US 8,494,856 B2
US Class Current: 704/260
CPC Class Codes: G10L 13/10 Prosody rules derived from ...
