Identification of unit overlap regions for concatenative speech synthesis system

US 6,202,049 B1
Filed: 03/09/1999
Issued: 03/13/2001
Est. Priority Date: 03/09/1999
Status: Expired due to Term

First Claim

1. A method for identifying a unit overlap region for concatenative speech synthesis, comprising:

defining a statistical model for representing time-varying properties of speech;

providing a plurality of time-series data corresponding to different sound units containing the same vowel;

extracting speech signal parameters from said time-series data and using said parameters to train said statistical model;

using said trained statistical model to identify a recurring sequence in said time-series data and associating said recurring sequence with a nuclear trajectory region of said vowel;

using said recurring sequence to delimit the unit overlap region for concatenative speech synthesis.

4 Assignments

0 Petitions

Abstract

Speech signal parameters are extracted from time-series data corresponding to different sound units containing the same vowel. The extracted parameters are used to train a statistical model, such as a Hidden Markov-based Model, that has a data structure for separately modeling the nuclear trajectory region of the vowel and its surrounding transition elements. The model is trained, for example through embedded re-estimation, to automatically determine optimally aligned models that identify the nuclear trajectory region. The boundaries of the nuclear trajectory region serve to delimit the overlap region for subsequent sound unit concatenation.
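As a rough illustration of how an aligned model can delimit the nuclear trajectory region, the sketch below runs a Viterbi alignment over a left-to-right chain of three states (leading transition, nucleus, trailing transition). Everything concrete here is an assumption made for the example and is not taken from the patent: the single formant-like value per frame, the one-dimensional Gaussian emissions, the hand-set means and variances (a real system would train them), and the function names `align_left_to_right` and `nuclear_region`.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def align_left_to_right(frames, means, variances,
                        log_stay=math.log(0.8), log_next=math.log(0.2)):
    """Viterbi alignment for a left-to-right chain.

    Each state may repeat or advance to the next state. The path is forced
    to start in the first state and end in the last one.
    Returns one state index per frame.
    """
    n, n_states = len(frames), len(means)
    neg_inf = float("-inf")
    score = [[neg_inf] * n_states for _ in range(n)]
    back = [[0] * n_states for _ in range(n)]
    score[0][0] = log_gauss(frames[0], means[0], variances[0])
    for t in range(1, n):
        for s in range(n_states):
            stay = score[t - 1][s] + log_stay
            move = score[t - 1][s - 1] + log_next if s > 0 else neg_inf
            if move > stay:
                best, back[t][s] = move, s - 1
            else:
                best, back[t][s] = stay, s
            score[t][s] = best + log_gauss(frames[t], means[s], variances[s])
    # Trace back from the final state at the last frame.
    path = [n_states - 1]
    for t in range(n - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

def nuclear_region(path, nucleus_state=1):
    """First and last frame index aligned to the nucleus state."""
    idx = [t for t, s in enumerate(path) if s == nucleus_state]
    return idx[0], idx[-1]

# Toy second-formant track in Hz (invented values): rise, steady vowel, fall.
frames = [1000, 1100, 1250, 1750, 1800, 1820, 1790, 1810, 1500, 1350, 1300]
means = [1100.0, 1800.0, 1400.0]       # assumed, not trained
variances = [22500.0, 2500.0, 22500.0]  # assumed, not trained

path = align_left_to_right(frames, means, variances)
region = nuclear_region(path)
print(path)    # [0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2]
print(region)  # (3, 7)
```

In this toy case the frames aligned to the middle state (indices 3 through 7) would play the role of the overlap region for later concatenation.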

15 Claims

1. A method for identifying a unit overlap region for concatenative speech synthesis, comprising:
- defining a statistical model for representing time-varying properties of speech;
  
  providing a plurality of time-series data corresponding to different sound units containing the same vowel;
  
  extracting speech signal parameters from said time-series data and using said parameters to train said statistical model;
  
  using said trained statistical model to identify a recurring sequence in said time-series data and associating said recurring sequence with a nuclear trajectory region of said vowel;
  
  using said recurring sequence to delimit the unit overlap region for concatenative speech synthesis.
  - 2. The method of claim 1 wherein said statistical model is a Hidden Markov Model.
  - 3. The method of claim 1 wherein said statistical model is a recurrent neural network.
  - 4. The method of claim 1 wherein said speech signal parameters are speech formants.
  - 5. The method of claim 1 wherein said statistical model has a data structure for separately modeling the nuclear trajectory region of a vowel and the transition elements surrounding said nuclear trajectory region.
  - 6. The method of claim 1 wherein the step of training said model is performed by embedded re-estimation to generate a converged model for alignment across the entire data set represented by said time-series data.
  - 7. The method of claim 1 wherein said statistical model has a data structure for separately modeling the nuclear trajectory region of a vowel, a first transition element preceding said nuclear trajectory region and a second transition element following said nuclear trajectory region;
    - and
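Claims 5 through 7 describe a model that separates the nuclear trajectory region from the transitions around it, trained until it converges on an alignment across the whole data set. A loose way to picture that loop is sketched below. It is a deliberate simplification: it uses hard-assignment (segment, then re-average) updates on a single value per frame rather than the embedded re-estimation the claim names. The shared nucleus mean, the per-unit transition means, the flat-start initialization, and the names `best_segmentation` and `train` are all assumptions made for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def best_segmentation(frames, means):
    """Split frames into three non-empty, ordered segments.

    Returns (b1, b2) so that frames[:b1] is the leading transition,
    frames[b1:b2] the nucleus, and frames[b2:] the trailing transition,
    minimizing squared error against the three given means.
    """
    n = len(frames)
    best = None
    for b1 in range(1, n - 1):
        for b2 in range(b1 + 1, n):
            cost = (sum((x - means[0]) ** 2 for x in frames[:b1])
                    + sum((x - means[1]) ** 2 for x in frames[b1:b2])
                    + sum((x - means[2]) ** 2 for x in frames[b2:]))
            if best is None or cost < best[0]:
                best = (cost, b1, b2)
    return best[1], best[2]

def pooled_nucleus(dataset, bounds):
    """Mean of all frames currently assigned to the nucleus, over all units."""
    return mean([x for seq, (b1, b2) in zip(dataset, bounds) for x in seq[b1:b2]])

def train(dataset, max_iters=20):
    """Alternate re-estimation and re-alignment until the alignment is stable."""
    # Flat start: cut every unit into thirds.
    bounds = [(len(seq) // 3, 2 * len(seq) // 3) for seq in dataset]
    for _ in range(max_iters):
        nucleus = pooled_nucleus(dataset, bounds)  # shared across units
        new_bounds = []
        for seq, (b1, b2) in zip(dataset, bounds):
            onset, offset = mean(seq[:b1]), mean(seq[b2:])  # unit-specific
            new_bounds.append(best_segmentation(seq, (onset, nucleus, offset)))
        if new_bounds == bounds:
            break  # alignment no longer changes
        bounds = new_bounds
    return pooled_nucleus(dataset, bounds), bounds

# Two invented units sharing a vowel near 1500 Hz but with different contexts.
unit_a = [500, 520, 1500, 1510, 1490, 1500, 900, 880]
unit_b = [2200, 2180, 2210, 1500, 1495, 1505, 1100, 1120, 1090]

nucleus_mean, bounds = train([unit_a, unit_b])
print(nucleus_mean)  # 1500.0
print(bounds)        # [(2, 6), (3, 6)]
```

The flat start places the first unit's nucleus at frames 2 through 4; one round of re-alignment extends it to frame 5, after which the boundaries stop moving.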

8. A method for performing concatenative speech synthesis, comprising:
- defining a statistical model for representing time-varying properties of speech;
  
  providing a plurality of time-series data corresponding to different sound units containing the same vowel;
  
  extracting speech signal parameters from said time-series data and using said parameters to train said statistical model;
  
  using said trained statistical model to identify a recurring sequence in said time-series data and associating said recurring sequence with a nuclear trajectory region of said vowel;
  
  using said recurring sequence to delimit a unit overlap region for each of said sound units;
  
  concatenatively synthesizing a new sound unit by overlapping and merging said time-series data from two of said different sound units based on the respective unit overlap region of said sound units.
  - 9. The method of claim 8 further comprising selectively altering the time duration of at least one of said unit overlap regions to match the time duration of another of said unit overlap regions prior to performing said merging step.
  - 10. The method of claim 8 wherein said statistical model is a Hidden Markov Model.
  - 11. The method of claim 8 wherein said statistical model is a recurrent neural network.
  - 12. The method of claim 8 wherein said speech signal parameters include speech formants.
  - 13. The method of claim 8 wherein said statistical model has a data structure for separately modeling the nuclear trajectory region of a vowel and the transition elements surrounding said nuclear trajectory region.
  - 14. The method of claim 8 wherein the step of training said model is performed by embedded re-estimation to generate a converged model for alignment across the entire data set represented by said time-series data.
  - 15. The method of claim 8 wherein said statistical model has a data structure for separately modeling the nuclear trajectory region of a vowel, a first transition element preceding said nuclear trajectory region and a second transition element following said nuclear trajectory region;
    - and
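The overlap-and-merge step of claim 8, together with the duration matching of claim 9, can be pictured with the short sketch below. It is only one plausible reading under stated assumptions: the units are tracks of one parameter value per frame, duration is matched by linear interpolation, and the merge is a linear cross-fade. None of these choices, nor the names `resample` and `concatenate`, come from the patent text.

```python
def resample(segment, new_len):
    """Stretch or compress a segment to new_len samples by linear interpolation."""
    if new_len == 1:
        return [segment[0]]
    out = []
    for i in range(new_len):
        pos = i * (len(segment) - 1) / (new_len - 1)
        lo = int(pos)
        hi = min(lo + 1, len(segment) - 1)
        frac = pos - lo
        out.append(segment[lo] * (1 - frac) + segment[hi] * frac)
    return out

def concatenate(unit_a, region_a, unit_b, region_b):
    """Join the start of unit_a to the end of unit_b through their overlap regions.

    Each region is (start, end) with inclusive frame indices. The overlap
    region of unit_b is first resampled to the duration of unit_a's region,
    then the two regions are cross-faded from unit_a to unit_b.
    """
    a0, a1 = region_a
    b0, b1 = region_b
    over_a = unit_a[a0:a1 + 1]
    over_b = resample(unit_b[b0:b1 + 1], len(over_a))
    n = len(over_a)
    merged = []
    for i in range(n):
        w = i / (n - 1) if n > 1 else 0.5  # weight moves from unit_a to unit_b
        merged.append((1 - w) * over_a[i] + w * over_b[i])
    return unit_a[:a0] + merged + unit_b[b1 + 1:]

# Invented parameter tracks; the regions would come from the trained model.
unit_a = [500, 600, 1500, 1500, 1500, 1500, 900]
unit_b = [2200, 1600, 1600, 1000, 900]

new_unit = concatenate(unit_a, (2, 5), unit_b, (1, 2))
print([round(x) for x in new_unit])
# [500, 600, 1500, 1533, 1567, 1600, 1000, 900]
```

The result keeps the leading transition of the first unit and the trailing transition of the second, with the two nuclear regions blended in between.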

Current Assignee
Sovereign Peak Ventures, LLC (Dominion Harbor Enterprises, LLC)
Original Assignee
Matsushita Electric Industrial Company Limited (Panasonic Holdings Corporation)
Inventors
Kibre, Nicholas; Pearson, Steve
Primary Examiner(s)
Hudspeth, David
Assistant Examiner(s)
Storm, Donald L.

Application Number

US09/264,981
Time in Patent Office

735 Days
Field of Search

704/265, 704/266, 704/267, 704/249, 704/254, 704/258
US Class Current

704/267
CPC Class Codes

G10L 13/07 Concatenation rules
