Expressivity of voice synthesis by emphasizing source signal features

US 6,804,649 B2
Filed: 06/01/2001
Issued: 10/12/2004
Est. Priority Date: 06/02/2000
Status: Expired due to Fees

2 Assignments

0 Petitions

Abstract

Voice synthesis with improved expressivity is obtained in a voice synthesiser of source-filter type by making use of a library of source sound categories in the source module. Each source sound category corresponds to a particular morphological category and is derived from analysis of real vocal sounds, by inverse filtering so as to subtract the effect of the vocal tract. The library may be parametrical, that is, the stored data corresponds not to the inverse-filtered sounds themselves but to synthesis coefficients for resynthesising the inverse-filtered sounds using any suitable re-synthesis technique, such as the phase vocoder technique. The coefficients are derived by Short Time Fourier Transform (STFT) analysis.
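The idea of a parametrical library, in which coefficients are stored rather than waveforms, can be illustrated with a plain STFT analysis followed by overlap-add resynthesis. This is only a minimal sketch in NumPy. The function names `stft` and `istft`, the Hann window, and the frame and hop sizes are assumptions for illustration and are not taken from the patent. A full phase vocoder would additionally track and adjust phase between frames.

```python
import numpy as np

def stft(x, frame_len=512, hop=128):
    """Analyse x into complex STFT coefficients, shape (n_frames, n_bins)."""
    window = np.hanning(frame_len + 1)[:-1]  # periodic Hann window
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)

def istft(coeffs, frame_len=512, hop=128):
    """Resynthesise a signal from STFT coefficients by weighted overlap-add."""
    window = np.hanning(frame_len + 1)[:-1]
    n_frames = coeffs.shape[0]
    out = np.zeros((n_frames - 1) * hop + frame_len)
    norm = np.zeros_like(out)
    frames = np.fft.irfft(coeffs, n=frame_len, axis=1)
    for i in range(n_frames):
        seg = slice(i * hop, i * hop + frame_len)
        out[seg] += frames[i] * window   # synthesis window
        norm[seg] += window ** 2         # accumulated window energy
    covered = norm > 1e-8                # skip the near-zero edges
    out[covered] /= norm[covered]
    return out
```

In this sketch the array returned by `stft` plays the role of the stored resynthesis coefficients, and `istft` stands in for the resynthesis device. Away from the first and last frame, the resynthesised signal matches the analysed one up to floating-point error.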

190 Citations

10 Claims

1. Voice synthesiser apparatus comprising:

a source module adapted to output, during use, a source signal;

a filter module arranged to receive said source signal as an input and to apply thereto a filter characteristic modelling the response of the vocal tract;

characterised in that the source module comprises a library of stored representations of source sound categories each corresponding to a respective morphological category, and that the source signal output by the source module corresponds to a stored representation of a selected source sound category;

wherein the source module comprises a resynthesis device adapted to output said source signal and that the stored representations in said library are in the form of resynthesis coefficients enabling said source sound categories to be regenerated by the resynthesis device;

wherein the stored representations in said library are derived by inverse filtering real vocal sounds so as to subtract the articulatory effects imposed by the vocal tract, and stored representations corresponding to a particular morphological category are derived by averaging signals that are produced by inverse filtering a plurality of examples of vocal sounds embodying the morphological category.

2. Voice synthesis apparatus according to claim 1, wherein the stored representations in said library are derived by deconvoluting respective portions of an utterance.

3. Voice synthesis apparatus according to claim 1, wherein the resynthesis device comprises a phase vocoder adapted to output glottal signals for submission to said filter module, and the resynthesis coefficients constituting the stored representation of a source sound category correspond to a representation derived by STFT analysis of signals resulting from the inverse filtering.

4. Voice synthesis apparatus according to claim 3, and comprising means for performing spectral transformations on said resynthesis coefficients, wherein the phase vocoder is driven by the transformed resynthesis coefficients.

5. Voice synthesis apparatus according to claim 1, wherein the pitch of the source signal varies as a function of time, and there is provided means for transforming the source signal by modifying the pitch variation function, the filter module being adapted to operate on the source signal after transformation thereof by said transforming means.
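The derivation of a library entry recited in claim 1, inverse filtering several examples and averaging the results, might be sketched as follows. The claims do not say how the inverse filter is obtained. Linear prediction (LPC, autocorrelation method) is used here purely as an assumed stand-in, and the helper names `lpc_coefficients`, `inverse_filter` and `average_source` are hypothetical. The sketch also assumes that the examples have equal length and are already time-aligned, which the claims do not state.

```python
import numpy as np

def lpc_coefficients(x, order):
    """Estimate A(z) = 1 + a1 z^-1 + ... + ap z^-p by the autocorrelation method."""
    # Autocorrelation at lags 0..order
    r = np.correlate(x, x, mode="full")[len(x) - 1 : len(x) + order]
    # Toeplitz normal equations R a = -r[1:], with a tiny ridge for conditioning
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)
    a = np.linalg.solve(R, -r[1 : order + 1])
    return np.concatenate(([1.0], a))

def inverse_filter(x, a):
    """Apply the FIR filter A(z) to x, removing the estimated resonances."""
    return np.convolve(x, a)[: len(x)]

def average_source(examples, order=10):
    """Inverse filter each example with its own A(z), then average the residuals."""
    residuals = [inverse_filter(x, lpc_coefficients(x, order)) for x in examples]
    return np.mean(residuals, axis=0)
```

Calling `average_source` on a list of recordings that embody one morphological category would give a single averaged source estimate for that category. In a parametrical library it would then be analysed into resynthesis coefficients rather than stored as a waveform.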

6. A method of voice synthesis comprising the steps of:

providing a source module,

causing said source module to generate a source signal corresponding to a particular morphological category of sound,

providing a filter module having a filter characteristic modelling the response of the vocal tract;

inputting the source signal to the filter module,

characterised in that the step of providing a source module comprises providing a source module comprising a library of stored representations of source sound categories each corresponding to a respective morphological category, and that the source signal output by the source module corresponds to a stored representation of a selected source sound category,

wherein the source module outputs a source signal by retrieval from the library of a stored representation in the form of resynthesis coefficients representing the corresponding morphological category, input of the retrieved resynthesis coefficients to a resynthesis device, and output of the signal generated by the resynthesis device as the source signal,

wherein the stored representations in said library are derived by inverse filtering real vocal sounds so as to subtract the articulatory effects imposed by the vocal tract, and stored representations corresponding to a particular morphological category are derived by averaging signals that are produced by inverse filtering a plurality of examples of vocal sounds embodying the morphological category.

7. A voice synthesis method according to claim 6, wherein the stored representations in said library are derived by deconvoluting respective portions of an utterance.

8. A voice synthesis method according to claim 6, wherein the resynthesis device comprises a phase vocoder adapted to output glottal signals to said filter module, and the resynthesis coefficients constituting the stored representation of a source sound category correspond to a representation derived by STFT analysis of signals resulting from the inverse filtering.

9. A voice synthesis method according to claim 8, wherein a spectral transformation is applied to the retrieved resynthesis coefficients, and the transformed coefficients are used to drive the phase vocoder.

10. A voice synthesis method according to claim 6, wherein the pitch of the source signal varies as a function of time, and comprising the step of transforming the source signal by modifying the pitch variation function, the filter module being adapted to operate on the source signal after transformation thereof in said transforming step.
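The ordering described in claims 5 and 10, where the pitch variation function is modified first and the filter module then operates on the transformed source, can be sketched as below. Everything here is illustrative. The pulse train is a hypothetical stand-in for a source regenerated from the library, the contour transformation (scaling excursions around the mean) is just one possible modification, and a single two-pole resonator replaces a full vocal tract model. None of these choices come from the patent.

```python
import numpy as np

def transform_pitch_contour(f0, depth):
    """Scale pitch excursions around the mean; depth > 1 exaggerates them."""
    mean = f0.mean()
    return mean + depth * (f0 - mean)

def pulse_train_from_contour(f0, sr):
    """Stand-in source: one unit pulse each time the accumulated phase wraps."""
    phase = np.cumsum(f0 / sr)          # phase measured in cycles
    wraps = np.floor(phase)
    pulses = np.zeros(len(f0))
    pulses[1:] = np.diff(wraps) > 0
    return pulses

def resonator(freq_hz, radius, sr):
    """Denominator coefficients of a two-pole resonator (assumed tract model)."""
    theta = 2 * np.pi * freq_hz / sr
    return np.array([1.0, -2 * radius * np.cos(theta), radius ** 2])

def all_pole_filter(source, a):
    """Filter module: y[n] = source[n] - sum_k a[k] * y[n - k], with a[0] == 1."""
    y = np.zeros(len(source))
    for n in range(len(source)):
        acc = source[n]
        for k in range(1, min(len(a), n + 1)):
            acc -= a[k] * y[n - k]
        y[n] = acc
    return y

sr = 8000
t = np.arange(sr // 2) / sr
f0 = 120 + 10 * np.sin(2 * np.pi * 3 * t)   # time-varying pitch in Hz
source = pulse_train_from_contour(transform_pitch_contour(f0, 2.0), sr)
voiced = all_pole_filter(source, resonator(700.0, 0.97, sr))
```

Because the filter runs after the pitch transformation, the resonance stays at the same frequency while the spacing of the source pulses changes. That keeps the source features separate from the articulatory ones.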

Current Assignee: Sony France SA (Sony Group Corp.)
Original Assignee: Sony France SA (Sony Group Corp.)
Inventors: Miranda, Eduardo Reck
Primary Examiner(s): Dorvil, Richemond
Assistant Examiner(s): Nolan, Daniel A
Application Number: US09/872,966
Publication Number: US 20020026315A1
Time in Patent Office: 1,229 Days
Field of Search: 704/258, 704/211, 704/269, 704/200, 704/201, 704/261, 704/265, 704/206, 704/500, 704/503, 704/266
US Class Current: 704/258
CPC Class Codes:
G10L 13/04 Details of speech synthesis...
G10L 13/07 Concatenation rules