Speech-to-speech generation system and method

US 7,461,001 B2
Filed: 10/10/2003
Issued: 12/02/2008
Est. Priority Date: 04/11/2001
Status: Expired due to Fees

1 Assignment

0 Petitions

Abstract

An expressive speech-to-speech generation system and method which can generate expressive speech output by using expressive parameters extracted from the original speech signal to drive the standard TTS system. The system comprises: speech recognition means, machine translation means, text-to-speech generation means, expressive parameter detection means for extracting expressive parameters from the speech of language A, and expressive parameter mapping means for mapping the expressive parameters extracted by the expressive parameter detection means from language A to language B, and driving the text-to-speech generation means by the mapping results to synthesize expressive speech. The system and method can improve the quality of the speech output of the translating system or TTS system.
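The data flow described in the abstract can be sketched as five pluggable components wired together. This is only an illustrative outline: the record names the components as "means" and does not prescribe an interface, so the class name, field names, and signatures below are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical container for expressive parameters (e.g. pitch, volume, duration).
ExpressiveParams = Dict[str, float]


@dataclass
class SpeechToSpeechPipeline:
    """Hypothetical wiring of the five 'means' named in the abstract."""

    recognize: Callable[[bytes], str]                           # speech A -> text A
    translate: Callable[[str], str]                             # text A -> text B
    detect: Callable[[bytes, str], ExpressiveParams]            # speech A -> expressive parameters
    map_params: Callable[[ExpressiveParams], ExpressiveParams]  # language A -> language B
    synthesize: Callable[[str, ExpressiveParams], bytes]        # text B + adjustments -> speech B

    def run(self, speech_a: bytes) -> bytes:
        text_a = self.recognize(speech_a)
        text_b = self.translate(text_a)
        # Expressive parameters come from the original signal, not from the text.
        params_a = self.detect(speech_a, text_a)
        adjustments_b = self.map_params(params_a)
        # The mapped parameters drive an otherwise standard TTS step.
        return self.synthesize(text_b, adjustments_b)
```

Keeping each stage as a plain callable makes it possible to swap in stubs for testing or to reuse an existing recognizer, translator, or synthesizer without changing the flow.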

27 Citations

6 Claims

1. A speech-to-speech generation method, comprising the steps of:

recognizing the speech of language A and creating the corresponding text of language A;

translating the text from language A to language B;

generating the speech of language B according to the text of language B, said speech-to-speech method is characterized by further comprising the steps of:

extracting expressive parameters from the speech of language A, said expressive parameters comprising pitch, volume and duration at a word level and intonation and sentence envelope at a sentence level;

obtaining normalized expressive parameters for language A based on a degree of variation of pitch, volume and duration at a word level and intonation and sentence envelope at a sentence level for words in a sentence and deriving relative expressive parameters from the normalized parameters;

comparing relative parameters of expressive speech with those of reference speech to identify varying relative parameters to be provided to said expressive parameter mapping means; and

mapping the identified varying relative parameters extracted by the detecting steps from language A to language B to obtain adjustment parameters for language B, and driving the text-to-speech generation process using the adjustment parameters mapping results to synthesize expressive speech in language B.

2. A method according to claim 1, characterized in that said extracting further comprises extracting expressive parameters at the syllable level.

3. A method according to claim 1, characterized in that mapping the varying relative parameters from language A to language B, further comprises the step of converting the expressive parameters of language B, using word level converting tables and sentence level converting tables, into adjustment parameters for adjusting the text-to-speech generation means by word level converting and sentence level converting.
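
The normalizing and comparing steps of claim 1 can be illustrated at the word level with a small sketch. The claims do not give formulas, so the choices here are assumptions: each feature is normalized by its mean over the sentence, the relative parameter is the deviation of that ratio from 1.0, and a parameter counts as "varying" when it differs from the reference speech by more than a hypothetical threshold. The function and variable names are likewise invented for illustration.

```python
from statistics import mean
from typing import Dict, List

WordParams = Dict[str, float]
FEATURES = ("pitch", "volume", "duration")  # word-level features named in claim 1


def relative_params(words: List[WordParams]) -> List[WordParams]:
    """Normalize each feature by its sentence mean (assumed positive),
    then express it as a deviation from 1.0."""
    means = {f: mean(w[f] for w in words) for f in FEATURES}
    return [{f: w[f] / means[f] - 1.0 for f in FEATURES} for w in words]


def varying_params(
    expressive: List[WordParams],
    reference: List[WordParams],
    threshold: float = 0.1,  # assumed cutoff, not taken from the patent
) -> List[WordParams]:
    """Keep, per word, only the features whose relative value differs
    from the reference speech by more than the threshold."""
    result = []
    for e, r in zip(relative_params(expressive), relative_params(reference)):
        diffs = {f: e[f] - r[f] for f in FEATURES}
        result.append({f: d for f, d in diffs.items() if abs(d) > threshold})
    return result
```

Because every value is divided by its own sentence mean, the comparison is independent of the speaker's absolute pitch or loudness; only the shape of the variation within the sentence is carried forward to the mapping step.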

4. A speech-to-speech generation method, comprising the steps of:

recognizing the speech of dialect A and creating the corresponding text;

generating the speech of another dialect B according to the text, said speech-to-speech generation method is characterized by further comprising the steps of:

extracting expressive parameters from the speech of dialect A, said expressive parameters comprising pitch, volume and duration at a word level and intonation and sentence envelope at a sentence level; and

obtaining normalized expressive parameters for dialect A based on a degree of variation of pitch, volume and duration at a word level and intonation and sentence envelope at a sentence level for words in a sentence and deriving relative expressive parameters from the normalized parameters;

comparing relative parameters of expressive speech with those of reference speech to identify varying relative parameters to be provided to said expressive parameters mapping means; and

mapping the identified varying relative parameters from dialect A to dialect B to obtain adjustment parameters for language B, and driving the text-to-speech generating process using the adjustment parameters mapping results to synthesize expressive speech in dialect B.

5. A method according to claim 4, characterized in that said extracting further comprises extracting expressive parameters at the syllable level.

6. A method according to claim 4, characterized in that mapping the varying relative parameters from dialect A to dialect B, further comprises the step of converting the expressive parameters of dialect B, using word level converting tables and sentence level converting tables, into adjustment parameters for adjusting the text-to-speech generation means by word level converting and sentence level converting.
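
The word-level and sentence-level converting tables mentioned in claims 3 and 6 could be realized as simple lookups. The sketch below is one possible reading, not the patented tables: the bin boundaries, scaling factors, intonation labels, and function names are all hypothetical, chosen only to show the two lookup levels side by side.

```python
from bisect import bisect_right
from typing import Dict, List

# Hypothetical word-level table: bin boundaries for a relative deviation,
# and the TTS scaling factor applied to words falling in each bin.
WORD_BOUNDS = [-0.3, -0.1, 0.1, 0.3]
WORD_FACTORS = [0.8, 0.9, 1.0, 1.1, 1.2]  # one more entry than WORD_BOUNDS

# Hypothetical sentence-level table: intonation label -> pitch-range scaling.
SENTENCE_TABLE: Dict[str, float] = {
    "statement": 1.0,
    "question": 1.3,
    "exclamation": 1.5,
}


def word_adjustment(deviation: float) -> float:
    """Look up the scaling factor for one word-level relative deviation."""
    return WORD_FACTORS[bisect_right(WORD_BOUNDS, deviation)]


def convert(word_deviations: List[float], intonation: str) -> Dict[str, object]:
    """Turn mapped parameters into adjustment parameters for the TTS step."""
    return {
        "word_factors": [word_adjustment(d) for d in word_deviations],
        # Unknown labels fall back to a neutral range (an assumption).
        "pitch_range": SENTENCE_TABLE.get(intonation, 1.0),
    }
```

For example, `convert([-0.5, 0.0, 0.5], "question")` selects the lowest, neutral, and highest word-level factors and the question-level pitch range.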

Current Assignee: International Business Machines Corporation
Original Assignee: International Business Machines Corporation
Inventors: Qin, Shi; Liqin, Shen; Wei, Zhang; Tang, Donald T.
Primary Examiner(s): Hudspeth, David R.
Assistant Examiner(s): Neway, Samuel G
Application Number: US10/683,335
Publication Number: US 20040172257A1
Time in Patent Office: 1,880 Days
Field of Search: None
US Class Current: 704/277
CPC Class Codes:
G10L 13/00 Speech synthesis; Text to s...
G10L 13/04 Details of speech synthesis...
