Speech synthesis for tasks with word and prosody dictionaries

US 6,826,530 B1
Filed: 07/21/2000
Issued: 11/30/2004
Est. Priority Date: 07/21/1999
Status: Expired due to Fees

Abstract

A plurality of tasks are set in a speech synthesizing process, in which at least one of speakers, emotion or situation at the time speeches are made, and contents of the speeches, is different, and word dictionaries, prosody dictionaries, and waveform dictionaries corresponding to respective tasks are organized. When a character string to be synthesized is input with the task specified through, for example, a game system, a speech synthesizing process is performed using the word dictionary, the prosody dictionary, and the waveform dictionary corresponding to the specified task. Therefore, a speech message can be generated depending on the personality of a speaker, the emotion or situation at the time when a speech is made, and the contents of the speech.
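The task-based organization described in the abstract can be pictured as a lookup from a task name to a set of three dictionaries. The following is only a minimal sketch under stated assumptions: the names `DictionarySet`, `TaskSwitcher`, the task labels, and the dictionary layouts are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class DictionarySet:
    """Hypothetical bundle of the three per-task dictionaries."""
    word: dict = field(default_factory=dict)      # word -> accent type
    prosody: dict = field(default_factory=dict)   # key -> prosody model data
    waveform: dict = field(default_factory=dict)  # synthesis unit -> recorded data


class TaskSwitcher:
    """Hypothetical registry that returns the dictionaries for a designated task."""

    def __init__(self):
        self._sets = {}

    def register(self, task, dict_set):
        # Associate one dictionary set with one task name.
        self._sets[task] = dict_set

    def switch(self, task):
        # Return the dictionary set for the task that accompanies the input string.
        try:
            return self._sets[task]
        except KeyError:
            raise KeyError(f"no dictionaries registered for task {task!r}") from None


# Illustrative tasks: same word, different accent type per speaker/emotion.
switcher = TaskSwitcher()
switcher.register("hero_angry", DictionarySet(word={"yame": 1}))
switcher.register("narrator_calm", DictionarySet(word={"yame": 0}))
```

A caller would pass the task together with the character string, for example `switcher.switch("hero_angry")`, and hand the returned set to the synthesis step.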

9 Claims

1. A speech synthesizing method using word dictionaries, prosody dictionaries, and waveform dictionaries corresponding to a plurality of tasks of a speech synthesizing process in which at least one of speakers, emotion or situation when speeches are made, and contents of the speeches is different, comprising the steps of:
- switching among a word dictionary, a prosody dictionary, and a waveform dictionary according to designation of a task to be input together with a character string to be synthesized; and
  
  synthesizing a speech message corresponding to a character string to be synthesized by using the switched word dictionary, prosody dictionary, and waveform dictionary, each dictionary including;
  
  (a) a word dictionary including a number of words, each having at least one character, together with respective accent types, (b) a prosody dictionary including typical prosody model data in prosody model data indicating prosody of words in the word dictionary, and (c) a waveform dictionary including recorded speeches as speech data in synthesis units, the speech synthesizing process comprising the steps of;
  
  determining an accent type of a character string to be synthesized from the word dictionary;
  
  selecting prosody model data from the prosody dictionary based on the character string to be synthesized and the accent type;
  
  selecting waveform data corresponding to each character of the character string to be synthesized based on the selected prosody model data from the waveform dictionary; and
  
  connecting selected pieces of waveform data.
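The four processing steps recited in claim 1 can be illustrated with a toy pipeline. This is a sketch, not the patented implementation: it assumes that a prosody model is a list of target pitch values (one per character), that the prosody dictionary is keyed by string length and accent type, and that each waveform candidate is a `(pitch, samples)` pair. All function names and data layouts are hypothetical.

```python
def determine_accent_type(text, word_dict):
    # Step 1: look up the accent type of the string in the word dictionary.
    return word_dict[text]


def select_prosody_model(text, accent_type, prosody_dict):
    # Step 2: choose prosody model data from the string and its accent type.
    # Assumption: models are keyed by (number of characters, accent type).
    return prosody_dict[(len(text), accent_type)]


def select_waveforms(text, model, waveform_dict):
    # Step 3: for each character, pick the recorded unit whose pitch is
    # closest to the target pitch given by the prosody model.
    chosen = []
    for ch, target in zip(text, model):
        candidates = waveform_dict[ch]
        chosen.append(min(candidates, key=lambda c: abs(c[0] - target)))
    return chosen


def connect(units):
    # Step 4: concatenate the selected pieces of waveform data.
    samples = []
    for _pitch, data in units:
        samples.extend(data)
    return samples


def synthesize(text, word_dict, prosody_dict, waveform_dict):
    accent = determine_accent_type(text, word_dict)
    model = select_prosody_model(text, accent, prosody_dict)
    return connect(select_waveforms(text, model, waveform_dict))


# Toy dictionaries for one task (illustrative values only).
word_dict = {"ab": 1}
prosody_dict = {(2, 1): [120.0, 90.0]}
waveform_dict = {
    "a": [(100.0, [1, 2]), (125.0, [3, 4])],
    "b": [(95.0, [5]), (140.0, [6])],
}
result = synthesize("ab", word_dict, prosody_dict, waveform_dict)
print(result)  # [3, 4, 5]
```

In a multi-task system, the three dictionaries passed to `synthesize` would be the ones selected for the designated task.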

2. A speech synthesizing method using word dictionaries, prosody dictionaries, waveform dictionaries, and word variation rules corresponding to a plurality of tasks of a speech synthesizing process in which at least one of speakers, emotion or situation when speeches are made, and contents of the speeches is different, comprising the steps of:
- switching among a word dictionary, a prosody dictionary, a waveform dictionary, and word variation rules according to designation of a task to be input together with a character string to be synthesized;
  
  varying the character string to be synthesized according to the word variation rules; and
  
  synthesizing a speech message corresponding to the varied character string by using the switched word dictionary, prosody dictionary, and waveform dictionary, each dictionary including;
  
  (a) a word dictionary including a number of words, each having at least one character, together with respective accent types, (b) a prosody dictionary including a typical prosody model data in prosody model data indicating prosody of words in the word dictionary, (c) a waveform dictionary including recorded speeches as speech data in synthesis units, and (d) word variation rules for recording variation rules of character strings, the speech synthesizing process comprising the steps of;
  
  determining an accent type of a character string to be synthesized from the word dictionary or the word variation rules;
  
  selecting prosody model data from the prosody dictionary based on the character string to be synthesized and the accent type;
  
  selecting waveform data corresponding to each character of the character string to be synthesized based on the selected prosody model data from the waveform dictionary; and
  
  connecting selected pieces of waveform data.
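Claim 2 adds a step that varies the input string according to per-task word variation rules, and allows the accent type to come from either the word dictionary or those rules. The claim does not specify a rule format, so the sketch below assumes a simple suffix-replacement rule that may optionally carry its own accent type; `VariationRule`, `vary`, `accent_type_for`, and the sample strings are hypothetical.

```python
from typing import NamedTuple, Optional


class VariationRule(NamedTuple):
    """Hypothetical rule: replace a suffix and optionally fix the accent type."""
    suffix: str
    replacement: str
    accent_type: Optional[int]


def vary(text, rules):
    # Apply the first rule whose suffix matches; return the varied string
    # and the rule that was applied (or None if no rule matched).
    for rule in rules:
        if text.endswith(rule.suffix):
            stem = text[: len(text) - len(rule.suffix)]
            return stem + rule.replacement, rule
    return text, None


def accent_type_for(varied_text, applied_rule, word_dict):
    # Prefer the accent type supplied by the applied rule; otherwise
    # fall back to the word dictionary (an assumed precedence).
    if applied_rule is not None and applied_rule.accent_type is not None:
        return applied_rule.accent_type
    return word_dict[varied_text]


# Illustrative rules for one task.
rules = [VariationRule(suffix="desu", replacement="dasu", accent_type=2)]
word_dict = {"hai": 1}

varied, rule = vary("soudesu", rules)
print(varied, accent_type_for(varied, rule, word_dict))  # soudasu 2

unchanged, no_rule = vary("hai", rules)
print(unchanged, accent_type_for(unchanged, no_rule, word_dict))  # hai 1
```

The varied string and its accent type would then feed the prosody selection, waveform selection, and connection steps in the same way as an unvaried string.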

3. A speech synthesizing method using a word dictionary and using prosody dictionaries, waveform dictionaries, and word variation rules corresponding to each of a plurality of tasks of a speech synthesizing process in which any of speakers, emotion when speeches are made, and situation when speeches are made is different, comprising the steps of:
- switching among a prosody dictionary, a waveform dictionary, and word variation rules according to designation of a task to be input together with a character string to be synthesized;
  
  varying the character string to be synthesized according to the word variation rules; and
  
  synthesizing a speech message corresponding to the varied character string by using a word dictionary, the switched prosody dictionary and waveform dictionary, each dictionary including;
  
  (a) a word dictionary including a number of words, each having at least one character, together with respective accent types, (b) a prosody dictionary including a typical prosody model data in prosody model data indicating prosody of words in the word dictionary, (c) a waveform dictionary including recorded speeches as speech data in synthesis units, and (d) word variation rules for recording variation rules of character strings, the speech synthesizing process comprising the steps of;
  
  determining an accent type of a character string to be synthesized from the word dictionary or the word variation rules;
  
  selecting prosody model data from the prosody dictionary based on the character string to be synthesized and the accent type;
  
  selecting waveform data corresponding to each character of the character string to be synthesized based on the selected prosody model data from the waveform dictionary; and
  
  connecting selected pieces of waveform data.

4. A speech synthesis apparatus using word dictionaries, prosody dictionaries, and waveform dictionaries corresponding to a plurality of tasks of a speech synthesizing process in which at least one of speakers, emotion or situation when speeches are made, and contents of the speeches is different, comprising:
- switches for switching among a word dictionary, a prosody dictionary, and a waveform dictionary according to designation of a task to be input together with a character string to be synthesized; and
  
  a synthesizer for synthesizing a speech message corresponding to a character string to be synthesized by using the switched word dictionary, prosody dictionary, and waveform dictionary, each dictionary including;
  
  (a) a word dictionary including a number of words, each having at least one character, together with respective accent types, (b) a prosody dictionary including a typical prosody model data in prosody model data indicating prosody of words in the word dictionary, and (c) a waveform dictionary including recorded speeches as speech data in synthesis units, a speech synthesizing processor being arranged for;
  
  (a) determining an accent type of a character string to be synthesized from the word dictionary;
  
  (b) selecting prosody model data from the prosody dictionary based on the character string to be synthesized and the accent type;
  
  (c) selecting waveform data corresponding to each character of the character string to be synthesized based on the selected prosody model data from the waveform dictionary; and
  
  (d) connecting selected pieces of waveform data.

5. A speech synthesis apparatus using word dictionaries, prosody dictionaries, waveform dictionaries, and word variation rules corresponding to a plurality of tasks of a speech synthesizing process in which at least one of speakers, emotion or situation when speeches are made, and contents of the speeches is different, comprising:
- switches for switching among a word dictionary, a prosody dictionary, a waveform dictionary, and word variation rules according to designation of a task to be input together with a character string to be synthesized;
  
  a processor arrangement for varying the character string to be synthesized according to the word variation rules; and
  
  a synthesizer for synthesizing a speech message corresponding to the varied character string by using the switched word dictionary, prosody dictionary, and waveform dictionary, each dictionary including;
  
  (a) a word dictionary including a number of words, each having at least one character, together with respective accent types, (b) a prosody dictionary including a typical prosody model data in prosody model data indicating prosody of words in the word dictionary, (c) a waveform dictionary including recorded speeches as speech data in synthesis units, and (d) word variation rules for recording variation rules of character strings, a speech synthesizing processor being arranged for;
  
  (a) determining an accent type of a character string to be synthesized from the word dictionary or the word variation rules;
  
  (b) selecting prosody model data from the prosody dictionary based on the character string to be synthesized and the accent type;
  
  (c) selecting waveform data corresponding to each character of the character string to be synthesized based on the selected prosody model data from the waveform dictionary; and
  
  (d) connecting selected pieces of waveform data.

6. A speech synthesis apparatus using a word dictionary and using prosody dictionaries, waveform dictionaries, and word variation rules corresponding to each of a plurality of tasks of a speech synthesizing process in which any of speakers, emotion when speeches are made, and situation when speeches are made is different, comprising:
- switches for switching among a prosody dictionary, a waveform dictionary, and word variation rules according to designation of a task to be input together with a character string to be synthesized;
  
  a processor arrangement for varying the character string to be synthesized according to the word variation rules; and
  
  a synthesizer for synthesizing a speech message corresponding to the varied character string by using a word dictionary, the switched prosody dictionary and waveform dictionary, each dictionary including;
  
  (a) a word dictionary including a number of words, each having at least one character, together with respective accent types, (b) a prosody dictionary including a typical prosody model data in prosody model data indicating prosody of words in the word dictionary, (c) a waveform dictionary including recorded speeches as speech data in synthesis units, and (d) word variation rules for recording variation rules of character strings, a speech synthesizing processor being arranged for;
  
  (a) determining an accent type of a character string to be synthesized from the word dictionary or the word variation rules;
  
  (b) selecting prosody model data from the prosody dictionary based on the character string to be synthesized and the accent type;
  
  (c) selecting waveform data corresponding to each character of the character string to be synthesized based on the selected prosody model data from the waveform dictionary; and
  
  (d) connecting selected pieces of waveform data.

7. A computer-readable medium storing a speech synthesis program used to direct a computer to function as:
- word dictionaries, prosody dictionaries, and waveform dictionaries corresponding to a plurality of tasks of a speech synthesizing process in which at least one of speakers, emotion or situation when speeches are made, and contents of the speeches is different;
  
  switches for switching among a word dictionary, a prosody dictionary, and a waveform dictionary according to designation of a task to be input together with a character string to be synthesized;
  
  a synthesizer for synthesizing a speech message corresponding to a character string to be synthesized by using the switched word dictionary, prosody dictionary, and waveform dictionary, each dictionary including;
  
  (a) a word dictionary including a number of words, each having at least one character, together with respective accent types, (b) a prosody dictionary including a typical prosody model data in prosody model data indicating prosody of words contained in the word dictionary, and (c) a waveform dictionary including recorded speeches as speech data in synthesis units; and
  
  a speech synthesizing processor being arranged for;
  
  (a) determining an accent type of a character string to be synthesized from the word dictionary;
  
  (b) selecting prosody model data from the prosody dictionary based on the character string to be synthesized and the accent type;
  
  (c) selecting waveform data corresponding to each character of the character string to be synthesized based on the selected prosody model data from the waveform dictionary; and
  
  (d) connecting selected pieces of waveform data.

8. A computer-readable medium storing a speech synthesis program used to direct a computer to function as:
- word dictionaries, prosody dictionaries, waveform dictionaries, and word variation rules corresponding to a plurality of tasks of a speech synthesizing process in which at least one of speakers, emotion or situation when speeches are made, and contents of the speeches is different, the program causing the computer to;
  
  (a) switch among at least one of the word dictionaries, prosody dictionaries, waveform dictionaries, and word variation rules according to designation of a task to be input together with a character string to be synthesized;
  
  (b) vary the character string to be synthesized according to the word variation rules; and
  
  (c) synthesize a speech message corresponding to the varied character string by using the switched word dictionary, prosody dictionary, and waveform dictionary, each dictionary including;
  
  (i) a word dictionary including a number of words, each having at least one character, together with respective accent types, (ii) a prosody dictionary including a typical prosody model data in prosody model data indicating prosody of words contained in the word dictionary, (iii) a waveform dictionary including recorded speeches as speech data in synthesis units, and (iv) word variation rules for recording variation rules of character strings;
  
  (d) determine an accent type of a character string to be synthesized from the word dictionary or the word variation rules;
  
  (e) select prosody model data from the prosody dictionary based on the character string to be synthesized and the accent type;
  
  (f) select waveform data corresponding to each character of the character string to be synthesized based on the selected prosody model data from the waveform dictionary; and
  
  (g) connect selected pieces of waveform data.

9. A computer-readable medium storing a speech synthesis program used to direct a computer to function as:
- a word dictionary;
  
  prosody dictionaries, waveform dictionaries, and word variation rules corresponding to each of a plurality of tasks of a speech synthesizing process in which any of speakers, emotion when speeches are made, and situation when speeches are made is different;
  
  switches for switching among a prosody dictionary, a waveform dictionary, and word variation rules according to designation of a task to be input together with a character string to be synthesized;
  
  a processor arrangement for varying the character string to be synthesized according to the word variation rules; and
  
  a synthesizer for synthesizing a speech message corresponding to the varied character string by using a word dictionary, the switched prosody dictionary and waveform dictionary, each dictionary including;
  
  (a) a word dictionary including a number of words, each having at least one character, together with respective accent types, (b) a prosody dictionary including a typical prosody model data in prosody model data indicating prosody of words contained in the word dictionary, (c) a waveform dictionary including recorded speeches as speech data in synthesis units, and (d) word variation rules for recording variation rules of character strings;
  
  a speech synthesizing processor being arranged for;
  
  (a) determining an accent type of a character string to be synthesized from the word dictionary or the word variation rules;
  
  (b) selecting prosody model data from the prosody dictionary based on the character string to be synthesized and the accent type;
  
  (c) selecting waveform data corresponding to each character of the character string to be synthesized based on the selected prosody model data from the waveform dictionary; and
  
  (d) connecting selected pieces of waveform data.

Current Assignee: Konami Corporation (Konami Holdings Corp.), Konami Osaka Computer Entertainment Company Limited
Original Assignee: Konami Computer Entertainment Tokyo, Inc. (Konami Holdings Corp.), Konami Corporation (Konami Holdings Corp.)
Inventors: Mizoguchi, Toshiyuki; Kasai, Osamu
Primary Examiner(s): Dorvil, Richemond
Assistant Examiner(s): Lerner, Martin
Application Number: US09/621,544
Time in Patent Office: 1,593 Days
Field of Search: 704/1, 704/10, 704/258, 704/266, 704/260, 704/261, 704/268, 704/269
US Class Current: 704/258
CPC Class Codes:
A63F 2300/6063 for sound processing
G10L 13/047 Architecture of speech synt...
