Speech synthesis apparatus and method

US 20020065659A1
Filed: 11/07/2001
Published: 05/30/2002
Est. Priority Date: 11/29/2000
Status: Abandoned Application

Abstract

Herein disclosed are a speech synthesis apparatus and a speech synthesis method for synthesizing speech in accordance with input text data. The output speech consists of recorded speech portions and synthesized speech portions that are given reverberation properties identical to those of the recorded speech portions. The synthesized speech portions with reverberation properties are substantially greater in amplitude than the recorded speech portions, to reduce the feeling of strangeness caused by the difference in sound quality between the recorded speech portions and the synthesized speech portions.
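The abstract does not state how the reverberation properties are imparted. As a minimal sketch, assuming the properties of the recording environment can be captured as an impulse response, the synthesized (dry) signal could be convolved with that response. The function names `convolve` and `impart_reverb` and the toy impulse response values are hypothetical and are not taken from the application.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def impart_reverb(dry, impulse_response):
    """Apply an assumed room impulse response to a dry synthesized signal."""
    if not dry or not impulse_response:
        return list(dry)
    return convolve(dry, impulse_response)


# Toy impulse response (made-up values): direct sound plus two decaying echoes.
room_ir = [1.0, 0.0, 0.5, 0.0, 0.25]
dry_synth = [1.0, 0.0, 0.0]
wet_synth = impart_reverb(dry_synth, room_ir)
```

A direct double loop is used only to keep the sketch dependency-free; a practical system would more likely use an FFT-based convolution.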

6 Claims

1. A speech synthesis apparatus for synthesizing a speech in accordance with text data inputted therein, comprising:
- text storage means for storing a plurality of recorded text data elements therein;
  
  speech portion storage means for storing a plurality of recorded speech portions respectively corresponding to said recorded text data elements therein;
  
  speech segment storage means for storing a plurality of speech segments;
  
  text inputting means for inputting said text data;
  
  judging means for disassembling said text data inputted by said text inputting means into a plurality of text data elements, judging whether or not said text data elements are identical to any one of said recorded text data elements stored in said text storage means one text data element after another;
  
  dividing means for dividing said text data elements into two text portions consisting of a recorded text portion including recorded text data elements identical to said text data elements stored in said text storage means and a non-recorded text portion including non-recorded text data elements identical to said text data elements not stored in said text storage means on the basis of the results made by said judging means;
  
  recorded speech loading means for inputting said recorded text portion including said recorded text data elements identical to said text data elements divided by said dividing means, and selectively loading recorded speech portions respectively corresponding to said recorded text data elements of said recorded text portion from among recorded speech portions stored in said speech portion storage means;
  
  speech synthesizing means for inputting said non-recorded text portion including said non-recorded text data elements identical to said text data elements divided by said dividing means, and synthesizing said speech segments stored in said speech segment storage means in accordance with said non-recorded text data elements of said non-recorded text portion to generate synthesized speech portions;
  
  reverberation property imparting means for imparting reverberation properties identical to those of said recorded speech portions stored in said speech portion storage means to said synthesized speech portions generated by said speech synthesizing means so as to construct synthesized speech portions with said reverberation properties;
  
  speech overlapping means for overlapping said recorded speech portions loaded by said recorded speech loading means and said synthesized speech portions with said reverberation properties constructed by said reverberation property imparting means to generate said speech consisting of said recorded speech portions and said synthesized speech portions with reverberation properties; and
  
  speech outputting means for outputting said speech consisting of said recorded speech portions and said synthesized speech portions with reverberation properties.

2. A speech synthesis apparatus as set forth in claim 1, further comprising noise measurement means for measuring a noise level in the environment in which said speech is audibly outputted, in which said reverberation property imparting means further includes amplitude adjusting means for adjusting the amplitude of said synthesized speech portions with said reverberation properties constructed by said reverberation property imparting means on the basis of said noise level measured by said noise measurement means and the amplitude of said recorded speech portions loaded by said recorded speech loading means, to the degree that said synthesized speech portions with said reverberation properties are substantially greater in amplitude than said recorded speech portions in proportion to said noise level;
- whereby said speech overlapping means is operative to overlap said recorded speech portions loaded by said recorded speech loading means and said synthesized speech portions with said reverberation properties adjusted by said amplitude adjusting means to generate said speech consisting of said speech portions including said recorded speech portions and said synthesized speech portions with reverberation properties.

3. A speech synthesis apparatus as set forth in claim 1 or 2, in which said speech segment storage means is operative to store a plurality of speech segments each including at least one phoneme and divisible into a plurality of pitch waveforms, said speech segments respectively associated with said pitch waveforms with respect to said phonemes, and said speech synthesizing means is operative to synthesize said speech segments stored in said speech segment storage means by superimposing said pitch waveforms associated with said speech segments with respect to said phonemes in accordance with said non-recorded text data elements of said non-recorded text portion divided by said dividing means to generate synthesized speech portions.
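Claim 3 describes synthesis by superimposing pitch waveforms but gives no procedure. A minimal sketch, assuming the waveforms are placed one pitch period apart and overlapping samples are summed (an overlap-add scheme), could look like the following. The name `superimpose`, the fixed `period` parameter, and the sample values are hypothetical.

```python
def superimpose(pitch_waveforms, period):
    """Overlap-add pitch waveforms, starting one every `period` samples.

    Assumption: a constant pitch period; real speech would vary it over time.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    if not pitch_waveforms:
        return []
    length = max(i * period + len(w) for i, w in enumerate(pitch_waveforms))
    out = [0.0] * length
    for i, waveform in enumerate(pitch_waveforms):
        start = i * period
        for j, sample in enumerate(waveform):
            out[start + j] += sample  # sum where neighbouring waveforms overlap
    return out


# Two identical three-sample pitch waveforms placed two samples apart.
waveforms = [[1.0, 0.5, 0.25], [1.0, 0.5, 0.25]]
voiced = superimpose(waveforms, period=2)
```

Shortening `period` raises the pitch of the result and lengthening it lowers the pitch, which is why this kind of superposition is a convenient way to control intonation.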

4. A speech synthesis method of synthesizing a speech in accordance with text data inputted therein, comprising the steps of:
- (a) storing a plurality of recorded text data elements therein;
  
  (b) storing a plurality of recorded speech portions respectively corresponding to said recorded text data elements therein;
  
  (c) storing a plurality of speech segments;
  
  (d) inputting said text data;
  
  (e) disassembling said text data inputted in said step (d) into a plurality of text data elements, judging whether or not said text data elements are identical to any one of said recorded text data elements stored in said step (a) one text data element after another;
  
  (f) dividing said text data elements into two text portions consisting of a recorded text portion including recorded text data elements identical to said text data elements stored in said step (a) and a non-recorded text portion including non-recorded text data elements identical to said text data elements not stored in said step (a) on the basis of the results made in said step (e);
  
  (g) inputting said recorded text data portion including said recorded text data elements identical to said text data elements divided in said step (f), and selectively loading recorded speech portions respectively corresponding to said recorded text data elements of said recorded text portion from among recorded speech portions stored in said step (b);
  
  (h) inputting said non-recorded text data portion including said non-recorded text data elements identical to said text data elements divided in said step (f), and synthesizing said speech segments stored in said step (c) in accordance with said non-recorded text data elements of said non-recorded text portion to generate synthesized speech portions;
  
  (i) imparting reverberation properties identical to those of said recorded speech portions stored in said step (b) to said synthesized speech portions generated in said step (h) so as to construct synthesized speech portions with said reverberation properties;
  
  (j) overlapping said recorded speech portions loaded in said step (g) and said synthesized speech portions with said reverberation properties constructed in said step (i) to generate said speech consisting of said recorded speech portions and said synthesized speech portions with reverberation properties; and
  
  (k) outputting said speech consisting of said recorded speech portions and said synthesized speech portions with reverberation properties.

5. A speech synthesis method as set forth in claim 4, further comprising the step of (l) measuring a noise level in the environment in which said speech is audibly outputted, in which said step (i) further includes the step of (i-1) adjusting the amplitude of said synthesized speech portions with said reverberation properties constructed in said step (i) on the basis of said noise level measured in said step (l) and the amplitude of said recorded speech portions loaded in said step (g), to the degree that said synthesized speech portions with said reverberation properties are substantially greater in amplitude than said recorded speech portions in proportion to said noise level;
- whereby said step (j) has the step of overlapping said recorded speech portions loaded in said step (g) and said synthesized speech portions with said reverberation properties adjusted in said step (i-1) to generate said speech consisting of said speech portions including said recorded speech portions and said synthesized speech portions with reverberation properties.

6. A speech synthesis method as set forth in claim 4 or 5, in which said step (c) has the step of storing a plurality of speech segments each including at least one phoneme and divisible into a plurality of pitch waveforms, said speech segments respectively associated with said pitch waveforms with respect to said phonemes, and said step (h) has the step of synthesizing said speech segments stored in said step (c) by superimposing said pitch waveforms associated with said speech segments with respect to said phonemes in accordance with said non-recorded text data elements of said non-recorded text portion divided in said step (f) to generate synthesized speech portions.
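Steps (e) through (j), together with the noise-dependent adjustment of step (i-1), can be sketched roughly as follows. This is an illustration under several assumptions that the claims do not state: text data elements are whitespace-separated words, "overlapping" is reduced to simple concatenation, amplitude is measured as peak absolute value, and the gain grows linearly with the noise level through a made-up constant `k`. All function names, the `synthesize` and `impart_reverb` callbacks, and the sample values are hypothetical.

```python
def peak(samples):
    """Peak absolute amplitude of a list of samples (0.0 if empty)."""
    return max((abs(s) for s in samples), default=0.0)


def divide(elements, recorded_speech):
    """Steps (e)/(f): tag each element as recorded (True) or not, in order."""
    return [(el, el in recorded_speech) for el in elements]


def adjust_amplitude(synth, reference_peak, noise_level, k=0.5):
    """Step (i-1): scale synth so its peak exceeds the recorded peak.

    Assumed rule: target = reference_peak * (1 + k * noise_level).
    """
    synth_peak = peak(synth)
    if synth_peak == 0.0 or reference_peak == 0.0:
        return list(synth)  # nothing to scale against
    gain = reference_peak * (1.0 + k * noise_level) / synth_peak
    return [s * gain for s in synth]


def build_speech(text, recorded_speech, synthesize, impart_reverb, noise_level):
    """Steps (e)-(j): assemble output from recorded and synthesized portions."""
    tagged = divide(text.split(), recorded_speech)
    # Step (g): the recorded portions actually loaded for this utterance.
    loaded = [s for el, rec in tagged if rec for s in recorded_speech[el]]
    reference_peak = peak(loaded)
    output = []
    for element, is_recorded in tagged:
        if is_recorded:
            output.extend(recorded_speech[element])
        else:
            wet = impart_reverb(synthesize(element))  # steps (h) and (i)
            output.extend(adjust_amplitude(wet, reference_peak, noise_level))
    return output  # step (j), simplified here to concatenation


def fake_synth(element):
    """Stand-in synthesizer returning fixed samples for any element."""
    return [0.5, -0.25]


def no_reverb(samples):
    """Stand-in for the reverberation step; returns the samples unchanged."""
    return samples


recorded = {"turn": [0.5, -0.5], "left": [0.25]}
speech = build_speech("turn left now", recorded, fake_synth, no_reverb, 1.0)
```

With a noise level of zero the synthesized portion is scaled only to match the recorded peak; larger noise levels push it proportionally above the recorded portions, which is the behaviour claim 5 describes.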

Current Assignee: Matsushita Electric Industrial Company Limited (Panasonic Holdings Corporation)
Original Assignee: Matsushita Electric Industrial Company Limited (Panasonic Holdings Corporation)
Inventors: Nishimura, Hirofumi; Isono, Toshiyuki
Application Number: US10/045,512
Publication Number: US 20020065659A1
US Class Current: 704/260
CPC Class Codes:
G10L 13/04 Details of speech synthesis...
G10L 13/08 Text analysis or generation...
