Speech recognition system, speech recognition method, speech synthesis system, speech synthesis method, and program product

US 7,680,666 B2
Filed: 12/01/2006
Issued: 03/16/2010
Est. Priority Date: 03/04/2002
Status: Active Grant

First Claim

1. A speech synthesis system, comprising:

a sound signal acquirer configured to acquire a sound signal including a speech signal vocalized by a speaker and a noise signal;

a speech recognizer configured to recognize speech vocalized by the speaker in the speech signal included in the sound signal;

a first spectrum generator configured to generate a spectrum of the sound signal acquired by the sound signal acquirer as a first spectrum;

a second spectrum generator configured to generate a second spectrum, based on features of localized phonemes recognized by the speech recognizer so that the second spectrum does not contain a spectrum of the noise signal;

a modified spectrum generator configured to generate a modified spectrum by multiplying the first spectrum by the second spectrum; and

an outputter configured to output a synthesized speech signal of the vocalized speech based on the modified spectrum.

Abstract

The object of the present invention is to maintain a high recognition success rate for low-volume sound signals without being affected by noise.

The speech recognition system comprises a sound signal processor 10 configured to acquire a sound signal, and to calculate a sound signal parameter based on the acquired sound signal; an electromyographic signal processor 13 configured to acquire potential changes on a surface of the object as an electromyographic signal, and to calculate an electromyographic signal parameter based on the acquired electromyographic signal; an image information processor 16 configured to acquire image information by taking an image of the object, and to calculate an image information parameter based on the acquired image information; a speech recognizer 20 configured to recognize a speech signal vocalized by the object, based on the sound signal parameter, the electromyographic signal parameter and the image information parameter; and a recognition result provider 21 configured to provide a result recognized by the speech recognizer 20.

7 Claims

1. A speech synthesis system, comprising:
- a sound signal acquirer configured to acquire a sound signal including a speech signal vocalized by a speaker and a noise signal;
  
  a speech recognizer configured to recognize speech vocalized by the speaker in the speech signal included in the sound signal;
  
  a first spectrum generator configured to generate a spectrum of the sound signal acquired by the sound signal acquirer as a first spectrum;
  
  a second spectrum generator configured to generate a second spectrum, based on features of localized phonemes recognized by the speech recognizer so that the second spectrum does not contain a spectrum of the noise signal;
  
  a modified spectrum generator configured to generate a modified spectrum by multiplying the first spectrum by the second spectrum; and
  
  an outputter configured to output a synthesized speech signal of the vocalized speech based on the modified spectrum.
- 2. The speech synthesis system according to claim 1, wherein the outputter comprises a communicator configured to transmit the synthesized speech signal as data.
- 3. The speech synthesis system according to claim 1, further comprising an electromyographic signal processor configured to detect and process motion of muscles around a mouth of the speaker when the sound signal is vocalized.

4. A speech synthesis method comprising:
- acquiring a sound signal including a speech signal vocalized by a speaker and a noise signal;
  
  recognizing speech vocalized by the speaker in the speech signal included in the sound signal;
  
  generating a spectrum of the acquired sound signal as a first spectrum;
  
  generating a second spectrum based on features of localized phonemes of the recognized speech signal so that the second spectrum does not contain a spectrum of the noise signal;
  
  generating a modified spectrum by multiplying the first spectrum by the second spectrum; and
  
  outputting a synthesized speech signal of the vocalized speech based on the modified spectrum.
- 5. The speech synthesis method according to claim 4, further comprising detecting and processing motion of muscles around a mouth of the speaker when the sound signal is vocalized.

6. A computer readable storage medium storing computer executable instructions which when executed by a processor, causes the processor to perform a method of synthesizing a speech signal, the method comprising the steps of:
- acquiring a sound signal including a speech signal vocalized by a speaker and a noise signal;
  
  recognizing speech vocalized by the speaker in the speech signal included in the sound signal;
  
  generating a spectrum of the acquired sound signal as a first spectrum;
  
  generating a second spectrum based on the features of the localized phonemes of the recognized speech signal so that the second spectrum does not contain a spectrum of the noise signal;
  
  generating a modified spectrum by multiplying the first spectrum by the second spectrum; and
  
  outputting a synthesized speech signal of the vocalized speech based on the modified spectrum.
- 7. The computer readable storage medium according to claim 6, wherein the method further comprises detecting and processing motion of muscles around a mouth of the speaker when the sound signal is vocalized.
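
The spectrum-multiplication steps recited in claims 1, 4 and 6 can be illustrated with a minimal sketch. This is not the patented implementation. It assumes that the "second spectrum" is a simple 0/1 mask over frequency bins, built from a hypothetical table `PHONEME_BANDS` that maps a recognized phoneme to frequency bands; the band values are made up for illustration. The claims do not specify how the phoneme features are turned into a spectrum. The function names (`dft`, `idft`, `phoneme_mask`, `synthesize`) are likewise illustrative, and the recognizer itself is replaced by a hard-coded phoneme label.

```python
import cmath
import math

# Hypothetical lookup: recognized phoneme -> frequency bands (Hz) to keep.
# The numbers are illustrative only and are not taken from the patent.
PHONEME_BANDS = {"a": [(700.0, 1300.0)]}


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def phoneme_mask(n, sample_rate, bands):
    """Second spectrum (assumed form): 1.0 inside the phoneme bands, 0.0 elsewhere.

    The mask is mirrored around n/2 so the filtered signal stays real-valued.
    """
    mask = []
    for k in range(n):
        freq = min(k, n - k) * sample_rate / n
        mask.append(1.0 if any(lo <= freq <= hi for lo, hi in bands) else 0.0)
    return mask


def synthesize(sound, sample_rate, phoneme):
    """First spectrum x second spectrum -> modified spectrum -> output signal."""
    first = dft(sound)                                    # spectrum of speech + noise
    second = phoneme_mask(len(sound), sample_rate, PHONEME_BANDS[phoneme])
    modified = [a * b for a, b in zip(first, second)]     # bin-wise multiplication
    return idft(modified)


# Toy input: a 1000 Hz "speech" tone plus a 3000 Hz "noise" tone.
sample_rate = 8000
n = 64
speech = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(n)]
noise = [0.5 * math.sin(2 * math.pi * 3000 * t / sample_rate) for t in range(n)]
sound = [s + v for s, v in zip(speech, noise)]

# Pretend the recognizer returned the phoneme "a".
cleaned = synthesize(sound, sample_rate, "a")
```

In this toy case both tones fall exactly on DFT bins, so the mask removes the 3000 Hz component and `cleaned` closely matches `speech`. A real system would work frame by frame on short windows and would derive the second spectrum from the recognizer's phoneme features rather than from a fixed table.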

Current Assignee: NTT Docomo Incorporated (Nippon Telegraph and Telephone Corporation)
Original Assignee: NTT Docomo Incorporated (Nippon Telegraph and Telephone Corporation)
Inventors: Hiraiwa, Akira; Manabe, Hiroyuki; Sugimura, Toshiaki
Primary Examiner(s): Opsasnick, Michael N
Application Number: US11/565,992
Publication Number: US 20070100630A1
Time in Patent Office: 1,201 Days
Field of Search: 704/269, 704/226, 704/201
US Class Current: 704/267

CPC Class Codes
G06F 18/256   of results relating to diff...
G06V 10/811   the classifiers operating o...
G06V 40/20   Movements or behaviour, e.g...
G10L 13/033   Voice editing, e.g. manipul...
G10L 15/24   Speech recognition using no...
G10L 2021/0135   Voice conversion or morphing
