SPEECH PROCESSING APPARATUS AND METHOD

US 20090018837A1
Filed: 07/09/2008
Published: 01/15/2009
Est. Priority Date: 07/11/2007
Status: Active Grant

Abstract

A speech processing apparatus that can play back a sentence using recorded-speech-playback or text-to-speech is provided. The apparatus determines, for each of a plurality of words or phrases constituting the sentence, whether that word or phrase is to be played back by recorded-speech-playback or by text-to-speech. It then selects whether to play back the words or phrases in a first sequence or in a sequence different from the first sequence, based on the number of times playback would switch between recorded-speech-playback and text-to-speech if the words or phrases were played back in the first sequence using the determined synthesis methods. Each of the words or phrases is played back in the selected sequence using the determined synthesis method.

9 Claims

1. A speech processing apparatus which is configured to playback a sentence including a plurality of words or phrases using recorded-speech-playback or text-to-speech as a speech synthesis method, the apparatus comprising:
- a determining unit configured to determine whether each of a plurality of words or phrases constituting a sentence is a word or phrase to be played back by recorded-speech-playback or a word or phrase to be played back by text-to-speech;
- a selection unit configured to select whether to playback each of the plurality of words or phrases in a first sequence or a sequence different from the first sequence, based on the number of times of reversing playback using recorded-speech-playback and playback using text-to-speech, when each of the plurality of words or phrases is to be played back in the first sequence using a synthesis method specified by said determining unit; and
- a playback unit configured to playback each of the plurality of words or phrases in a sequence selected by said selection unit using a synthesis method specified by said determining unit.
  - 2. The apparatus according to claim 1, wherein the number of times of reversing is equivalent to a sum of the number of times of changing from playback using recorded-speech-playback to playback using text-to-speech and the number of times of changing from playback using text-to-speech to playback using recorded-speech-playback.
  - 3. The apparatus according to claim 1, wherein said selection unit selects playback in the first sequence if the number of times of reversing is less than a predetermined number, and selects playback in a sequence different from the first sequence otherwise.
  - 4. The apparatus according to claim 1, wherein said selection unit selects playback in the first sequence when the number of times of reversing is less than a predetermined number, and selects playback in one of a plurality of sequences different from the first sequence based on a predetermined reference when the number of times of reversing is not less than the predetermined number.
  - 5. The apparatus according to claim 4, wherein said selection unit selects playback in a sequence, of a plurality of sequences different from the first sequence, in which the number of times of changing playback using the recorded-speech-playback and playback using the text-to-speech becomes not less than the predetermined number, when the number of times of reversing is not less than the predetermined number.
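
The counting and selection logic described in claims 2 to 4 can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `Method`, `count_reversals`, `select_sequence`, `method_of`, and `threshold` are hypothetical, and the "predetermined reference" of claim 4 is assumed here to be "fewest reversals", which the claims do not specify.

```python
from enum import Enum


class Method(Enum):
    RECORDED = "recorded-speech-playback"
    TTS = "text-to-speech"


def count_reversals(methods):
    """Count adjacent pairs whose synthesis method differs.

    Following claim 2, this equals the number of RECORDED -> TTS changes
    plus the number of TTS -> RECORDED changes.
    """
    return sum(1 for a, b in zip(methods, methods[1:]) if a != b)


def select_sequence(first_sequence, alternatives, method_of, threshold):
    """Keep the first sequence if its reversal count is below the threshold.

    Otherwise pick one of the alternative sequences. ASSUMPTION: the
    alternative with the fewest reversals is chosen; the claims only say
    "based on a predetermined reference".
    """
    def reversals(sequence):
        return count_reversals([method_of[item] for item in sequence])

    if reversals(first_sequence) < threshold or not alternatives:
        return first_sequence
    return min(alternatives, key=reversals)


# Hypothetical phrases: A and C have recordings, B must be synthesized.
method_of = {"A": Method.RECORDED, "B": Method.TTS, "C": Method.RECORDED}
first = ["A", "B", "C"]            # RECORDED, TTS, RECORDED: two reversals
alternatives = [["A", "C", "B"]]   # RECORDED, RECORDED, TTS: one reversal
chosen = select_sequence(first, alternatives, method_of, threshold=2)
```

With a threshold of 2, the first sequence is rejected because it switches method twice, and the reordered sequence that places the text-to-speech phrase last is chosen instead.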

6. A speech processing apparatus which generates guidance speech corresponding to user operation using a speech synthesis unit configured to perform speech synthesis while selectively changing recorded-speech-playback and text-to-speech, the apparatus comprising:
- a guidance holding unit configured to hold a first guidance including fixed portions indicating fixed messages and a variable portion which is located between the fixed portions and indicates that a message corresponding to user operation is inserted, and a second guidance which has the variable portion located at the end of a fixed portion and is synonymous with the first guidance;
- an entry holding unit configured to hold a set of entries in which spellings, pronunciations of the spellings, and pieces of speech based on the pronunciations which are associated with user operation are configured to be registered; and
- an acquisition unit configured to acquire an entry corresponding to operation performed by a user from said entry holding unit,
- wherein when speech is registered in an entry acquired by said acquisition unit, said speech synthesis unit selects the first guidance, performs speech synthesis of a fixed portion of the first guidance by recorded-speech-playback using recorded speech corresponding to the fixed portion, and performs speech synthesis of a variable portion by recorded-speech-playback using speech registered in the entry, and
- when no speech is registered in an entry acquired by said acquisition unit, selects the second guidance, performs speech synthesis of a fixed portion of the second guidance by recorded-speech-playback using recorded speech corresponding to the fixed portion, and performs speech synthesis of a variable portion by text-to-speech.
  - 7. The apparatus according to claim 6, further comprising a communication unit configured to perform network communication, wherein the user operation includes operation associated with the network communication, and said entry holding unit comprises an address book for the network communication.

8. A speech processing method of generating guidance speech corresponding to user operation by controlling a speech processing apparatus having a guidance holding unit configured to hold a first guidance including fixed portions indicating fixed messages and a variable portion which is located between the fixed portions and indicates that a message corresponding to user operation is inserted, and a second guidance which has the variable portion located at the end of a fixed portion and is synonymous with the first guidance, an entry holding unit configured to hold a set of entries in which spellings, pronunciations of the spellings, and pieces of speech based on the pronunciations which are associated with user operation are configured to be registered, and a speech synthesis unit configured to perform speech synthesis while selectively changing recorded-speech-playback and text-to-speech, the method comprising the steps of:
- acquiring an entry corresponding to operation performed by a user from the entry holding unit;
- when speech is registered in the acquired entry, selecting the first guidance, performing speech synthesis of a fixed portion of the first guidance by recorded-speech-playback using recorded speech corresponding to the fixed portion, and performing speech synthesis of a variable portion by recorded-speech-playback using speech registered in the entry; and
- when no speech is registered in the acquired entry, selecting the second guidance, performing speech synthesis of a fixed portion of the second guidance by recorded-speech-playback using recorded speech corresponding to the fixed portion, and performing speech synthesis of a variable portion by text-to-speech.
  - 9. A computer-readable storage medium having stored thereon a computer program for causing a computer to execute a speech processing method defined in claim 8.
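
The guidance selection recited in claims 6 and 8 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's implementation: the `Entry` fields, the two guidance templates and their wording, the `VARIABLE` marker, and `plan_guidance` are all hypothetical, and the sketch only plans which synthesis method each portion would use rather than producing audio.

```python
from dataclasses import dataclass
from typing import Optional

VARIABLE = "{entry}"  # hypothetical marker for the variable portion


@dataclass
class Entry:
    spelling: str
    pronunciation: str
    speech: Optional[bytes] = None  # registered recording, if any


# ASSUMED example guidances with the same meaning.
# First guidance: the variable portion sits between fixed portions.
FIRST_GUIDANCE = ["Transmission to ", VARIABLE, " has finished."]
# Second guidance: the variable portion sits at the end of a fixed portion.
SECOND_GUIDANCE = ["Finished transmission to ", VARIABLE]


def plan_guidance(entry):
    """Return (method, text) pairs describing how each portion is spoken."""
    has_speech = entry.speech is not None
    template = FIRST_GUIDANCE if has_speech else SECOND_GUIDANCE
    plan = []
    for part in template:
        if part != VARIABLE:
            # Fixed portions always use their recorded speech.
            plan.append(("recorded", part))
        elif has_speech:
            # Variable portion played from the recording in the entry.
            plan.append(("recorded", entry.spelling))
        else:
            # No recording registered: synthesize from the pronunciation.
            plan.append(("tts", entry.pronunciation))
    return plan


with_speech = plan_guidance(Entry("Tanaka", "tanaka", speech=b"\x00\x01"))
without_speech = plan_guidance(Entry("Tanaka", "tanaka"))
```

In this sketch the second guidance switches synthesis method only once, at its end, whereas inserting a text-to-speech name into the first guidance would switch twice.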

Current Assignee: Canon Kabushiki Kaisha (Canon Inc.)
Original Assignee: Canon Kabushiki Kaisha (Canon Inc.)
Inventors: Aizawa, Michio
Granted Patent: US 8,027,835 B2
US Class Current: 704/260
CPC Class Codes:
G10L 13/02 Methods for producing synth...
G10L 13/08 Text analysis or generation...
