TECHNOLOGY FOR RESPONDING TO REMARKS USING SPEECH SYNTHESIS
Abstract
The present invention comprises: a voice input section that receives a remark (a question) as a voice signal; a reply creation section that creates a voice sequence of a reply (response) to the remark; a pitch analysis section that analyzes the pitch of a first segment (e.g., the word ending) of the remark; and a voice generation section (a voice synthesis section, etc.) that generates the reply, in the form of voice, represented by the voice sequence. The voice generation section controls the pitch of the entire reply such that the pitch of a second segment (e.g., the word ending) of the reply assumes a predetermined pitch (e.g., five degrees down) relative to the pitch of the first segment of the remark. This arrangement enables synthesis of a reply voice that sounds natural to the user.
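The pitch control described in the abstract can be illustrated with a minimal sketch. It assumes that "five degrees down" means a perfect fifth below, taken as 7 semitones in 12-tone equal temperament, and that the reply is available as a list of pitch values in Hz whose last entry is the word ending. The function names and the contour representation are hypothetical and do not come from the patent.

```python
# Assumption: "five degrees down" is read as a perfect fifth below (-7 semitones).
FIFTH_DOWN_SEMITONES = -7


def target_pitch_hz(first_segment_hz, semitones=FIFTH_DOWN_SEMITONES):
    """Pitch that lies `semitones` away from the remark's first segment."""
    return first_segment_hz * 2.0 ** (semitones / 12.0)


def shift_reply_contour(reply_contour_hz, first_segment_hz,
                        semitones=FIFTH_DOWN_SEMITONES):
    """Scale the whole reply so its last value (the word ending) hits the target.

    Scaling every value by the same ratio keeps the reply's own intonation
    shape while moving the entire reply up or down.
    """
    if not reply_contour_hz:
        raise ValueError("reply contour must not be empty")
    target = target_pitch_hz(first_segment_hz, semitones)
    ratio = target / reply_contour_hz[-1]
    return [pitch * ratio for pitch in reply_contour_hz]
```

For a remark ending at 220 Hz, the reply ending would land near 146.8 Hz under these assumptions, with the rest of the reply shifted by the same ratio.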
16 Claims
1. A voice synthesis apparatus comprising:
a voice input section configured to receive a voice signal of a remark;
a pitch analysis section configured to analyze a pitch of a first segment of the remark;
an acquisition section configured to acquire a reply to the remark; and
a voice generation section configured to generate voice of the reply acquired by said acquisition section, said voice generation section controlling a pitch of the voice of the reply in such a manner that a second segment of the reply has a pitch associated with the pitch of the first segment analyzed by said pitch analysis section,
wherein said voice generation section controls the pitch of the voice of the reply in such a manner that an interval of the pitch of said second segment relative to the pitch of said first segment becomes a consonant interval.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11.
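The consonant-interval condition in claim 1 can be checked numerically. The sketch below is illustrative only: the claim does not enumerate which intervals count as consonant, so the set used here (unison, thirds, fourth, fifth, sixths, and their octave equivalents) is an assumption drawn from common music-theory usage, and the half-semitone tolerance is likewise hypothetical.

```python
import math

# Assumption: consonant intervals in semitones, reduced to one octave
# (unison/octave, minor and major third, fourth, fifth, minor and major sixth).
CONSONANT_SEMITONES = (0, 3, 4, 5, 7, 8, 9)


def interval_semitones(second_hz, first_hz):
    """Signed interval from the first-segment pitch to the second-segment pitch."""
    return 12.0 * math.log2(second_hz / first_hz)


def is_consonant(second_hz, first_hz, tolerance=0.5):
    """True if the interval is within `tolerance` semitones of a consonant one."""
    size = abs(interval_semitones(second_hz, first_hz)) % 12.0
    for consonant in CONSONANT_SEMITONES:
        distance = abs(size - consonant)
        # Wrap around the octave so 11.8 semitones counts as close to 12 (= 0).
        if min(distance, 12.0 - distance) <= tolerance:
            return True
    return False
```

Under these assumptions a reply ending a fifth or an octave below the remark ending passes the check, while one ending a tritone (6 semitones) below does not.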
12. A computer-implemented method comprising:
receiving a voice signal of a remark;
analyzing a pitch of a first segment of the remark;
acquiring a reply to the remark;
synthesizing voice of the acquired reply; and
controlling a pitch of the reply in such a manner that a pitch of a second segment of the voice of the reply has a pitch associated with the analyzed pitch of the first segment and an interval of the pitch of the second segment relative to the pitch of the first segment becomes a consonant interval.
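The "analyzing a pitch of a first segment" step of claim 12 could be realized in many ways; the claim does not name a method. As one hedged illustration, the sketch below uses a plain autocorrelation search, which is a swapped-in textbook technique rather than the patent's own, and treats the trailing 100 ms of the remark as the first segment (the abstract gives the word ending as an example). All names, the 80 to 400 Hz search range, and the segment length are assumptions.

```python
import math


def first_segment(samples, sample_rate, duration_s=0.1):
    """Assumption: the trailing part of the remark stands in for the word ending."""
    count = int(sample_rate * duration_s)
    return samples[-count:]


def estimate_pitch_hz(samples, sample_rate, fmin=80.0, fmax=400.0):
    """Return the pitch whose autocorrelation lag scores highest."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    if len(samples) <= max_lag:
        raise ValueError("segment too short for the requested pitch range")
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the segment with a copy of itself delayed by `lag` samples.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag
```

Because the lag is an integer number of samples, the estimate is quantized; at an 8 kHz sample rate a 220 Hz tone resolves only to within a few hertz, so a practical system would likely interpolate or use a more robust estimator.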
13. A coding/decoding device comprising:
an A/D converter configured to convert an input voice signal of a remark into a digital signal;
a pitch analysis section configured to analyze a pitch of a first segment of the remark based on the digital signal;
a back-channel feedback acquisition section configured to, when back-channel feedback is to be returned to the remark, acquire back-channel feedback data corresponding to a meaning of the remark;
a pitch control section configured to control a pitch of the back-channel feedback data in such a manner that a second segment of the back-channel feedback data has a pitch associated with the analyzed pitch of the first segment and an interval of the pitch of the second segment relative to the pitch of the first segment becomes a consonant interval; and
a D/A converter configured to convert the pitch-controlled back-channel feedback data into an analogue signal.
Dependent claims: 14.
15. A voice synthesis system comprising a coding/decoding device and a host computer, said coding/decoding device comprising:
an A/D converter that converts an input voice signal of a remark into a digital signal;
a pitch analysis section that analyzes a pitch of a first segment of the remark based on the digital signal;
a back-channel feedback acquisition section that, when back-channel feedback is to be returned to the remark, acquires back-channel feedback data corresponding to a meaning of the remark;
a pitch control section that controls a pitch of the back-channel feedback data in such a manner that a second segment of the back-channel feedback data has a pitch associated with the analyzed pitch of the first segment and an interval of the pitch of the second segment relative to the pitch of the first segment becomes a consonant interval; and
a D/A converter configured to convert the pitch-controlled back-channel feedback data into an analogue signal,
wherein said host computer is configured in such a manner that, when replying voice other than the back-channel feedback is to be returned to the remark, said host computer acquires replying voice data, responsive to the remark, in accordance with the digital signal converted by said A/D converter and returns the acquired replying voice data to said coding/decoding device,
wherein said pitch control section is further configured to control a pitch of the replying voice data in such a manner that a third segment of the replying voice data returned from the host computer has a pitch associated with the analyzed pitch of the first segment, and
wherein said D/A converter is further configured to convert the pitch-controlled replying voice data into an analogue signal.
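The division of work in claim 15, where the coding/decoding device answers with back-channel feedback on its own and defers other replies to the host computer, can be sketched as a simple routing function. This is a rough illustration under stated assumptions: replies are modeled as pitch contours in Hz, the two sources and the pitch control step are passed in as callables, and every name here is hypothetical rather than taken from the patent.

```python
from typing import Callable, List, Tuple

Contour = List[float]


def respond(
    first_segment_hz: float,
    is_back_channel: bool,
    local_back_channel: Callable[[], Contour],
    host_reply: Callable[[], Contour],
    control_pitch: Callable[[Contour, float], Contour],
) -> Tuple[str, Contour]:
    """Pick the reply source, then apply the same pitch control to either one."""
    if is_back_channel:
        # Back-channel feedback is acquired inside the device.
        source, contour = "device", local_back_channel()
    else:
        # Any other reply is acquired by the host and returned to the device.
        source, contour = "host", host_reply()
    # In both cases the device's pitch control section adjusts the reply
    # relative to the analyzed pitch of the remark's first segment.
    return source, control_pitch(contour, first_segment_hz)
```

Keeping pitch control on the device side in both branches mirrors the claim wording, in which the host only supplies replying voice data and the device's pitch control section and D/A converter handle the rest.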
16. A method comprising:
converting, by means of an A/D converter, an input voice signal of a remark into a digital signal;
analyzing, by means of a processor, a pitch of a first segment of the remark based on the digital signal;
acquiring, by means of the processor, back-channel feedback data corresponding to a meaning of the remark, when back-channel feedback is to be returned to the remark;
controlling, by means of the processor, a pitch of the back-channel feedback data in such a manner that a second segment of the back-channel feedback data has a pitch associated with the analyzed pitch of the first segment and an interval of the pitch of the second segment relative to the pitch of the first segment becomes a consonant interval; and
converting, by means of a D/A converter, the pitch-controlled back-channel feedback data into an analogue signal.
Specification