Technology for responding to remarks using speech synthesis
Abstract
The invention comprises: a voice input section that receives a remark (a question) as a voice signal; a reply creation section that creates a voice sequence of a reply (response) to the remark; a pitch analysis section that analyzes the pitch of a first segment (e.g., the word ending) of the remark; and a voice generation section (a voice synthesis section, etc.) that generates the reply, as voice, from the voice sequence. The voice generation section controls the pitch of the entire reply so that the pitch of a second segment (e.g., the word ending) of the reply stands in a predetermined relationship (e.g., five degrees down) to the pitch of the first segment of the remark. This arrangement allows a replying voice to be synthesized that sounds natural to the user.
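The pitch relationship described in the abstract can be illustrated numerically. The following is a minimal sketch, not the patented implementation: it assumes that "five degrees down" denotes a perfect fifth below (7 semitones in equal temperament), it represents the reply as a list of fundamental-frequency values, and all function names are hypothetical.

```python
import math

# Assumption: "five degrees down" is read as a perfect fifth below,
# i.e. -7 semitones in 12-tone equal temperament.
FIFTH_DOWN_SEMITONES = -7.0


def target_pitch_hz(remark_ending_hz, interval_semitones=FIFTH_DOWN_SEMITONES):
    """Pitch the reply's chosen segment should have, given the remark's ending pitch."""
    return remark_ending_hz * 2.0 ** (interval_semitones / 12.0)


def uniform_shift_semitones(reply_segment_hz, remark_ending_hz,
                            interval_semitones=FIFTH_DOWN_SEMITONES):
    """Single shift amount that moves the reply's segment onto the target pitch."""
    target = target_pitch_hz(remark_ending_hz, interval_semitones)
    return 12.0 * math.log2(target / reply_segment_hz)


def shift_contour(contour_hz, shift_semitones):
    """Apply the same shift to every pitch value, preserving the reply's intonation shape."""
    ratio = 2.0 ** (shift_semitones / 12.0)
    return [f * ratio for f in contour_hz]


# Usage: the remark ends at 220 Hz; the reply's contour ends at 160 Hz.
reply_contour = [200.0, 180.0, 160.0]
shift = uniform_shift_semitones(reply_contour[-1], 220.0)
shifted_reply = shift_contour(reply_contour, shift)
```

Because one shift is applied to the whole contour, the intervals inside the reply are unchanged; only its overall register moves so that the chosen segment lands on the target.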
17 Claims
1. A voice synthesis apparatus comprising:
a voice input section configured to receive a voice signal of a remark;
a pitch analysis section configured to analyze a pitch of a first segment of the remark, wherein the first segment is a word ending of the remark;
an acquisition section configured to acquire a reply to the remark; and
a voice generation section configured to generate voice of the reply acquired by said acquisition section, said voice generation section shifting pitches of the entire voice waveform data of the reply by a same amount so that a second segment of the reply has a pitch associated with the pitch of the first segment analyzed by said pitch analysis section, wherein the second segment is a word beginning or word ending of the reply.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A computer-implemented method comprising:
receiving a voice signal of a remark;
analyzing a pitch of a first segment of the remark, wherein the first segment is a word ending of the remark;
acquiring a reply to the remark;
synthesizing voice of the acquired reply; and
shifting pitches of the entire voice waveform data of the reply by a same amount so that a pitch of a second segment of the voice of the reply has a pitch associated with the analyzed pitch of the first segment, wherein the second segment is a word beginning or word ending of the reply.
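The "analyzing a pitch of a first segment" step of claim 13 could be approached as in the sketch below. The claim does not name an estimation method, so plain autocorrelation is swapped in here purely for illustration. Treating the last 50 ms of the signal as the "word ending", the 80 to 400 Hz search range, and the function names are all assumptions.

```python
import math


def estimate_pitch_hz(samples, sample_rate, fmin=80.0, fmax=400.0):
    """Estimate the fundamental frequency by picking the autocorrelation peak.

    Returns None when no lag in the search range has a positive correlation.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(samples) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


def word_ending_pitch_hz(remark, sample_rate, tail_seconds=0.05):
    """Assumption: the 'word ending' is approximated by the final tail_seconds of audio."""
    tail_len = int(sample_rate * tail_seconds)
    return estimate_pitch_hz(remark[-tail_len:], sample_rate)


# Usage: a synthetic remark that starts at 150 Hz and ends at 200 Hz.
sr = 8000
remark = ([math.sin(2 * math.pi * 150 * n / sr) for n in range(1600)]
          + [math.sin(2 * math.pi * 200 * n / sr) for n in range(800)])
ending_pitch = word_ending_pitch_hz(remark, sr)
```

A practical system would also need voiced/unvoiced detection and a way to locate the word ending in real speech, neither of which this sketch attempts.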
14. A coding/decoding device comprising:
an A/D converter configured to convert an input voice signal of a remark into a digital signal;
a pitch analysis section configured to analyze a pitch of a first segment of the remark based on the digital signal, wherein the first segment is a word ending of the remark;
a back-channel feedback acquisition section configured to, when back-channel feedback is to be returned to the remark, acquire back-channel feedback data corresponding to a meaning of the remark;
a pitch control section configured to shift pitches of the entire back-channel feedback data by a same amount so that a second segment of the back-channel feedback data has a pitch associated with the analyzed pitch of the first segment, wherein the second segment is a word beginning or word ending of the back-channel feedback data; and
a D/A converter configured to convert the pitch-shifted back-channel feedback data into an analog signal.
Dependent claims: 15
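Shifting stored waveform data "by a same amount", as the pitch control section of claim 14 does, might look like the following sketch. It uses naive resampling with linear interpolation, which is an illustrative stand-in and not a method taken from the claims: it scales every frequency by the same ratio but also changes the duration, which a real device would probably avoid with a duration-preserving technique. The helper names are hypothetical.

```python
import math


def resample_pitch_shift(samples, ratio):
    """Scale every frequency in `samples` by `ratio` by reading the waveform
    `ratio` times faster, with linear interpolation between samples.

    Side effect of this naive approach: duration is scaled by 1 / ratio.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if not samples:
        return []
    last = len(samples) - 1
    out = []
    for k in range(int(last / ratio) + 1):
        pos = k * ratio                  # fractional read position in the input
        i = min(int(pos), last)
        frac = pos - i
        nxt = samples[min(i + 1, last)]  # clamp at the final sample
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out


def sine(freq_hz, sample_rate, n):
    """Hypothetical helper: n samples of a unit sine, standing in for feedback data."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


# Usage: lower a 200 Hz stand-in for back-channel feedback to 150 Hz.
feedback = sine(200.0, 8000, 800)
lowered = resample_pitch_shift(feedback, 0.75)
```

The ratio passed in would come from comparing the analyzed pitch of the remark's word ending with the pitch of the chosen segment of the feedback data.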
16. A voice synthesis system comprising a coding/decoding device and a host computer, said coding/decoding device comprising:
an A/D converter configured to convert an input voice signal of a remark into a digital signal;
a pitch analysis section configured to analyze a pitch of a first segment of the remark based on the digital signal, wherein the first segment is a word ending of the remark;
a back-channel feedback acquisition section configured to, when back-channel feedback is to be returned to the remark, acquire back-channel feedback data corresponding to a meaning of the remark;
a pitch control section configured to shift pitches of the entire back-channel feedback data by a same amount so that a second segment of the back-channel feedback data has a pitch associated with the analyzed pitch of the first segment, wherein the second segment is a word beginning or word ending of the back-channel feedback data; and
a D/A converter configured to convert the pitch-shifted back-channel feedback data into an analog signal,
wherein said host computer is configured to, when replying voice other than the back-channel feedback is to be returned to the remark, acquire replying voice data, responsive to the remark, in accordance with the digital signal converted by said A/D converter and return the acquired replying voice data to said coding/decoding device,
wherein said pitch control section is further configured to shift pitches of the replying voice data by a same amount so that a third segment of the replying voice data returned from the host computer has a pitch associated with the analyzed pitch of the first segment, and
wherein said D/A converter is further configured to convert the pitch-shifted replying voice data, into an analog signal.
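The division of work in claim 16, where the device answers with back-channel feedback itself and defers other replies to the host computer, can be sketched as a simple routing function. This is only an illustration of the data flow: the function and parameter names are hypothetical, and how the system decides that back-channel feedback is appropriate is left outside the sketch.

```python
from typing import Callable, List

Samples = List[float]


def respond(
    remark: Samples,
    needs_backchannel: bool,
    local_feedback: Callable[[Samples], Samples],
    host_reply: Callable[[Samples], Samples],
    pitch_control: Callable[[Samples, Samples], Samples],
) -> Samples:
    """Pick the reply source, then apply the same pitch control to either kind of reply.

    local_feedback: stands in for the device's back-channel feedback acquisition.
    host_reply:     stands in for the round trip to the host computer.
    pitch_control:  takes (reply, remark) and returns the pitch-shifted reply.
    """
    source = local_feedback if needs_backchannel else host_reply
    reply = source(remark)
    return pitch_control(reply, remark)


# Usage with stand-in callables (all hypothetical).
out = respond(
    remark=[0.0, 0.1, 0.2],
    needs_backchannel=True,
    local_feedback=lambda r: [1.0],
    host_reply=lambda r: [2.0],
    pitch_control=lambda reply, r: [x * 0.5 for x in reply],
)
```

Keeping pitch control on the device side means both reply paths pass through the same shifting step before D/A conversion.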
17. A method comprising:
converting, by means of an A/D converter, an input voice signal of a remark into a digital signal;
analyzing, by means of a processor, a pitch of a first segment of the remark based on the digital signal, wherein the first segment is a word ending of the remark;
acquiring, by means of the processor, back-channel feedback data corresponding to a meaning of the remark, when back-channel feedback is to be returned to the remark;
shifting, by means of the processor, pitches of the entire back-channel feedback data by a same amount so that a second segment of the back-channel feedback data has a pitch associated with the analyzed pitch of the first segment, wherein the second segment is a word beginning or word ending of the back-channel feedback data; and
converting, by means of a D/A converter, the pitch-shifted back-channel feedback data into an analog signal.
Specification