INTELLIGENT HUMAN-MACHINE CONVERSATION FRAMEWORK WITH SPEECH-TO-TEXT AND TEXT-TO-SPEECH
2 Assignments
0 Petitions
Abstract
A method, computer-readable medium, and system including a speech-to-text module to receive an input of speech including one or more words generated by a human and to output data including text, sentiment information, and other parameters corresponding to the speech input; a processing module, such as an artificial intelligence module, to generate a reply to the speech input, the reply including a textual component, sentimental information associated with the textual component, and contextual information associated with the textual component; and a text-to-speech module to receive the textual component, sentimental information, and contextual information and to generate, based on the received textual component and its associated sentimental information and contextual information, a speech output including one or more spoken words, the spoken words to be presented with at least one of a pace, a tone, a volume, and an emphasis representative of the sentimental information and contextual information associated with the textual component.
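The data flow summarized in the abstract can be illustrated with a minimal sketch. Everything named here (`SpeechInput`, `Reply`, the three functions, the sentiment labels, and the reply rule) is hypothetical and chosen only for illustration; the stubs operate on an already-transcribed utterance and return rendering parameters rather than audio, so this is not the patented implementation.

```python
from dataclasses import dataclass, field


@dataclass
class SpeechInput:
    """Data a speech-to-text module might output (hypothetical shape)."""
    text: str
    sentiment: str  # assumed labels such as "angry" or "neutral"
    other_params: dict = field(default_factory=dict)


@dataclass
class Reply:
    """Reply a processing module might generate (hypothetical shape)."""
    text: str
    sentiment: str
    context: dict = field(default_factory=dict)


def speech_to_text(transcript: str, detected_sentiment: str) -> SpeechInput:
    # Stand-in for a real recognizer: the utterance is already transcribed.
    return SpeechInput(
        text=transcript,
        sentiment=detected_sentiment,
        other_params={"word_count": len(transcript.split())},
    )


def generate_reply(speech: SpeechInput) -> Reply:
    # Toy rule: answer an upset speaker in a calming register.
    if speech.sentiment == "angry":
        return Reply("I am sorry about that. Let me help.", "calm", {"urgency": "high"})
    return Reply("Sure, I can help with that.", "neutral", {"urgency": "normal"})


def text_to_speech(reply: Reply) -> dict:
    # Stand-in for synthesis: returns how the words would be spoken.
    calm = reply.sentiment == "calm"
    return {
        "words": reply.text,
        "pace": "slow" if calm else "medium",
        "volume": "soft" if calm else "medium",
        "urgency": reply.context.get("urgency", "normal"),
    }


spoken = text_to_speech(generate_reply(speech_to_text("My order never arrived", "angry")))
print(spoken)
```

The point of the sketch is the hand-off: sentiment travels with the text at every stage, so the final stage can choose pace and volume from it instead of from the words alone.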
25 Citations
18 Claims
1. A system comprising:
a speech-to-text module to receive an input of speech including one or more words generated by a human and to output data including text, sentiment information, and other parameters information corresponding to the speech input;
a processing module to generate a reply to the speech input, the reply including a textual component, sentimental information associated with the textual component, and contextual information associated with the textual component; and
a text-to-speech module to receive the textual component, sentimental information, and contextual information of the reply and to generate, based on the received textual component and its associated sentimental information and contextual information of the reply, a speech output including one or more spoken words, the spoken words to be presented with at least one of a pace, a tone, a volume, an urgency, a rate, an accent pattern, and an emphasis representative of the sentimental information and contextual information associated with the textual component of the reply.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
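One way the text-to-speech element of claim 1 could present words with a pace, tone, volume, and emphasis is to express them as SSML markup. The claim does not mention SSML; it is an assumption made here because it is a widely supported W3C format. The sentiment labels, the lookup table, and the `urgent` context key are also hypothetical.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass(frozen=True)
class Prosody:
    rate: str      # SSML prosody rate: x-slow, slow, medium, fast, x-fast
    pitch: str     # SSML prosody pitch: x-low, low, medium, high, x-high
    volume: str    # SSML prosody volume: silent, x-soft, soft, medium, loud, x-loud
    emphasis: str  # SSML emphasis level: strong, moderate, none, reduced


# Hypothetical mapping from sentiment labels to delivery settings.
PROSODY_BY_SENTIMENT = {
    "calm": Prosody("slow", "low", "soft", "reduced"),
    "excited": Prosody("fast", "high", "loud", "strong"),
    "neutral": Prosody("medium", "medium", "medium", "none"),
}


def to_ssml(text: str, sentiment: str, context: dict) -> str:
    """Wrap reply text in SSML that reflects its sentiment and context."""
    base = PROSODY_BY_SENTIMENT.get(sentiment, PROSODY_BY_SENTIMENT["neutral"])
    # Contextual information may override the sentiment default:
    # an urgent reply is spoken faster regardless of sentiment.
    rate = "fast" if context.get("urgent") else base.rate
    return (
        '<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
        f'<prosody rate="{rate}" pitch="{base.pitch}" volume="{base.volume}">'
        f'<emphasis level="{base.emphasis}">{escape(text)}</emphasis>'
        "</prosody></speak>"
    )


ssml = to_ssml("Please leave now & stay safe", "calm", {"urgent": True})
print(ssml)
```

Keeping the mapping in a table separates the policy (which sentiment sounds like what) from the rendering, so either can change without touching the other.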
10. A computer-implemented method, the method comprising:
receiving, by a processing module, speech input data derived from speech including one or more words generated by a human, the speech input data including text, sentiment information, and other parameters information corresponding to the speech;
generating, by a processing module, a reply to the speech input data, the reply including a textual component, sentimental information associated with the textual component, and contextual information associated with the textual component; and
transmitting, by a processing module, the textual component, sentimental information, and contextual information of the reply for the generation of, based on the textual component and its associated sentimental information and contextual information of the reply, a speech output including one or more spoken words, the spoken words to be presented with at least one of a pace, a tone, a volume, an urgency, a rate, an accent pattern, and an emphasis representative of the sentimental information and contextual information associated with the textual component of the reply.
Dependent claims: 11, 12, 13, 14, 15
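The three steps of the method claim (receive, generate, transmit) can be sketched from the processing module's side. The function name, the dictionary keys, the JSON wire format, and the reply rule are all assumptions for illustration; the claim does not specify how the reply is encoded or sent.

```python
import json
from typing import Callable


def handle_speech_input(speech_input: dict, send: Callable[[str], None]) -> None:
    """Receive speech input data, generate a reply, and transmit it."""
    # Step 1: receive text, sentiment, and other parameters.
    text = speech_input["text"]
    sentiment = speech_input["sentiment"]
    other = speech_input.get("other_params", {})

    # Step 2: generate a reply carrying its own sentiment (toy rule).
    if sentiment == "sad":
        reply = {"text": "I am sorry to hear that.", "sentiment": "sympathetic"}
    else:
        reply = {"text": f"You said: {text}", "sentiment": "neutral"}
    # Contextual information attached to the textual component.
    reply["context"] = {"language": other.get("language", "en"), "in_reply_to": text}

    # Step 3: transmit all three parts together for speech generation.
    send(json.dumps(reply))


# A plain list stands in for the channel to the text-to-speech side.
outbox: list = []
handle_speech_input(
    {"text": "I lost my keys", "sentiment": "sad", "other_params": {"language": "en"}},
    outbox.append,
)
print(outbox[0])
```

Passing `send` as a callable keeps the sketch independent of any transport; a queue, socket, or HTTP client could be substituted.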
16. A non-transitory computer readable medium having processor-executable instructions stored thereon, the medium comprising:
instructions to receive speech input data derived from speech including one or more words generated by a human, the speech input data including text, sentiment information, and other parameters information corresponding to the speech;
instructions to generate a reply to the speech input data, the reply including a textual component, sentimental information associated with the textual component, and contextual information associated with the textual component; and
instructions to transmit the textual component, sentimental information, and contextual information of the reply for the generation of, based on the textual component and its associated sentimental information and contextual information of the reply, a speech output including one or more spoken words, the spoken words to be presented with at least one of a pace, a tone, a volume, an urgency, a rate, an accent pattern, and an emphasis representative of the sentimental information and contextual information associated with the textual component of the reply.
Dependent claims: 17, 18
Specification