System, method and article of manufacture for detecting emotion in voice signals through analysis of a plurality of voice signal parameters
Abstract
A method and system are provided for monitoring a conversation between a pair of speakers and detecting an emotion of at least one of the speakers. First, a voice signal is received, after which a particular feature is extracted from the voice signal. Next, an emotion associated with the voice signal is determined based on the extracted feature. The emotion is screened, and feedback is provided only if it is determined to be a negative emotion selected from the group consisting of anger, sadness, and fear. The determined negative emotion is then output to a third party during the conversation.
20 Claims
1. A method for monitoring a conversation between a pair of speakers for detecting an emotion of at least one of the speakers using voice analysis, comprising the steps of:

(a) receiving a voice signal representing voices of speakers in a conversation;

(b) extracting at least one feature of the voice signal selected from a group of features consisting of a maximum value of a fundamental frequency, a standard deviation of the fundamental frequency, a range of the fundamental frequency, a mean of the fundamental frequency, a mean of a bandwidth of a first formant, a mean of a bandwidth of a second formant, a standard deviation of energy, a speaking rate, a slope of the fundamental frequency, a maximum value of the first formant, a maximum value of the energy, a range of the energy, a range of the second formant, and a range of the first formant;

(c) determining an emotion associated with the voice signal based on the extracted feature;

(d) determining whether the emotion matches a negative emotion selected from a predefined group of negative emotions consisting of anger, sadness and fear; and

(e) outputting the determined emotion to a third party during the conversation if the emotion matches one of the negative emotions.

Dependent claims: 2, 3, 4, 5, 6, 7, 8
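Step (b)'s fundamental-frequency and energy statistics can be sketched as follows, assuming pitch tracking and energy estimation have already produced per-frame contours upstream (the function name, `frame_rate`, and the input contours are illustrative assumptions, not part of the claim; formant-based features would require a separate formant tracker and are omitted):

```python
import numpy as np

def extract_features(f0, energy, frame_rate=100.0):
    """Compute several of the claimed statistics from a voiced-frame
    fundamental-frequency contour (Hz) and a per-frame energy envelope."""
    f0 = np.asarray(f0, dtype=float)
    energy = np.asarray(energy, dtype=float)
    t = np.arange(len(f0)) / frame_rate  # frame times in seconds
    return {
        "f0_max":       f0.max(),
        "f0_std":       f0.std(),
        "f0_range":     f0.max() - f0.min(),
        "f0_mean":      f0.mean(),
        "f0_slope":     np.polyfit(t, f0, 1)[0],  # linear trend, Hz/s
        "energy_std":   energy.std(),
        "energy_max":   energy.max(),
        "energy_range": energy.max() - energy.min(),
    }
```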
9. A computer program embodied on a computer readable medium for monitoring a conversation between a pair of speakers for detecting an emotion of at least one of the speakers using voice analysis comprising:
(a) a code segment that receives a voice signal representing voices of speakers in a conversation;

(b) a code segment that extracts at least one feature of the voice signal selected from a group of features consisting of a maximum value of a fundamental frequency, a standard deviation of the fundamental frequency, a range of the fundamental frequency, a mean of the fundamental frequency, a mean of a bandwidth of a first formant, a mean of a bandwidth of a second formant, a standard deviation of energy, a speaking rate, a slope of the fundamental frequency, a maximum value of the first formant, a maximum value of the energy, a range of the energy, a range of the second formant, and a range of the first formant;

(c) a code segment that determines an emotion associated with the voice signal based on the extracted feature;

(d) a code segment that determines whether the emotion matches a negative emotion selected from a predefined group of negative emotions consisting of anger, sadness and fear; and

(e) a code segment that outputs the determined emotion to a third party during the conversation if the emotion matches one of the negative emotions.

Dependent claims: 10, 11, 12, 13, 14
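The claims do not fix a particular classifier for the determining step (c). As one minimal sketch, the extracted feature vector can be matched to the nearest per-emotion centroid; the centroid values below are invented placeholders for illustration, not trained values from the patent:

```python
import math

# Placeholder centroids over (f0_mean in Hz, energy_std); invented
# for illustration, not values from the patent.
CENTROIDS = {
    "neutral":   (120.0, 0.5),
    "happiness": (200.0, 2.0),
    "anger":     (230.0, 3.0),
    "sadness":   (100.0, 0.3),
    "fear":      (250.0, 1.5),
}

def determine_emotion(feature_vector):
    """Step (c): label the utterance with the emotion whose centroid
    lies nearest (Euclidean distance) to the extracted features."""
    return min(CENTROIDS,
               key=lambda label: math.dist(feature_vector, CENTROIDS[label]))
```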
15. A system for monitoring a conversation between a pair of speakers for detecting an emotion of at least one of the speakers using voice analysis comprising:
(a) logic that receives a voice signal representing voices of speakers in a conversation;

(b) logic that extracts at least one feature of the voice signal selected from a group of features consisting of a maximum value of a fundamental frequency, a standard deviation of the fundamental frequency, a range of the fundamental frequency, a mean of the fundamental frequency, a mean of a bandwidth of a first formant, a mean of a bandwidth of a second formant, a standard deviation of energy, a speaking rate, a slope of the fundamental frequency, a maximum value of the first formant, a maximum value of the energy, a range of the energy, a range of the second formant, and a range of the first formant;

(c) logic that determines an emotion associated with the voice signal based on the extracted feature;

(d) logic that determines whether the emotion matches a negative emotion selected from a predefined group of negative emotions consisting of anger, sadness and fear; and

(e) logic that outputs the determined emotion to a third party during the conversation if the emotion matches one of the negative emotions.

Dependent claims: 16, 17, 18, 19, 20
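Steps (d) and (e) reduce to a membership test against the closed group of negative emotions, with output suppressed otherwise. A minimal sketch, where the `notify` callback standing in for the third-party channel is an assumed interface:

```python
# The claimed closed group of negative emotions.
NEGATIVE_EMOTIONS = frozenset({"anger", "sadness", "fear"})

def screen_and_output(emotion, notify):
    """Steps (d)-(e): forward the determined emotion to a third party
    (via the caller-supplied notify callback) only when it matches one
    of the negative emotions; otherwise no feedback is given."""
    if emotion in NEGATIVE_EMOTIONS:
        notify(emotion)
        return True
    return False
```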
Specification