Emotion detecting method, emotion detecting apparatus, emotion detecting program that implements the same method, and storage medium that stores the same program
Abstract
An audio feature is extracted from audio signal data for each analysis frame and stored in a storage part. Then, the audio feature is read from the storage part, and an emotional state probability of the audio feature corresponding to an emotional state is calculated using one or more statistical models constructed based on previously input learning audio signal data. Then, based on the calculated emotional state probability, the emotional state of a section including the analysis frame is determined.
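The extractor described above produces a per-frame feature vector covering, among other things, the fundamental frequency and the power. A minimal sketch of that per-frame extraction, assuming 16 kHz audio and a naive autocorrelation pitch estimate (the function name, frame sizes, and pitch-search range are illustrative, not taken from the patent):

```python
import numpy as np

def extract_features(signal, sr=16000, frame_len=400, hop=160):
    """Per-frame audio feature vector: [F0 estimate, log power].

    Naive autocorrelation pitch tracker; a hypothetical stand-in for the
    claimed extractor (which also covers temporal variation characteristics
    of F0, power, and speech rate).
    """
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        power = np.log(np.mean(frame ** 2) + 1e-10)        # log frame power
        ac = np.correlate(frame, frame, mode="full")[frame_len - 1:]
        lo, hi = sr // 400, sr // 50                       # search 50-400 Hz
        lag = lo + int(np.argmax(ac[lo:hi]))
        f0 = sr / lag                                      # F0 from best lag
        feats.append((f0, power))
    return np.array(feats)                                 # shape (n_frames, 2)
```

Sequences of temporal variation characteristics (also claimed) could then be obtained by differencing these per-frame values across frames.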
21 Claims
1. An emotion detecting method that performs an emotion detecting processing based on an audio feature of input audio signal data, comprising:
an audio feature extracting step of extracting, as an audio feature vector, one or more of a fundamental frequency, a sequence of a temporal variation characteristic of the fundamental frequency, a power, a sequence of a temporal variation characteristic of the power, and a temporal variation characteristic of a speech rate from the audio signal data for each analysis frame, and storing the audio feature vector in a storage part;

an audio feature appearance probability calculating step of reading the audio feature vector for each analysis frame and calculating the audio feature appearance probability that the audio feature vector appears on condition of sequences of predetermined emotional states corresponding to one or more types of emotions using a first statistical model constructed based on previously input learning audio signal data;

an emotional state transition probability calculating step of calculating the probability of temporal transition of sequences of the predetermined emotional states as the emotional state transition probability using a second statistical model;

an emotional state probability calculating step of calculating the emotional state probability based on the audio feature appearance probability and the emotional state transition probability; and

an information outputting step of outputting information about the emotional state for each section including one or more analysis frames based on the calculated emotional state probability.

(Dependent claims: 4, 5, 8, 9, 10, 11)
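The calculating steps of claim 1 amount to an HMM-style forward computation: a first model supplies the per-frame appearance (emission) probability of the feature vector given an emotional state, a second model supplies the transition probability between emotional states, and the two are combined into a per-frame emotional state probability. A hedged sketch under that reading (the patent does not fix the model family, and the normalized forward recursion here is one plausible combination rule):

```python
import numpy as np

def forward_state_probs(emission, transition, prior):
    """Combine per-frame emission probabilities with state transition
    probabilities into per-frame emotional state probabilities.

    emission:   (T, S) array, P(feature_t | state s) from the first model
    transition: (S, S) row-stochastic array from the second model
    prior:      (S,) initial state distribution (an assumption; the claim
                does not specify one)
    Returns a (T, S) array whose rows are normalized state probabilities.
    """
    T, S = emission.shape
    alpha = np.zeros((T, S))
    alpha[0] = prior * emission[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        # propagate through the transition model, weight by the emission
        alpha[t] = (alpha[t - 1] @ transition) * emission[t]
        alpha[t] /= alpha[t].sum()          # keep each row a distribution
    return alpha
```

The information outputting step could then report, for each section of frames, the states whose probability exceeds a threshold, or simply the most probable state per frame.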
2. An emotion detecting method that performs an emotion detecting processing based on an audio feature of input audio signal data, comprising:
an audio feature extracting step of extracting, as an audio feature vector, one or more of a fundamental frequency, a sequence of a temporal variation characteristic of the fundamental frequency, a power, a sequence of a temporal variation characteristic of the power, and a temporal variation characteristic of a speech rate from the audio signal data for each analysis frame, and storing the audio feature vector in a storage part;

an emotional state probability processing step of reading the audio feature vector for each analysis frame and calculating the emotional state probability on condition of the audio feature vector for sequences of predetermined emotional states corresponding to one or more types of emotions using one or more statistical models constructed based on previously input learning audio signal data;

an emotional state determining step of determining the emotional state of a section including the analysis frame based on the emotional state probability; and

a step of outputting information about the determined emotional state.

(Dependent claims: 3, 6, 7)
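Claim 2's determining step fixes a single emotional state for a section spanning one or more analysis frames. The claim leaves the aggregation rule open; one plausible choice, assumed here purely for illustration, is to average the frame-wise state probabilities over the section and take the most probable label:

```python
import numpy as np

def decide_section_emotion(frame_probs, labels):
    """frame_probs: (T, S) per-frame emotional state probabilities for one
    section; labels: the S emotional state names.

    Averages over the section's frames and returns the most probable label.
    The mean-then-argmax rule is an assumption, not taken from the claim.
    """
    return labels[int(np.mean(frame_probs, axis=0).argmax())]
```

For example, a section whose frames lean mostly toward one state would be labeled with that state even if a minority of frames disagree.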
12. An emotion detecting apparatus that performs an emotion detecting processing based on an audio feature of input audio signal data, comprising:
an audio feature extracting means for extracting, as an audio feature vector, one or more of a fundamental frequency, a sequence of a temporal variation characteristic of the fundamental frequency, a power, a sequence of a temporal variation characteristic of the power, and a temporal variation characteristic of a speech rate from the audio signal data for each analysis frame, and storing the audio feature vector in a storage part;

an audio feature appearance probability calculating means for reading the audio feature vector for each analysis frame and calculating the audio feature appearance probability that the audio feature vector appears on condition of sequences of predetermined emotional states corresponding to one or more types of emotions using a first statistical model constructed based on previously input learning audio signal data;

an emotional state transition probability calculating means for calculating the probability of temporal transition of sequences of the predetermined emotional states as the emotional state transition probability using a second statistical model;

an emotional state probability calculating means for calculating the emotional state probability based on the audio feature appearance probability and the emotional state transition probability; and

an information outputting means for outputting information about the emotional state for each section including one or more analysis frames based on the calculated emotional state probability.

(Dependent claims: 15, 16, 19, 20, 21)
13. An emotion detecting apparatus that performs an emotion detecting processing based on an audio feature of input audio signal data, comprising:
an audio feature extracting means for extracting, as an audio feature vector, one or more of a fundamental frequency, a sequence of a temporal variation characteristic of the fundamental frequency, a power, a sequence of a temporal variation characteristic of the power, and a temporal variation characteristic of a speech rate from the audio signal data for each analysis frame, and storing the audio feature vector in a storage part;

an emotional state probability processing means for reading the audio feature vector for each analysis frame and calculating the emotional state probability on condition of the audio feature vector for sequences of predetermined emotional states corresponding to one or more types of emotions using one or more statistical models constructed based on previously input learning audio signal data;

an emotional state determining means for determining the emotional state of a section including the analysis frame based on the emotional state probability; and

an information outputting means for outputting information about the determined emotional state.

(Dependent claims: 14, 17, 18)
Specification