AUDIO ANALYSIS SYSTEM, AUDIO ANALYSIS APPARATUS, AUDIO ANALYSIS TERMINAL
Abstract
An audio analysis system includes a terminal apparatus and a host system. The terminal apparatus acquires an audio signal of a sound containing utterances of a user and another person, discriminates between portions of the audio signal corresponding to the utterances of the user and the other person, detects an utterance feature based on the portion corresponding to the utterance of the user or the other person, and transmits utterance information including the discrimination and detection results to the host system. The host system detects a part corresponding to a conversation from the received utterance information, detects portions of the part of the utterance information corresponding to the user and the other person, compares a combination of plural utterance features corresponding to the portions of the part of the utterance information of the user and the other person with relation information to estimate an emotion, and outputs estimation information.
25 Claims
1. An audio analysis system comprising:
- a terminal apparatus that is to be worn by a user; and
- a host system that acquires information from the terminal apparatus,
wherein the terminal apparatus includes
- a first audio acquisition device that acquires a sound and converts the sound into a first audio signal, the sound containing an utterance of the user and an utterance of another person who is different from the user,
- a discriminator that discriminates between a portion that corresponds to the utterance of the user and a portion that corresponds to the utterance of the other person which are contained in the first audio signal,
- an utterance feature detector that detects an utterance feature of the user or the other person, on the basis of the portion that corresponds to the utterance of the user or the portion that corresponds to the utterance of the other person, and
- a transmission unit that transmits to the host system utterance information that contains at least a discrimination result obtained by the discriminator and a detection result obtained by the utterance feature detector, and
wherein the host system includes
- a reception unit that receives the utterance information that has been transmitted from the transmission unit,
- a conversation information detector that detects a part corresponding to a first conversation between the user and the other person from the utterance information that has been received by the reception unit, and detects portions of the part of the utterance information that correspond to the user and the other person who are related to the first conversation,
- a relation information holding unit that holds relation information on a relation between a predetermined emotion name and a combination of a plurality of the utterance features of a plurality of speakers who participated in a past conversation,
- an emotion estimator that compares, with the relation information, a combination of a plurality of the utterance features that correspond to the portions of the part of the utterance information of the user and the other person who are related to the first conversation, and estimates an emotion of at least one of the user and the other person, and
- an output unit that outputs information that is based on an estimation result obtained by the emotion estimator.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
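The comparison performed by the emotion estimator in claim 1 can be illustrated with a minimal Python sketch. The claim does not say how utterance features are encoded or how the comparison is made, so everything concrete here is an assumption: the two-element feature tuples, the emotion names and values in `RELATION_INFO`, and the choice of a nearest-neighbour match by Euclidean distance are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from math import dist
from typing import Sequence, Tuple


@dataclass(frozen=True)
class RelationEntry:
    """One row of hypothetical relation information: an emotion name tied to
    a combination of utterance features of two past speakers."""
    emotion: str
    user_features: Tuple[float, ...]
    other_features: Tuple[float, ...]


# Hypothetical table; the names and numbers are placeholders, not patent data.
RELATION_INFO = [
    RelationEntry("delight", (0.8, 0.7), (0.7, 0.6)),
    RelationEntry("anger", (0.9, 0.2), (0.3, 0.1)),
    RelationEntry("sorrow", (0.2, 0.2), (0.2, 0.3)),
]


def estimate_emotion(
    user_features: Sequence[float],
    other_features: Sequence[float],
    relation_info: Sequence[RelationEntry] = RELATION_INFO,
) -> str:
    """Return the emotion name whose stored feature combination is closest
    (Euclidean distance, an assumed metric) to the observed combination."""
    observed = tuple(user_features) + tuple(other_features)
    best = min(
        relation_info,
        key=lambda e: dist(observed, e.user_features + e.other_features),
    )
    return best.emotion


if __name__ == "__main__":
    print(estimate_emotion((0.85, 0.65), (0.7, 0.6)))
```

The point of the sketch is that the lookup key is the combination of both speakers' features for one conversation, not either speaker's features alone.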
11. An audio analysis system comprising:
- a first terminal apparatus that is to be worn by a first user;
- a second terminal apparatus that is to be worn by a second user; and
- a host system that acquires information from the first terminal apparatus and the second terminal apparatus,
wherein the first terminal apparatus includes
- a first audio acquisition device that acquires a sound and converts the sound into a first audio signal, the sound containing an utterance of the first user and an utterance of another person who is different from the first user,
- a first discriminator that discriminates between a portion that corresponds to the utterance of the first user and a portion that corresponds to the utterance of the other person which are contained in the first audio signal,
- a first utterance feature detector that detects a first utterance feature of the first user, on the basis of the portion that corresponds to the utterance of the first user or the portion that corresponds to the utterance of the other person which is contained in the first audio signal, and
- a first transmission unit that transmits to the host system first utterance information that contains at least a discrimination result obtained by the first discriminator and a detection result regarding the first utterance feature obtained by the first utterance feature detector,
wherein the second terminal apparatus includes
- a second audio acquisition device that acquires a sound and converts the sound into a second audio signal,
- a second discriminator that discriminates between a portion that corresponds to an utterance of the second user and a portion that corresponds to an utterance of another person who is different from the second user, the portions being contained in the second audio signal,
- a second utterance feature detector that detects a second utterance feature of the second user, on the basis of the portion that corresponds to the utterance of the second user or the portion that corresponds to the utterance of the other person which is contained in the second audio signal, and
- a second transmission unit that transmits to the host system second utterance information that contains at least a discrimination result obtained by the second discriminator and a detection result regarding the second utterance feature obtained by the second utterance feature detector, and
wherein the host system includes
- a reception unit that receives the first utterance information and the second utterance information that have been transmitted from the first and second transmission units, respectively,
- a conversation information detector that detects a first part corresponding to a first conversation between the first user and the other person who is different from the first user from the first utterance information that has been received by the reception unit, and detects portions of the first part of the first utterance information that correspond to the first user and the other person who are related to the first conversation, and that detects a second part corresponding to a second conversation between the second user and the other person who is different from the second user from the second utterance information that has been received by the reception unit, and detects portions of the second part of the second utterance information that correspond to the second user and the other person who are related to the second conversation, wherein the conversation information detector determines whether or not the first conversation and the second conversation are the same conversation between the first user and the second user on the basis of a comparison of the portions of the first part of the first utterance information that correspond to the first user and the other person who is different from the first user with the portions of the second part of the second utterance information that correspond to the second user and the other person who is different from the second user,
- a relation information holding unit that holds relation information on a relation between a predetermined emotion name and a combination of a plurality of utterance features of a plurality of speakers who participated in a past conversation,
- an emotion estimator that compares, with the relation information, a combination of the first and second utterance features related to the conversation between the first user and the second user, and estimates an emotion of at least one of the first user and the second user, and
- an output unit that outputs information that is based on an estimation result obtained by the emotion estimator.
Dependent claims: 12, 13, 14
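The same-conversation determination in claim 11 can be sketched in Python as follows. The claim only requires a comparison of the user and other-person portions reported by the two terminals; the specifics here are assumptions: portions are represented as `(start, end)` time intervals in seconds on a shared clock, intervals within one list do not overlap each other, and the 0.8 `threshold` is an arbitrary illustrative value. The idea is that if the two wearers are talking to each other, the first terminal's "user" portions should coincide with the second terminal's "other" portions, and vice versa.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_s, end_s), assumed shared time base


def overlap(a: List[Interval], b: List[Interval]) -> float:
    """Total time in seconds during which an interval in a coincides with one in b."""
    total = 0.0
    for a_start, a_end in a:
        for b_start, b_end in b:
            total += max(0.0, min(a_end, b_end) - max(a_start, b_start))
    return total


def duration(intervals: List[Interval]) -> float:
    """Summed length in seconds of all intervals."""
    return sum(end - start for start, end in intervals)


def same_conversation(
    first_self: List[Interval],
    first_other: List[Interval],
    second_self: List[Interval],
    second_other: List[Interval],
    threshold: float = 0.8,  # hypothetical cut-off
) -> bool:
    """True when the speech pattern seen by the first terminal is mirrored
    by the second terminal for at least `threshold` of the first's speech time."""
    total = duration(first_self) + duration(first_other)
    if total == 0:
        return False
    mirrored = overlap(first_self, second_other) + overlap(first_other, second_self)
    return mirrored / total >= threshold


if __name__ == "__main__":
    print(same_conversation(
        [(0.0, 2.0), (5.0, 7.0)], [(2.0, 5.0)],
        [(2.1, 4.9)], [(0.0, 2.0), (5.1, 7.0)],
    ))
```

A real system would also need to tolerate clock drift and detection gaps between terminals; the sketch ignores both.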
15. An audio analysis apparatus comprising:
- an acquisition unit that acquires information on an utterance feature which is detected on the basis of an audio signal of a sound containing an utterance of a speaker;
- a relation information holding unit that holds relation information on a relation between a predetermined emotion name and a plurality of utterance features corresponding to a plurality of parts of utterance information of the speaker;
- an emotion estimator that compares, with the relation information, a plurality of utterance features of the speaker related to a specific conversation, and estimates an emotion of the speaker; and
- an output unit that outputs information that is based on an estimation result obtained by the emotion estimator.
Dependent claims: 16, 17
18. An audio analysis terminal comprising:
- a first audio acquisition device that acquires a sound and converts the sound into a first audio signal, the sound containing an utterance of a user and an utterance of another person who is different from the user;
- a discriminator that discriminates between a portion that corresponds to the utterance of the user and a portion that corresponds to the utterance of the other person which are contained in the first audio signal;
- an utterance feature detector that detects an utterance feature of the user or the other person, on the basis of the portion that corresponds to the utterance of the user or the portion that corresponds to the utterance of the other person; and
- a transmission unit that transmits to a host system utterance information that contains at least a discrimination result obtained by the discriminator and a detection result obtained by the utterance feature detector.
Dependent claims: 19, 20, 21, 22, 23, 24, 25
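The utterance information that the terminal in claim 18 transmits can be pictured as a small record that carries both the discrimination result and the detection result. The Python sketch below is illustrative only: the `UtteranceInfo` field names, the `"user"`/`"other"` labels, the feature keys, and the JSON encoding are all assumptions, since the claim does not specify a data format or a transport.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass
class UtteranceInfo:
    """Hypothetical payload for one detected utterance portion."""
    terminal_id: str
    start_s: float
    end_s: float
    speaker: str                # discrimination result: "user" or "other"
    features: Dict[str, float]  # detection result, e.g. loudness or pitch

    def __post_init__(self) -> None:
        if self.speaker not in ("user", "other"):
            raise ValueError("speaker must be 'user' or 'other'")
        if self.end_s < self.start_s:
            raise ValueError("end_s must not precede start_s")


def encode(info: UtteranceInfo) -> bytes:
    """Serialize a record on the terminal side (JSON is an assumed format)."""
    return json.dumps(asdict(info), sort_keys=True).encode("utf-8")


def decode(payload: bytes) -> UtteranceInfo:
    """Rebuild the record on the host side."""
    return UtteranceInfo(**json.loads(payload.decode("utf-8")))


if __name__ == "__main__":
    record = UtteranceInfo("terminal-A", 12.0, 14.5, "user", {"loudness": 0.8})
    print(decode(encode(record)) == record)
```

Under this layout only the labels and features leave the terminal, which matches the claim's wording that the transmitted information contains "at least" the two results.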
Specification