Method for detecting emotions involving subspace specialists
Abstract
To detect and determine a current emotional state (CES) of a human being from spoken speech input (SI), a method for detecting emotions is suggested in which first and second feature classes (A, E) are identified with, in particular distinct, dimensions of an underlying emotional manifold (EM) or emotional space (ES) and/or with subspaces thereof.
10 Claims
1. A method for detecting emotions from speech input comprising:

evaluating, deriving and/or extracting at least a first feature class and a second feature class of features from a given speech input, wherein the first feature class does not include features of the second feature class and the second feature class does not include features of the first feature class, and the first feature class includes prosodic features and the second feature class includes voice quality features;

associating said first and second feature classes with dimensions of an underlying emotional space including a first dimension of activation or arousal and a second dimension of evaluation or pleasure, respectively;

using for each dimension of the underlying emotional space a specialized classifier system, each of which is configured to classify features of the feature class associated with that dimension, wherein the specialized classifier systems operate independently of one another and each uses as input only features of its respectively assigned feature class; and

combining outputs of said specialized classifier systems for each feature class to form a global classifier system configured to output a current emotional state.

Dependent claims: 2, 3, 4.
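Read as an architecture, claim 1 maps each feature class to its own specialist and fuses the two per-dimension decisions into one emotional state. A minimal toy sketch in Python (the feature names, weights, and thresholds below are hypothetical illustrations; the claim prescribes the structure, not any concrete classifier):

```python
def classify_arousal(prosodic_features):
    """Specialized classifier for the activation/arousal dimension.
    Sees ONLY prosodic features (here: pitch, energy, speech rate)."""
    pitch, energy, rate = prosodic_features
    score = 0.5 * pitch + 0.3 * energy + 0.2 * rate  # toy linear model
    return "high" if score > 0.5 else "low"

def classify_valence(voice_quality_features):
    """Specialized classifier for the evaluation/pleasure dimension.
    Sees ONLY voice quality features (here: jitter, shimmer, HNR)."""
    jitter, shimmer, hnr = voice_quality_features
    score = 0.6 * hnr - 0.2 * jitter - 0.2 * shimmer  # toy linear model
    return "positive" if score > 0.0 else "negative"

# Global classifier: combine the two independent per-dimension outputs
# into a current emotional state (quadrants of the emotional space).
EMOTION_MAP = {
    ("high", "positive"): "joy",
    ("high", "negative"): "anger",
    ("low", "positive"): "contentment",
    ("low", "negative"): "sadness",
}

def detect_emotion(prosodic_features, voice_quality_features):
    # Each specialist receives only its own feature class, as claimed.
    arousal = classify_arousal(prosodic_features)
    valence = classify_valence(voice_quality_features)
    return EMOTION_MAP[(arousal, valence)]

print(detect_emotion((0.8, 0.7, 0.6), (0.1, 0.1, 0.5)))  # prints "joy"
```

Note how the two specialists never share inputs: this is the mutual exclusivity of the feature classes that the claim insists on.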
5. A method of detecting emotions, comprising:
providing a first feature class and a second feature class of features of a speech input, wherein said first feature class comprises prosodic features and corresponds to a first dimension of an emotional space, and wherein said second feature class comprises voice quality features and corresponds to a second dimension of said emotional space, wherein the first feature class does not include features of the second feature class and the second feature class does not include features of the first feature class;

using for said first and second feature classes specialized classifier systems including a first and a second classifier, respectively, wherein said first and second classifiers are configured to classify features of said first and second feature classes, respectively, said second classifier includes a plurality of single classifiers, said first and second classifiers operate independently of one another, and each of the specialized classifier systems uses as input only features of its respectively assigned feature class; and

combining outputs of said first and second classifiers to form a global classifier configured to output a current emotional state.

Dependent claims: 6, 7, 8, 10.
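Claim 5's "plurality of single classifiers" inside the second classifier can be read as an ensemble over voice quality cues. A minimal sketch, assuming hypothetical per-cue rules and majority voting as one plausible combination scheme (the claim does not specify how the single classifiers are combined):

```python
# Each single classifier votes on valence from ONE voice quality cue.
# Thresholds are illustrative, not taken from the patent.
def jitter_vote(feats):
    return "negative" if feats["jitter"] > 0.3 else "positive"

def shimmer_vote(feats):
    return "negative" if feats["shimmer"] > 0.3 else "positive"

def hnr_vote(feats):
    return "positive" if feats["hnr"] > 0.5 else "negative"

SINGLE_CLASSIFIERS = [jitter_vote, shimmer_vote, hnr_vote]

def second_classifier(voice_quality_features):
    """Second classifier of claim 5: a plurality of single classifiers
    whose votes are combined by simple majority into a valence label."""
    votes = [c(voice_quality_features) for c in SINGLE_CLASSIFIERS]
    return max(set(votes), key=votes.count)  # majority vote (3 voters, no tie)

print(second_classifier({"jitter": 0.1, "shimmer": 0.2, "hnr": 0.8}))  # prints "positive"
```

An odd number of single classifiers avoids ties; with more voters or soft scores, a weighted or probabilistic fusion would be an equally valid reading of the claim.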
9. A method of detecting emotions, comprising:
providing a first feature class and a second feature class of features of a speech input, wherein said first feature class comprises prosodic features and corresponds to a first dimension of an emotional space, and wherein said second feature class comprises voice quality features and corresponds to a second dimension of said emotional space, wherein the first feature class does not include features of the second feature class and the second feature class does not include features of the first feature class;

using for said first and second feature classes specialized classifier systems including a first and a second classifier, respectively, wherein said first and second classifiers are configured to classify features of said first and second feature classes, respectively, said first and second classifiers operate independently of each other, and each of the specialized classifier systems uses as input only features of its respectively assigned feature class; and

combining outputs of said first and second classifiers to form a global classifier configured to output a current emotional state.
Specification