METHOD OF RECOGNIZING GENDER OR AGE OF A SPEAKER ACCORDING TO SPEECH EMOTION OR AROUSAL

US 20130268273A1
Filed: 07/27/2012
Published: 10/10/2013
Est. Priority Date: 04/10/2012
Status: Active Grant

Abstract

A method of recognizing gender or age of a speaker according to speech emotion or arousal includes the following steps of A) segmentalizing speech signals into a plurality of speech segments; B) fetching the first speech segment from the plural speech segments to further acquire at least one of emotional features or arousal degree in the speech segment; C) determining whether at least one of the emotional feature and the arousal degree conforms to some condition; if yes, proceed to the step D); if no, return to the step B) and then fetch the next speech segment; D) fetching the feature indicative of gender or age from the speech segment to further acquire at least one feature parameter; and E) recognizing the at least one feature parameter to further determine the gender or age of the speaker at the currently-processed speech segment.
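
The control flow of steps A) to E) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation. Fixed-length segmentation, the "greater than" branch of the threshold test, and every name below (`segment_signal`, `recognize_per_segment`, `mean_abs_amplitude`, `peak_feature`, `toy_classifier`) are hypothetical. The stand-in arousal estimate and the toy classifier are placeholders that only make the loop runnable.

```python
from typing import Callable, Dict, List, Optional, Sequence

Features = Dict[str, float]


def segment_signal(samples: Sequence[float], segment_len: int) -> List[List[float]]:
    """Step A: split the signal into fixed-length segments (assumed scheme)."""
    return [list(samples[i:i + segment_len])
            for i in range(0, len(samples), segment_len)]


def recognize_per_segment(
    samples: Sequence[float],
    segment_len: int,
    arousal_threshold: float,
    estimate_arousal: Callable[[List[float]], float],
    extract_features: Callable[[List[float]], Features],
    classify: Callable[[Features], str],
) -> List[Optional[str]]:
    """Return one label per segment, or None where step C rejects the segment."""
    results: List[Optional[str]] = []
    for seg in segment_signal(samples, segment_len):      # step B: fetch a segment
        arousal = estimate_arousal(seg)
        if arousal <= arousal_threshold:                  # step C: condition not met
            results.append(None)                          # skip to the next segment
            continue
        features = extract_features(seg)                  # step D: feature parameters
        results.append(classify(features))                # step E: recognition
    return results


# Placeholder components (assumptions, not taken from the patent).
def mean_abs_amplitude(seg: List[float]) -> float:
    return sum(abs(x) for x in seg) / len(seg)


def peak_feature(seg: List[float]) -> Features:
    return {"peak": max(abs(x) for x in seg)}


def toy_classifier(features: Features) -> str:
    return "class_1" if features["peak"] > 0.8 else "class_2"


signal = [0.0, 0.0, 0.0, 0.0, 0.9, -0.9, 0.9, -0.9, 0.5, -0.5, 0.5, -0.5]
labels = recognize_per_segment(signal, 4, 0.3,
                               mean_abs_amplitude, peak_feature, toy_classifier)
print(labels)  # [None, 'class_1', 'class_2']
```

Passing the estimator, feature extractor, and classifier as callables keeps the sketch independent of any particular emotion or gender/age model.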

15 Claims

1. A method of recognizing gender or age of a speaker according to speech emotion or arousal, comprising steps of:
- A) segmentalizing speech signals into a plurality of speech segments;
  
  B) fetching the first speech segment from the speech segments to further acquire at least one of the emotional feature or the arousal degree of the speech segment;
  
  C) determining at least one of the emotional feature or the arousal degree of the speech segment;
  
  if the emotional feature is the object for determination, determine whether the emotional feature belongs to a specific emotion;
  
  if the arousal degree is the object for determination, determine whether the arousal degree of the speech segment is greater or less than a specific threshold;
  
  if either of the two answers is yes, proceed to the step D);
  
  if none of the two answers is yes, return to the step B) and then fetch the next speech segment;
  
  D) fetching the feature indicative of gender or age from the speech segment to further acquire at least one feature parameter corresponding to gender or age; and
  
  E) applying recognition to the at least one feature parameter according to a gender or age recognition measure to further determine the gender or age of the speaker in the currently-processed speech segment;
  
  next, apply the step B) to the next speech segment.
- Dependent claims (2 to 15):
  - 2. The method as defined in claim 1, wherein the speech signals in the step A) are segmentalized by a segmentation unit.
  - 3. The method as defined in claim 1, wherein in the step B), after the first speech segment is fetched from the speech segments, apply the first classification to the emotional feature and the arousal degree of the speech segment to enable the emotional feature to be classified as a specific emotion and to enable the arousal to be classified as a high degree or a low degree of arousal.
  - 4. The method as defined in claim 3, wherein in the step B), the first speech segment is fetched by a first acquisition unit and the first classification is done via a first classifier.
  - 5. The method as defined in claim 1, wherein in the step C), the specific emotion is the presentation of positive/negative emotion and the arousal is the presentation of the degree of excitement.
  - 6. The method as defined in claim 1, wherein in the step C), whether the emotional feature belongs to a specific emotion and whether the arousal degree of the speech segment is greater or less than a specific threshold are determined by a determination unit.
  - 7. The method as defined in claim 1, wherein in the step D), after at least one feature parameter is acquired, apply the second classification to the at least one feature parameter.
  - 8. The method as defined in claim 7, wherein in the step D), the at least one feature parameter is fetched via a parameter acquisition unit and the second classification is done via a second classifier.
  - 9. The method as defined in claim 7, wherein in the step E), the gender or age recognition measure is based on the at least one feature parameter, and the gender or age of the speaker is then determined according to the at least one feature parameter.
  - 10. The method as defined in claim 9, wherein in the step E), when multiple feature parameters are considered, the feature parameters are integrated and used to recognize the gender or age of the speaker.
  - 11. The method as defined in claim 7, wherein in the step D), whether the at least one feature parameter is remarkable or not in time domain or frequency domain is determined by whether it is greater than a specific mean or a specific standard deviation, where the mean and standard deviation of the feature parameter are computed from speech signals of multiple speakers.
  - 12. The method as defined in claim 1, wherein the at least one feature parameter is one of spectral centroid (SC), spectral spread (SS), zero crossing rate (ZCR), fast Fourier transformation (FFT) coefficients, jitter, and fundamental frequency (F0);
    - when the at least one feature parameter is plural in number, each of the feature parameters is one of SC, SS, ZCR, FFT coefficients, jitter, and F0 and the feature parameters are different from each other.
  - 13. The method as defined in claim 12, wherein SC, SS, FFT coefficients, jitter, and F0 belong to the frequency domain, and ZCR and duration belong to the time domain.
  - 14. The method as defined in claim 12, wherein SC, SS, ZCR, duration, FFT coefficients, and jitter are adopted for age recognition;
    - F0 and FFT coefficients are adopted for gender recognition.
  - 15. The method as defined in claim 1, wherein the steps A)-E) are executed in a computer.
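
Three of the feature parameters named in claim 12 (ZCR, SC, and SS) can be illustrated with a short Python sketch. The claims do not give formulas, so the definitions below are common textbook ones and are assumptions: ZCR as the fraction of adjacent sample pairs that change sign, SC as the magnitude-weighted mean frequency, and SS as the magnitude-weighted standard deviation around SC. The function names and the naive DFT are hypothetical choices made to keep the example dependency-free.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def zero_crossing_rate(seg: Sequence[float]) -> float:
    """Time domain: fraction of adjacent sample pairs whose sign differs."""
    if len(seg) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(seg, seg[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(seg) - 1)


def magnitude_spectrum(seg: Sequence[float],
                       sample_rate: float) -> Tuple[List[float], List[float]]:
    """Naive O(n^2) DFT over the non-negative frequency bins."""
    n = len(seg)
    freqs: List[float] = []
    mags: List[float] = []
    for k in range(n // 2 + 1):
        x_k = sum(s * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t, s in enumerate(seg))
        freqs.append(k * sample_rate / n)
        mags.append(abs(x_k))
    return freqs, mags


def spectral_centroid_and_spread(seg: Sequence[float],
                                 sample_rate: float) -> Tuple[float, float]:
    """Frequency domain: magnitude-weighted mean frequency and its spread."""
    freqs, mags = magnitude_spectrum(seg, sample_rate)
    total = sum(mags)
    if total == 0:
        return 0.0, 0.0  # silent segment: no spectral energy to weight
    centroid = sum(f * m for f, m in zip(freqs, mags)) / total
    variance = sum((f - centroid) ** 2 * m for f, m in zip(freqs, mags)) / total
    return centroid, math.sqrt(variance)


# A 1 kHz tone sampled at 8 kHz: the centroid should sit near 1000 Hz
# and the spread should be close to zero.
tone = [math.sin(2 * math.pi * 1000 * t / 8000 + 0.3) for t in range(64)]
centroid, spread = spectral_centroid_and_spread(tone, 8000)
print(round(centroid, 3), zero_crossing_rate([1.0, -1.0, 1.0, -1.0]))
```

For real speech segments an FFT routine would replace the naive DFT. Some references weight by the power spectrum rather than the magnitude spectrum, which changes the numerical values but not the idea.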

Current Assignee: National Chung Cheng University
Original Assignee: National Chung Cheng University
Inventors: Chen, Oscal Tzyh-Chiang; Lu, Ping-Tsung; Ke, Jia-You

Granted Patent: US 9,123,342 B2
US Class Current: 704/249
CPC Class Codes:
G10L 17/26 Recognition of special voic...
G10L 25/63 for estimating an emotional...
