Emotional speech processing
Abstract
A method for emotion or speaking style recognition and/or clustering comprises receiving one or more speech samples, generating a set of training data by extracting one or more acoustic features from every frame of the one or more speech samples, and generating a model from the set of training data, wherein the model identifies emotion or speaking style dependent information in the set of training data. The method may further comprise receiving one or more test speech samples, generating a set of test data by extracting one or more acoustic features from every frame of the one or more test speech samples, transforming the set of test data using the model to better represent emotion or speaking style dependent information, and using the transformed data for clustering and/or classification to discover speech with similar emotion or speaking style. It is emphasized that this abstract is provided to comply with the rules requiring an abstract that will allow a searcher or other reader to quickly ascertain the subject matter of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims.
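The frame-level feature extraction step named in the abstract can be illustrated with a minimal sketch. The abstract does not say which acoustic features are used, so the two features below (log energy and zero-crossing rate), the frame length, the hop size, and all function names are assumptions chosen only for illustration.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a sample list into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def frame_features(frame):
    """Return (log energy, zero-crossing rate) for one frame (illustrative features only)."""
    energy = sum(x * x for x in frame) / len(frame)
    log_energy = math.log(energy + 1e-10)  # small offset avoids log(0) on silence
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zcr = crossings / (len(frame) - 1)
    return (log_energy, zcr)

def extract_features(samples, frame_len=400, hop=160):
    """Extract one feature vector from every frame of a speech sample."""
    return [frame_features(f) for f in frame_signal(samples, frame_len, hop)]

# Synthetic stand-in for a speech sample: one second of a 200 Hz tone at 16 kHz.
sample_rate = 16000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(sample_rate)]
feats = extract_features(tone)
print(len(feats), feats[0])
```

With 25 ms frames and a 10 ms hop at 16 kHz (the assumed defaults), each speech sample becomes a sequence of per-frame feature vectors, which is the form of training or test data the abstract describes.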
29 Claims
1. A method, comprising:
receiving one or more speech samples, wherein the one or more speech samples are characterized by one or more emotions or speaking styles from one or more speakers;
generating a set of training data by extracting one or more acoustic features from every frame of the one or more sample speeches; and
generating a model from the set of training data, wherein the model identifies emotion or speaking style dependent information in the set of training data, wherein the model includes the application of a Probabilistic Linear Discriminant Analysis (PLDA) to identify an emotion related subspace.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
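The idea of an emotion related subspace can be illustrated with a simplified sketch. This is not PLDA: it substitutes the classical two-class Fisher linear discriminant, a non-probabilistic relative that finds one direction separating class means relative to within-class scatter. The two-dimensional feature vectors, the class labels, and all function names are hypothetical.

```python
def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]

def within_scatter(groups):
    """Sum of 2x2 scatter matrices, each taken around its own class mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for vectors in groups:
        m = mean(vectors)
        for v in vectors:
            d = [v[0] - m[0], v[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class_a, class_b):
    """Unit vector w proportional to Sw^-1 (mean_a - mean_b) for 2-D data."""
    s = within_scatter([class_a, class_b])
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    ma, mb = mean(class_a), mean(class_b)
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
    return [w[0] / norm, w[1] / norm]

def project(v, w):
    """Coordinate of v along the one-dimensional subspace spanned by w."""
    return v[0] * w[0] + v[1] * w[1]

# Hypothetical utterance-level feature vectors for two emotion classes.
neutral = [[1.0, 2.0], [1.2, 1.9], [0.8, 2.1], [1.1, 2.2]]
angry = [[3.0, 2.1], [3.2, 1.8], [2.9, 2.0], [3.1, 2.3]]

w = fisher_direction(neutral, angry)
threshold = (project(mean(neutral), w) + project(mean(angry), w)) / 2

def classify(v):
    """Label a vector by which side of the midpoint its projection falls on."""
    return "neutral" if project(v, w) > threshold else "angry"

print(classify([1.0, 2.0]), classify([3.0, 2.0]))
```

Because w points from the second class mean toward the first in the scatter-normalized space, projections of the first class are always larger, which is why the midpoint threshold works here. A PLDA model would instead place probabilistic models on the between-class and within-class variability and would handle more than two classes.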
14. A system, comprising:
a processor module;
a memory coupled to the processor, wherein the memory contains executable instructions configured to implement a method, the method comprising:
receiving one or more speech samples;
generating a set of training data by extracting one or more acoustic features from every frame of the one or more speech samples; and
generating a model from the set of training data, wherein the model identifies emotion or speaking style dependent information in the set of training data, wherein the model includes the application of a Probabilistic Linear Discriminant Analysis (PLDA) to identify an emotion related subspace.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27
28. A non-transitory computer readable medium having embodied therein computer readable instructions configured to implement a method, the method comprising:
receiving one or more speech samples, wherein the one or more speech samples are characterized by one or more emotions or speaking styles from one or more speakers;
generating a set of training data by extracting one or more acoustic features from every frame of the one or more sample speeches; and
generating a model from the set of training data, wherein the model identifies emotion or speaking style dependent information in the set of training data, wherein the model includes the application of a Probabilistic Linear Discriminant Analysis (PLDA) to identify an emotion related subspace.
Dependent claims: 29
Specification