Automatic Accent Detection With Limited Manually Labeled Data
Abstract
An accent detection system for automatically labeling accent in a large speech corpus includes a first classifier which analyzes words in the speech corpus and automatically labels accents to provide first accent labels. A second classifier analyzes the words to automatically label accent of the words to provide second accent labels. A comparison engine compares the first and second accent labels. Accent labels that indicate agreement between the first and second classifiers are provided as final accent labels. When there is disagreement between the first and second classifiers, a third classifier analyzes the words and provides the final accent labels.
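The flow described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Word` fields, the two toy classifiers, the threshold values, and the tie-breaking rule used by the third classifier are all assumptions made for the example. The source only states that the first and second classifiers use different criteria and that the third classifier resolves disagreements.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Word:
    """Hypothetical per-word record; the fields are illustrative assumptions."""
    text: str
    is_content_word: bool   # stand-in for the "first criteria"
    pitch_range: float      # stand-in for the "second criteria"


PITCH_THRESHOLD = 5.0  # assumed value, for illustration only
MARGIN = 2.0           # assumed value, for illustration only


def first_classifier(word: Word) -> bool:
    # Toy rule: label content words as accented.
    return word.is_content_word


def second_classifier(word: Word) -> bool:
    # Toy rule: label words with a wide pitch range as accented.
    return word.pitch_range > PITCH_THRESHOLD


def third_classifier(word: Word, first_label: bool, second_label: bool) -> bool:
    # Toy tie-breaker that receives both earlier labels: trust the second
    # classifier when its feature is far from the threshold, else the first.
    if abs(word.pitch_range - PITCH_THRESHOLD) > MARGIN:
        return second_label
    return first_label


def label_corpus(
    words: List[Word],
    first: Callable[[Word], bool],
    second: Callable[[Word], bool],
    third: Callable[[Word, bool, bool], bool],
) -> List[bool]:
    """Return one final accent label per word."""
    final_labels = []
    for word in words:
        a, b = first(word), second(word)
        if a == b:
            # Comparison step: agreement becomes the final label.
            final_labels.append(a)
        else:
            # Disagreement: defer to the third classifier.
            final_labels.append(third(word, a, b))
    return final_labels


corpus = [
    Word("market", True, 8.0),   # both say accented
    Word("the", False, 2.0),     # both say unaccented
    Word("said", True, 1.0),     # disagree, far from threshold
    Word("today", True, 4.5),    # disagree, close to threshold
]
labels = label_corpus(corpus, first_classifier, second_classifier, third_classifier)
print(labels)  # [True, False, False, True]
```

Passing the classifiers in as callables keeps the comparison step independent of how each classifier reaches its label, which mirrors the separation between classifiers and comparison engine in the abstract.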
20 Claims
1. An accent detection system for automatically labeling accent in a large speech corpus, the accent detection system comprising:
a first classifier configured to analyze words in the speech corpus and to automatically label accent of the analyzed words based on first criteria, the first classifier providing as an output first accent labels of the analyzed words;
a second classifier configured to analyze words in the speech corpus and to automatically label accent of the analyzed words based on second criteria, the second classifier providing as an output second accent labels of the analyzed words;
a comparison engine configured to compare the first accent labels provided by the first classifier and the second accent labels provided by the second classifier to determine if there is agreement between the first classifier and the second classifier on accent labels for particular words, for any words having first and second accent labels which indicate agreement by the first and second classifiers, the comparison engine providing the agreed upon accent labels as final accent labels for those words;
a third classifier which is configured to, for words in the speech corpus where the comparison engine determines that there is not agreement between the first and second classifiers, provide the final accent labels for those words as a function of the first accent labels for those words provided by the first classifier and the second accent labels for those words provided by the second classifier; and
an output component which provides as an output of the accent detection system the final accent labels provided by the comparison engine and by the third classifier.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
9. A computer-implemented method of training a classifier when limited manually labeled accent data is available, the method comprising:
obtaining a database having data without manually generated accent labels;
using a first classifier to automatically accent label the data in the database; and
training a second classifier using the automatically accent labeled data in the database.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
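The three steps of claim 9 can be sketched as below. This is only an illustration under stated assumptions: the `Word` fields, the rule-based first classifier, and the threshold learner standing in for the second classifier are hypothetical, since the claim does not specify the type of either classifier.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List


@dataclass
class Word:
    """Hypothetical per-word record; the fields are illustrative assumptions."""
    text: str
    is_content_word: bool
    pitch_range: float


def first_classifier(word: Word) -> bool:
    # Assumed first classifier: a simple rule needing no training data.
    return word.is_content_word


def train_second_classifier(
    words: List[Word], labels: List[bool]
) -> Callable[[Word], bool]:
    """Fit a toy threshold classifier on automatically generated labels.

    The threshold is the midpoint between the mean pitch range of words
    labeled accented and of words labeled unaccented. Both classes must
    be present in the labels, otherwise mean() raises StatisticsError.
    """
    accented = [w.pitch_range for w, y in zip(words, labels) if y]
    unaccented = [w.pitch_range for w, y in zip(words, labels) if not y]
    threshold = (mean(accented) + mean(unaccented)) / 2
    return lambda word: word.pitch_range > threshold


# Step 1: a database without manually generated accent labels.
database = [
    Word("market", True, 8.0),
    Word("prices", True, 6.0),
    Word("the", False, 2.0),
    Word("of", False, 4.0),
]

# Step 2: the first classifier labels the data automatically.
auto_labels = [first_classifier(w) for w in database]

# Step 3: the second classifier is trained on the automatic labels.
second_classifier = train_second_classifier(database, auto_labels)

print(second_classifier(Word("new", False, 9.0)))  # True
```

In this toy setup the trained classifier can label a word as accented from its pitch range even when the rule-based classifier would not, which is one reason a second classifier trained on different evidence may be useful.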
17. A computer-implemented method of automatically labeling accent in a large speech corpus, the method comprising:
analyzing words in the speech corpus using a first classifier to automatically label accent of the analyzed words based on first criteria and to generate first accent labels for the analyzed words;
analyzing words in the speech corpus using a second classifier to automatically label accent of the analyzed words based on second criteria and to generate second accent labels for the analyzed words;
comparing the first accent labels and the second accent labels to determine if there is agreement between the first classifier and the second classifier on accent labels for particular words, and for any words having first and second accent labels which indicate agreement by the first and second classifiers, providing the agreed upon accent labels as final accent labels for those words;
analyzing words in the speech corpus, for which it was determined that there is not agreement between the first and second classifiers, using a third classifier to provide the final accent labels for those words as a function of the first accent labels for those words provided by the first classifier and the second accent labels for those words provided by the second classifier; and
providing as an output the final accent labels.
Dependent claims: 18, 19, 20
Specification