Systems and Methods for Assessment of Non-Native Spontaneous Speech
First Claim
1. A computer-implemented method of assessing spontaneous speech pronunciation, comprising:
performing speech recognition on digitized speech using a non-native acoustic model trained with non-native speech using a processing system to generate word hypotheses for the digitized speech;
performing time alignment between the digitized speech and the word hypotheses utilizing a reference acoustic model trained with native-quality speech to associate word hypotheses with corresponding sounds of the digitized speech;
calculating statistics regarding individual words and phonemes of the word hypotheses using the processing system based on said alignment;
calculating a plurality of features for use in assessing pronunciation of the digitized speech based on the statistics using the processing system;
calculating an assessment score based on one or more of the calculated features; and
storing the assessment score in a computer-readable memory.
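The statistics and feature steps of the claim can be illustrated with a minimal Python sketch. The claim does not say which statistics or features are used, so everything below is an assumption made for illustration only: the `AlignedPhone` record layout, the use of segment durations and acoustic log-likelihoods as the statistics, and the three example features.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class AlignedPhone:
    """One phoneme segment from a time alignment (hypothetical layout)."""
    word_index: int        # position of the parent word in the hypothesis
    word: str
    phone: str
    start: float           # seconds
    end: float             # seconds
    log_likelihood: float  # acoustic score under the reference model

    @property
    def duration(self) -> float:
        return self.end - self.start


def phone_and_word_statistics(segments):
    """Collect per-phoneme and per-word statistics from aligned segments."""
    if not segments:
        raise ValueError("alignment produced no segments")
    phone_durations = defaultdict(list)
    word_durations = defaultdict(float)
    for seg in segments:
        phone_durations[seg.phone].append(seg.duration)
        word_durations[seg.word_index] += seg.duration
    return {
        "mean_phone_duration": {p: mean(d) for p, d in phone_durations.items()},
        "word_durations": dict(word_durations),
        "total_log_likelihood": sum(s.log_likelihood for s in segments),
        "total_speech_time": sum(s.duration for s in segments),
    }


def pronunciation_features(stats):
    """Derive a few illustrative features from the raw statistics."""
    n_words = len(stats["word_durations"])
    speech_time = stats["total_speech_time"]
    return {
        "likelihood_per_second": stats["total_log_likelihood"] / speech_time,
        "words_per_second": n_words / speech_time,
        "mean_word_duration": speech_time / n_words,
    }
```

Keying word durations by `word_index` rather than by the word string keeps repeated words in a hypothesis from being merged into one entry.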
Abstract
Computer-implemented systems and methods are provided for assessing non-native spontaneous speech pronunciation. Speech recognition on digitized speech is performed using a non-native acoustic model trained with non-native speech to generate word hypotheses for the digitized speech. Time alignment is performed between the digitized speech and the word hypotheses using a reference acoustic model trained with native-quality speech. Statistics are calculated regarding individual words and phonemes in the word hypotheses based on the alignment. A plurality of features for use in assessing pronunciation of the speech are calculated based on the statistics, an assessment score is calculated based on one or more of the calculated features, and the assessment score is stored in a computer-readable memory.
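As a rough illustration of how the stages in the abstract fit together, the following Python sketch wires them into one function. The recognizer, aligner, and feature extractor are passed in as callables because the abstract does not tie them to any particular toolkit. The names `assess_pronunciation` and `store_assessment`, the weighted-sum scoring rule, and the JSON file standing in for the computer-readable memory are all assumptions, not details taken from the patent.

```python
import json
from typing import Callable, Dict, List, Sequence


def assess_pronunciation(
    audio: Sequence[float],
    recognize: Callable[[Sequence[float]], List[str]],
    align: Callable[[Sequence[float], List[str]], List[dict]],
    extract_features: Callable[[List[dict]], Dict[str, float]],
    weights: Dict[str, float],
    bias: float = 0.0,
) -> dict:
    """Run the stages in order and return the words, features, and score."""
    # Stage 1: word hypotheses from a recognizer that would use the
    # non-native acoustic model.
    words = recognize(audio)
    # Stage 2: time alignment of those words against the audio, which
    # would use the reference (native-quality) acoustic model.
    segments = align(audio, words)
    # Stages 3 and 4: statistics and features derived from the alignment.
    features = extract_features(segments)
    # Stage 5: an assumed scoring rule, a weighted sum of selected features.
    score = bias + sum(w * features[name] for name, w in weights.items())
    return {"words": words, "features": features, "score": score}


def store_assessment(result: dict, path: str) -> None:
    """Stage 6: persist the result; a JSON file is one possible store."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(result, fh, indent=2)
```

Passing the two models in behind separate callables mirrors the split the abstract draws between recognition with a non-native model and alignment with a native-quality reference model.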
29 Claims
1. A computer-implemented method of assessing spontaneous speech pronunciation, comprising:
performing speech recognition on digitized speech using a non-native acoustic model trained with non-native speech using a processing system to generate word hypotheses for the digitized speech;
performing time alignment between the digitized speech and the word hypotheses utilizing a reference acoustic model trained with native-quality speech to associate word hypotheses with corresponding sounds of the digitized speech;
calculating statistics regarding individual words and phonemes of the word hypotheses using the processing system based on said alignment;
calculating a plurality of features for use in assessing pronunciation of the digitized speech based on the statistics using the processing system;
calculating an assessment score based on one or more of the calculated features; and
storing the assessment score in a computer-readable memory.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 24, 28
14. A computer-implemented system for assessing spontaneous speech pronunciation, comprising:
a processing system;
a computer-readable memory programmed with instructions for causing the processing system to perform steps including:
performing speech recognition on digitized speech using a non-native acoustic model trained with non-native speech using a processing system to generate word hypotheses for the digitized speech;
performing time alignment between the digitized speech and the word hypotheses utilizing a reference acoustic model trained with native-quality speech to associate word hypotheses with corresponding sounds of the digitized speech;
calculating statistics regarding individual words and phonemes of the word hypotheses using the processing system based on said alignment;
calculating a plurality of features for use in assessing pronunciation of the speech based on the statistics using the processing system;
calculating an assessment score based on one or more of the calculated features; and
storing the assessment score in a computer-readable memory.
Dependent claims: 15, 16, 17, 18, 19, 20, 21, 22, 23, 25, 26, 29
27. A computer-readable memory comprising computer-readable instructions, which when executed cause a processing system to perform steps comprising:
performing speech recognition on digitized speech using a non-native acoustic model trained with non-native speech using a processing system to generate word hypotheses for the digitized speech;
performing time alignment between the digitized speech and the word hypotheses utilizing a reference acoustic model trained with native-quality speech;
calculating statistics regarding individual words and phonemes of the word hypotheses using the processing system based on said alignment;
calculating a plurality of features for use in assessing pronunciation of the speech based on the statistics using the processing system;
calculating an assessment score based on one or more of the calculated features; and
storing the assessment score in a computer-readable memory.
Specification