Systems and methods for automated evaluation of human speech
First Claim
1. A system for performing automated proficiency scoring of speech, the system comprising:
- a microphone coupled with a computing device comprising a microprocessor, a memory, and a display operatively coupled together;
wherein the microphone is configured to receive an audible unconstrained speech utterance from a user whose proficiency in a language is being tested and provide a corresponding audio signal to the computing device; and
wherein the microprocessor and memory are configured to:
receive the audio signal; and
process the audio signal by:
recognizing a plurality of phones and a plurality of pauses comprised in the audio signal corresponding with the utterance;
dividing the plurality of phones and plurality of pauses into a plurality of tone units;
grouping the plurality of phones into a plurality of syllables;
identifying a plurality of filled pauses from among the plurality of pauses;
detecting a plurality of prominent syllables from among the plurality of syllables;
identifying, from among the plurality of prominent syllables, a plurality of tonic syllables;
identifying a tone choice for each of the tonic syllables of the plurality of tonic syllables to form a plurality of tone choices;
calculating a relative pitch for each of the tonic syllables of the plurality of tonic syllables to form a plurality of relative pitch values;
calculating a plurality of suprasegmental parameters using one of the plurality of pauses, the plurality of filled pauses, the plurality of tone units, the plurality of syllables, the plurality of prominent syllables, the plurality of tone choices, the plurality of relative pitch values, and any combination thereof;
using the plurality of suprasegmental parameters, calculating a language proficiency rating for the user; and
displaying the language proficiency rating of the user on the display associated with the computing device using the microprocessor and the memory.
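The claim does not say how phones and pauses are divided into tone units. As a minimal sketch of one possible approach, the Python below assumes the recognizer has already produced time-aligned segments and treats any silent pause of at least a chosen duration as a tone unit boundary. The `Segment` type, the `"sil"` label, and the 0.2 second threshold are all hypothetical choices made for illustration, not details taken from the patent.

```python
from dataclasses import dataclass
from typing import List

PAUSE = "sil"            # hypothetical label for a silent pause
BOUNDARY_PAUSE_S = 0.2   # assumed minimum pause length that ends a tone unit


@dataclass
class Segment:
    """One time-aligned recognizer output: a phone or a pause."""
    label: str
    start: float  # seconds
    end: float    # seconds

    @property
    def duration(self) -> float:
        return self.end - self.start


def split_into_tone_units(
    segments: List[Segment], min_pause: float = BOUNDARY_PAUSE_S
) -> List[List[Segment]]:
    """Group segments into tone units, cutting at long silent pauses.

    Pauses shorter than min_pause stay inside the current unit; boundary
    pauses themselves are not included in any unit.
    """
    units: List[List[Segment]] = []
    current: List[Segment] = []
    for seg in segments:
        if seg.label == PAUSE and seg.duration >= min_pause:
            if current:
                units.append(current)
                current = []
        else:
            current.append(seg)
    if current:
        units.append(current)
    return units
```

Cutting only at pauses is a deliberate simplification; a fuller implementation might also use pitch resets or lengthening cues, which the claim language leaves open.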
Abstract
Systems and methods for evaluating human speech. Implementations may include: a microphone coupled with a computing device comprising a microprocessor, a memory, and a display operatively coupled together. The microphone may be configured to receive an audible unconstrained speech utterance from a user whose proficiency in a language is being tested and provide a corresponding audio signal to the computing device. The microprocessor and memory may receive the audio signal and process it by recognizing a plurality of phones and a plurality of pauses and calculating a plurality of suprasegmental parameters using the plurality of pauses and the plurality of phones. The microprocessor and memory may use the plurality of suprasegmental parameters to calculate a language proficiency rating for the user and display the language proficiency rating of the user on the display associated with the computing device.
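The abstract describes two stages after recognition: computing suprasegmental parameters and turning them into a rating. The sketch below illustrates that flow under stated assumptions. The particular parameters (rates and ratios derived from counts), the `UtteranceCounts` type, the weighted-sum scoring, and the 1 to 4 scale are hypothetical placeholders chosen for illustration; the abstract does not specify which parameters are used or how the rating is derived from them.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class UtteranceCounts:
    """Hypothetical summary of one utterance after segmentation."""
    duration_s: float                      # total length, assumed > 0
    syllables: int                         # assumed > 0
    prominent_syllables: int
    tone_units: int                        # assumed > 0
    silent_pause_durations_s: List[float]
    filled_pauses: int


def suprasegmental_parameters(c: UtteranceCounts) -> Dict[str, float]:
    """Derive a few illustrative rate and ratio measures from the counts."""
    pause_time = sum(c.silent_pause_durations_s)
    speaking_time = c.duration_s - pause_time
    n_pauses = len(c.silent_pause_durations_s)
    return {
        "syllables_per_second": c.syllables / c.duration_s,
        "articulation_rate": c.syllables / speaking_time,
        "mean_pause_s": pause_time / n_pauses if n_pauses else 0.0,
        "filled_pauses_per_second": c.filled_pauses / c.duration_s,
        "prominent_syllable_ratio": c.prominent_syllables / c.syllables,
        "syllables_per_tone_unit": c.syllables / c.tone_units,
    }


def proficiency_rating(
    params: Dict[str, float],
    weights: Dict[str, float],
    bias: float,
    lo: float = 1.0,
    hi: float = 4.0,
) -> float:
    """Placeholder scoring: weighted sum of parameters, clamped to [lo, hi].

    The weights and bias are supplied by the caller; in practice they would
    have to come from a model fitted to human-rated speech.
    """
    raw = bias + sum(weights.get(name, 0.0) * value for name, value in params.items())
    return max(lo, min(hi, raw))
```

Keeping parameter extraction separate from scoring means the scoring function can be swapped for any other model without touching the measurements.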
45 Citations
20 Claims
1. A system for performing automated proficiency scoring of speech, the system comprising:
- a microphone coupled with a computing device comprising a microprocessor, a memory, and a display operatively coupled together;
wherein the microphone is configured to receive an audible unconstrained speech utterance from a user whose proficiency in a language is being tested and provide a corresponding audio signal to the computing device; and
wherein the microprocessor and memory are configured to:
receive the audio signal; and
process the audio signal by:
recognizing a plurality of phones and a plurality of pauses comprised in the audio signal corresponding with the utterance;
dividing the plurality of phones and plurality of pauses into a plurality of tone units;
grouping the plurality of phones into a plurality of syllables;
identifying a plurality of filled pauses from among the plurality of pauses;
detecting a plurality of prominent syllables from among the plurality of syllables;
identifying, from among the plurality of prominent syllables, a plurality of tonic syllables;
identifying a tone choice for each of the tonic syllables of the plurality of tonic syllables to form a plurality of tone choices;
calculating a relative pitch for each of the tonic syllables of the plurality of tonic syllables to form a plurality of relative pitch values;
calculating a plurality of suprasegmental parameters using one of the plurality of pauses, the plurality of filled pauses, the plurality of tone units, the plurality of syllables, the plurality of prominent syllables, the plurality of tone choices, the plurality of relative pitch values, and any combination thereof;
using the plurality of suprasegmental parameters, calculating a language proficiency rating for the user; and
displaying the language proficiency rating of the user on the display associated with the computing device using the microprocessor and the memory.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A method of performing automated proficiency scoring of speech, the method comprising:
- generating an audio signal using a microphone by receiving an audible unconstrained speech utterance from a user whose proficiency in a language is being tested;
providing the audio signal to a computing device coupled with the microphone, the computing device comprising a microprocessor, a memory, and a display operatively coupled together;
processing the audio signal using the microprocessor and memory by:
recognizing a plurality of phones and a plurality of pauses comprised in the audio signal corresponding with the utterance;
dividing the plurality of phones and plurality of pauses into a plurality of tone units;
grouping the plurality of phones into a plurality of syllables;
identifying a plurality of filled pauses from among the plurality of pauses;
detecting a plurality of prominent syllables from among the plurality of syllables;
identifying, from among the plurality of prominent syllables, a plurality of tonic syllables;
identifying a tone choice for each of the tonic syllables of the plurality of tonic syllables to form a plurality of tone choices;
calculating a relative pitch for each of the tonic syllables of the plurality of tonic syllables to form a plurality of relative pitch values;
calculating a plurality of suprasegmental parameters using one of the plurality of pauses, the plurality of filled pauses, the plurality of tone units, the plurality of syllables, the plurality of prominent syllables, the plurality of tone choices, the plurality of relative pitch values, and any combination thereof;
using the plurality of suprasegmental parameters, calculating a language proficiency rating for the user; and
displaying the language proficiency rating of the user on the display associated with the computing device using the microprocessor and the memory.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19
20. A method of calculating a plurality of suprasegmental values for an utterance, the method comprising:
- generating an audio signal using a microphone by receiving an audible unconstrained speech utterance from a user;
providing the audio signal to a computing device coupled with the microphone, the computing device comprising a microprocessor, a memory, and a display operatively coupled together;
processing the audio signal using the microprocessor and memory by:
recognizing a plurality of phones and a plurality of pauses comprised in the audio signal corresponding with the utterance;
dividing the plurality of phones and plurality of pauses into a plurality of tone units;
grouping the plurality of phones into a plurality of syllables;
identifying a plurality of filled pauses from among the plurality of pauses;
detecting a plurality of prominent syllables from among the plurality of syllables;
identifying, from among the plurality of prominent syllables, a plurality of tonic syllables;
identifying a tone choice for each of the tonic syllables of the plurality of tonic syllables to form a plurality of tone choices;
calculating a relative pitch for each of the tonic syllables of the plurality of tonic syllables to form a plurality of relative pitch values; and
calculating a plurality of suprasegmental parameters using one of the plurality of pauses, the plurality of filled pauses, the plurality of tone units, the plurality of syllables, the plurality of prominent syllables, the plurality of tone choices, the plurality of relative pitch values, and any combination thereof.
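The claims name two per-syllable measurements, a tone choice and a relative pitch for each tonic syllable, without stating how either is computed. The sketch below shows one simple way these could be derived from fundamental frequency (F0) values, assuming an F0 track is already available. The three-way low/mid/high split of the speaker's range, the three-way rise/fall/level classification, the one semitone tolerance, and all function names are assumptions made for illustration only; compound tones such as rise-fall are not handled.

```python
import math
from typing import Sequence


def semitones(f_hz: float, ref_hz: float) -> float:
    """Interval from ref_hz up to f_hz in semitones (negative if lower)."""
    return 12.0 * math.log2(f_hz / ref_hz)


def relative_pitch(tonic_f0_hz: float, speaker_f0_hz: Sequence[float]) -> str:
    """Place a tonic syllable's F0 in the low, mid, or high third of the
    speaker's observed range, measured on a semitone scale (an assumption)."""
    lo, hi = min(speaker_f0_hz), max(speaker_f0_hz)
    span = semitones(hi, lo)
    if span == 0:
        return "mid"
    position = semitones(tonic_f0_hz, lo) / span
    if position < 1 / 3:
        return "low"
    if position < 2 / 3:
        return "mid"
    return "high"


def tone_choice(contour_hz: Sequence[float], level_band_st: float = 1.0) -> str:
    """Classify a tonic syllable's contour from its first and last F0 values.

    Changes within +/- level_band_st semitones count as level (assumed band).
    """
    change = semitones(contour_hz[-1], contour_hz[0])
    if change > level_band_st:
        return "rise"
    if change < -level_band_st:
        return "fall"
    return "level"
```

For example, with a speaker range of 100 to 200 Hz, a tonic syllable at 190 Hz falls in the top third of the range and a contour moving from 150 Hz down to 120 Hz is classified as a fall.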
Specification