Utterance verification and pronunciation scoring by lattice transduction
Abstract
In the field of language learning systems, proper pronunciation of words and phrases is an integral aspect of language learning. Determining the proximity of the language learner's pronunciation to a standardized, i.e. ‘perfect’, pronunciation is utilized to guide the learner from imperfect toward perfect pronunciation. In this regard, a phoneme lattice scoring system is utilized, whereby an input from a user is transduced into the perfect pronunciation example in a phoneme lattice. The cost of this transduction may be determined based on a summation of substitutions, deletions and insertions of phonemes needed to transduce from the input to the perfect pronunciation of the utterance.
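The cost described in the abstract, a summation of phoneme substitutions, deletions and insertions, can be illustrated with a weighted edit distance between two phoneme sequences. This is a minimal sketch only: the function name `transduction_cost`, the unit costs, and the ARPAbet-style phoneme symbols are assumptions made for illustration and are not taken from the patent, which operates on a phoneme lattice rather than on a single flat sequence.

```python
def transduction_cost(observed, target, sub_cost=1.0, del_cost=1.0, ins_cost=1.0):
    """Minimum summed cost of the substitutions, deletions and insertions
    needed to turn the observed phoneme sequence into the target sequence.
    The three per-operation costs are hypothetical defaults."""
    rows, cols = len(observed) + 1, len(target) + 1
    # cost[i][j] = cheapest way to turn observed[:i] into target[:j]
    cost = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i * del_cost      # delete every observed phoneme
    for j in range(1, cols):
        cost[0][j] = j * ins_cost      # insert every target phoneme
    for i in range(1, rows):
        for j in range(1, cols):
            same = observed[i - 1] == target[j - 1]
            cost[i][j] = min(
                cost[i - 1][j - 1] + (0.0 if same else sub_cost),  # match or substitute
                cost[i - 1][j] + del_cost,                         # delete from observed
                cost[i][j - 1] + ins_cost,                         # insert into observed
            )
    return cost[-1][-1]


# Hypothetical learner input versus a hypothetical 'perfect' pronunciation.
learner = ["HH", "EH", "L"]
perfect = ["HH", "AH", "L", "OW"]
cost = transduction_cost(learner, perfect)
```

Here the learner's sequence needs one substitution (EH to AH) and one insertion (OW), so with unit costs the summed cost is 2.0. A real system would likely weight each operation by phoneme similarity rather than use a flat cost.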
19 Claims
1. A method comprising:
defining, at a computer, a phoneme lattice for an ideal utterance, the phoneme lattice including a plurality of phoneme instances;
tagging, at the computer, each phoneme instance from the plurality of phoneme instances with a begin time, an end time, and a score;
storing the phoneme lattice at the computer;
determining an ideal path through the phoneme lattice for the ideal utterance;
receiving, at the computer, an input utterance from a user;
transducing the input utterance into the ideal utterance utilizing the phoneme lattice;
calculating, at the computer, an ideal path transduction cost based on the ideal path and the begin time, the end time, and the score of each phoneme instance from the plurality of phoneme instances included in the ideal path;
transducing the input utterance into an out-of-grammar word sequence;
calculating, at the computer, an out-of-grammar transduction cost based on the out-of-grammar word sequence;
determining an accuracy of the input utterance based on the ideal path transduction cost and the out-of-grammar transduction cost; and
sending a signal to output from the computer an indication of the accuracy.
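Claim 1 states that the accuracy is determined from the ideal path transduction cost and the out-of-grammar transduction cost, but the claim text does not say how the two are combined. The sketch below shows one possible combination, purely as an assumption: the share of the total cost that belongs to the out-of-grammar alternative, so that a low ideal-path cost relative to the out-of-grammar cost gives a value near 1. The names `accuracy_from_costs` and `is_accepted`, the formula, and the 0.5 threshold are all hypothetical.

```python
def accuracy_from_costs(ideal_cost, oog_cost):
    """Map the two transduction costs to a value in [0, 1].

    Assumed formula (not from the patent): oog_cost / (ideal_cost + oog_cost).
    Values near 1 mean the input was much cheaper to transduce into the
    ideal utterance than into the out-of-grammar sequence.
    """
    if ideal_cost < 0 or oog_cost < 0:
        raise ValueError("transduction costs must be non-negative")
    total = ideal_cost + oog_cost
    if total == 0:
        # Both transductions are free, so the costs cannot tell them apart.
        return 0.5
    return oog_cost / total


def is_accepted(ideal_cost, oog_cost, threshold=0.5):
    """Hypothetical verification decision: accept the utterance when the
    accuracy value exceeds the threshold."""
    return accuracy_from_costs(ideal_cost, oog_cost) > threshold


accuracy = accuracy_from_costs(1.0, 3.0)   # ideal path is the cheaper one
accepted = is_accepted(1.0, 3.0)
```

Comparing against an out-of-grammar cost, rather than thresholding the ideal-path cost alone, lets the decision account for inputs that are costly to transduce into anything, such as noisy recordings.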
11. A non-transitory processor-readable medium storing code representing instructions to be executed by a processor, the code comprising code to cause the processor to:
tag each phoneme instance from a plurality of ideal phoneme instances for an ideal utterance with a begin time, an end time, and a score to generate an ideal phoneme lattice;
determine a target phoneme sequence based on the ideal phoneme lattice for the ideal utterance;
receive an input utterance from a user;
define an input phoneme lattice based on the input utterance;
transduce the input phoneme lattice into the target phoneme sequence;
calculate an ideal path transduction cost of the input utterance based on the begin time, the end time, and the score of each phoneme instance from the plurality of ideal phoneme instances;
tag each phoneme instance from a plurality of out-of-grammar phoneme instances for an out-of-grammar utterance with a begin time, an end time, and a score to generate an out-of-grammar phoneme lattice;
determine an out-of-grammar phoneme sequence based on the out-of-grammar phoneme lattice for the out-of-grammar utterance;
transduce the phoneme lattice of the input utterance into the out-of-grammar phoneme sequence;
calculate an out-of-grammar transduction cost of the input utterance based on the begin time, the end time, and the score of each phoneme instance from the plurality of out-of-grammar phoneme instances; and
send a signal to output from the computer an indication of an accuracy of the input utterance based on the ideal path transduction cost and the out-of-grammar transduction cost.
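The claims describe a phoneme lattice whose phoneme instances are each tagged with a begin time, an end time, and a score, and from which a phoneme sequence is determined. The following is a minimal sketch of such a structure and of one way to pick a sequence from it. Several things here are assumptions for illustration rather than details from the patent: the names `PhonemeInstance` and `best_path`, times expressed as integer milliseconds, the score treated as a cost where lower is better, and the rule that consecutive instances on a path must share a boundary time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhonemeInstance:
    phoneme: str
    begin: int    # begin time in milliseconds (assumed unit)
    end: int      # end time in milliseconds, must be greater than begin
    score: float  # assumed to be a cost: lower is better


def best_path(lattice, start, finish):
    """Return (total_score, instances) for the lowest-scoring chain of
    instances covering start..finish, or None if no such chain exists."""
    # best[t] holds the cheapest known chain that ends exactly at time t.
    best = {start: (0.0, [])}
    # Sorting by begin time guarantees every chain ending at time t is
    # final before any instance beginning at t is extended (begin < end).
    for inst in sorted(lattice, key=lambda p: (p.begin, p.end)):
        if inst.begin not in best:
            continue  # no chain reaches this instance
        total, path = best[inst.begin]
        candidate = (total + inst.score, path + [inst])
        if inst.end not in best or candidate[0] < best[inst.end][0]:
            best[inst.end] = candidate
    return best.get(finish)


# Hypothetical lattice with competing hypotheses for the same time spans.
lattice = [
    PhonemeInstance("HH", 0, 10, 1.0),
    PhonemeInstance("AH", 10, 20, 2.0),
    PhonemeInstance("EH", 10, 20, 1.5),
    PhonemeInstance("L", 20, 30, 1.0),
    PhonemeInstance("L", 20, 40, 3.0),
    PhonemeInstance("OW", 30, 40, 1.0),
]
result = best_path(lattice, start=0, finish=40)
```

With these made-up scores the selected chain is HH, EH, L, OW with a total of 4.5, because EH is cheaper than AH over the same span and the short L followed by OW is cheaper than the long L alone.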
Specification