System and Method of Automated Evaluation of Transcription Quality
Abstract
Systems and methods automatedly evaluate a transcription quality. Audio data is obtained. The audio data is segmented into a plurality of utterances with a voice activity detector operating on a computer processor. The plurality of utterances are transcribed into at least one word lattice with a large vocabulary continuous speech recognition system operating on the processor. A minimum Bayes risk decoder is applied to the at least one word lattice to create at least one confusion network. At least one conformity ratio is calculated from the at least one confusion network.
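As a rough illustration of the segmentation step only, the sketch below splits a sequence of samples into utterances with a simple frame-energy threshold. The source does not say how its voice activity detector works; the energy rule, the function name `segment_utterances`, and the `frame_len`, `threshold`, and `min_silence_frames` parameters are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def segment_utterances(
    samples: Sequence[float],
    frame_len: int = 4,
    threshold: float = 0.01,
    min_silence_frames: int = 2,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of detected utterances.

    Hypothetical energy-threshold VAD (an assumption, not the patent's
    detector): a frame counts as speech when its mean squared amplitude is
    at least `threshold`; an utterance is closed after `min_silence_frames`
    consecutive non-speech frames.
    """
    utterances: List[Tuple[int, int]] = []
    start = None          # sample index where the current utterance began
    last_speech_end = 0   # sample index just past the last speech frame
    silence_run = 0       # consecutive non-speech frames inside an utterance

    for pos in range(0, len(samples), frame_len):
        frame = samples[pos:pos + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy >= threshold:
            if start is None:
                start = pos
            last_speech_end = pos + len(frame)
            silence_run = 0
        elif start is not None:
            silence_run += 1
            if silence_run >= min_silence_frames:
                utterances.append((start, last_speech_end))
                start = None
                silence_run = 0

    if start is not None:  # audio ended while still inside an utterance
        utterances.append((start, last_speech_end))
    return utterances
```

Each returned span would then be passed to the recognizer as one utterance. A production detector would normally work on real audio frames and use a trained model rather than a fixed threshold.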
20 Claims
1. A method of automated evaluation of a transcription quality, the method comprising:
- obtaining audio data;
- segmenting the audio data into a plurality of utterances with a voice activity detector operating on a computer processor;
- transcribing the plurality of utterances into at least one word lattice with a large vocabulary continuous speech recognition system operating on the processor;
- applying, with the processor, a minimum Bayes risk decoder to the at least one word lattice to create at least one confusion network representing the at least one word lattice as a plurality of sequential word bins and ε-bins; and
- calculating at least one conformity ratio from the at least one confusion network.
Dependent claims: 2, 3, 4, 8, 9, 10, 11, 12, 13

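To make the structure named in claim 1 concrete, here is a minimal sketch of a confusion network held as an ordered list of bins, each mapping an arc label to a probability. The `Bin` class, the `EPS` marker, the strict alternation of word bins and ε-bins, the idea that an ε-bin may also hold competing word arcs, and the example words and probabilities are all assumptions for illustration; none of them is taken from the source.

```python
from dataclasses import dataclass
from typing import Dict, List

EPS = "<eps>"  # hypothetical label for the empty (no-word) arc


@dataclass
class Bin:
    """One bin of a confusion network: competing arcs with probabilities."""
    arcs: Dict[str, float]    # arc label -> probability
    is_epsilon: bool = False  # True for an ε-bin, False for a word bin

    def best_arc(self) -> str:
        """Label of the most probable arc in this bin."""
        return max(self.arcs, key=self.arcs.get)


def one_best(network: List[Bin]) -> List[str]:
    """Pick the most probable arc in every bin and drop empty arcs."""
    best = (b.best_arc() for b in network)
    return [label for label in best if label != EPS]


# Illustrative network (made-up values): word bins alternate with ε-bins.
example_network = [
    Bin({"hello": 0.9, "yellow": 0.1}),
    Bin({EPS: 0.95, "a": 0.05}, is_epsilon=True),
    Bin({"world": 0.6, "word": 0.4}),
    Bin({EPS: 0.3, "wide": 0.7}, is_epsilon=True),
]
```

Calling `one_best(example_network)` reads off a single word sequence from the bins, which is one common way such a network is consumed.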
5. The method of claim 4, wherein calculating the conformity ratio for each confusion network further comprises:
- identifying a probability value of a most probable word arc in each word bin; and
- calculating a joint probability for each ε-bin and a preceding word bin;
wherein the conformity ratio is an average of the calculated joint probabilities for the confusion network.
Dependent claims: 6, 7

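The calculation recited in claim 5 can be sketched as follows. The claim does not define how the joint probability is formed, so this example assumes it is the product of the most probable word arc's probability in a word bin and the ε-arc probability in the ε-bin that follows it. That product, the `(word_bin, eps_bin)` pair layout, and the `EPS` label are assumptions, not the patent's stated method.

```python
from typing import Dict, List, Tuple

EPS = "<eps>"                # hypothetical label for the empty arc
BinArcs = Dict[str, float]   # arc label -> probability


def conformity_ratio(pairs: List[Tuple[BinArcs, BinArcs]]) -> float:
    """Average joint probability over (preceding word bin, ε-bin) pairs.

    Assumption: joint probability = P(best word arc) * P(ε arc).
    """
    if not pairs:
        raise ValueError("no word-bin/epsilon-bin pairs to average")

    joint_probs = []
    for word_bin, eps_bin in pairs:
        p_word = max(word_bin.values())   # most probable word arc
        p_eps = eps_bin.get(EPS, 0.0)     # probability of the empty arc
        joint_probs.append(p_word * p_eps)

    return sum(joint_probs) / len(joint_probs)
```

Under this assumed definition, a ratio near 1 means every word bin has a dominant arc and every ε-bin is almost certainly empty; lower values indicate more competing hypotheses.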
14. A system for automated evaluation of transcription quality, the system comprising:
- an audio data source upon which a plurality of audio data files are stored;
- a processor that receives the plurality of audio data files, segments the audio data files into a plurality of utterances and applies at least one transcription model to the plurality of utterances to transcribe the plurality of utterances into a word lattice;
- a non-transient computer readable medium communicatively connected to the processor and programmed with computer readable code that when executed by the processor causes the processor to:
  - apply a minimum Bayes risk decoder to the at least one word lattice to create at least one confusion network representing the at least one word lattice as a plurality of sequential word bins and ε-bins;
  - calculate at least one conformity ratio from the at least one confusion network; and
  - calculate a transcription quality score from the at least one conformity ratio.
Dependent claims: 15, 16

17. A non-transient computer readable medium programmed with computer readable code that upon execution by a processor causes the processor to:
- obtain audio data;
- segment the audio data into a plurality of utterances with a voice activity detector;
- transcribe the plurality of utterances into at least one word lattice with a large vocabulary continuous speech recognition system;
- apply a minimum Bayes risk decoder to the at least one word lattice to create at least one confusion network representing the at least one word lattice as a plurality of sequential word bins and ε-bins;
- calculate at least one conformity ratio from the at least one confusion network;
- calculate a transcription quality score from the at least one conformity ratio; and
- provide an indication of the transcription quality score.
Dependent claims: 18, 19, 20

Specification