Systems and methods for evaluating speaker suitability for automatic speech recognition aided transcription
Abstract
The invention is a method for determining the most efficient mode of transcription in a transcription system utilizing both a human transcriptionist and automated speech recognition, and systems employing this method. The invention allows determination of speaker suitability for automated speech recognition based on voice files that have already been transcribed by a human transcriptionist, and thus does not generally require a speaker to read a transcript and does not generally require a transcriptionist to transcribe a voice file specifically for the purposes of the determination. The invention allows one of several different modes of transcription to be associated with the speaker, and provides a method for determining which of these several different modes would maximize the efficiency of the transcription system for transcribing voice files generated by the speaker.
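As a rough illustration of the decision the abstract describes, the sketch below waits until a predetermined number of per-file scores exists for a speaker, averages them, and maps the average to a transcription mode. The mode labels, the thresholds, the use of a simple mean, the assumption that scores lie in [0, 1], and the `choose_mode` helper are all hypothetical; the source does not state how scores are combined or where any cutoff lies.

```python
from enum import Enum
from statistics import mean
from typing import Optional, Sequence


class Mode(Enum):
    # Hypothetical labels for the modes of transcription.
    EDIT_RECOGNIZED_TEXT = "edit recognized text"
    TRANSCRIBE_CLEANED_AUDIO = "transcribe voice file, non-speech noises removed"
    TRANSCRIBE_AUDIO = "transcribe voice file"


def choose_mode(
    scores: Sequence[float],
    required: int = 5,             # assumed "predetermined number" of scores
    edit_threshold: float = 0.8,   # assumed cutoff, not from the source
    clean_threshold: float = 0.5,  # assumed cutoff, not from the source
) -> Optional[Mode]:
    """Return a mode once enough scores exist, otherwise None.

    Scores are assumed to lie in [0, 1], with higher values meaning the
    recognized text was closer to the transcriptionist's text.
    """
    if len(scores) < required:
        return None  # keep accumulating already-transcribed files
    avg = mean(scores[-required:])  # average the most recent scores
    if avg >= edit_threshold:
        return Mode.EDIT_RECOGNIZED_TEXT
    if avg >= clean_threshold:
        return Mode.TRANSCRIBE_CLEANED_AUDIO
    return Mode.TRANSCRIBE_AUDIO
```

Returning `None` until enough scores are available mirrors the idea that no dedicated enrollment session is needed: the speaker's mode stays unset while ordinary, already-transcribed work accumulates.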
74 Claims
1. A method for determining a speaker's suitability for automated speech recognition in a transcription system comprising the steps of:
accumulating a predetermined number of the speaker's voice files and associated text files, wherein the associated text files are transcriptions of the voice files transcribed by a transcriptionist;
providing a voice file to a speech recognition engine to generate a recognized text file;
determining a score based on the recognized text file generated from the voice file and the associated text file of the voice file;
determining a preferred mode of transcription based on a predetermined number of scores, wherein the predetermined number of scores are scores generated from recognized text files and associated text files corresponding to voice files generated by the speaker; and
setting a transcription mode for the speaker, wherein the transcription mode indicates the method of transcription thereafter used in the transcription system.
Dependent claims: 2-35
36. A transcription system comprising:
a server for accumulating a predetermined number of the speaker's voice files and associated text files, wherein the associated text files are transcriptions of the voice files transcribed by a transcriptionist;
a speech recognition engine, wherein a voice file is provided to said speech recognition engine and a recognized text file is generated therefrom;
a means for determining a score based on the recognized text file generated from the voice file and the associated text file of the voice file;
a means for determining a preferred mode of transcription based on a predetermined number of scores, wherein the predetermined number of scores are scores generated from recognized text files and associated text files corresponding to voice files generated by the speaker; and
a means for setting a transcription mode for the speaker, wherein the transcription mode indicates the method of transcription thereafter used in the transcription system.
Dependent claims: 37-70
71. A method for determining a speaker's suitability for automated speech recognition in a transcription system comprising the steps of:
accumulating a predetermined number of the speaker's voice files and associated text files, wherein the associated text files are transcriptions of the voice files transcribed by a transcriptionist;
providing a voice file to a speech recognition engine to generate a recognized text file;
determining a score based on the recognized text file generated from the voice file and the associated text file of the voice file, wherein the score is based on an edit distance between the associated text file and the recognized text file, the number of pauses in the voice file, the amount of silence in the voice file, the audio quality of the voice file, the amount of correction required to bring the recognized text file into conformity with the finished text file, the amount of out of order text in the recognized text file, the amount of formatting in the associated text file, the confidence of the recognition engine in each of the tokens recognized and the ability of the recognition engine to generate a recognized text file using guided recognition, the time it takes the transcriptionist to edit a recognized text file to bring it into conformity with a finished text file, the number of keystrokes it takes the transcriptionist to edit a recognized text file to bring it into conformity with a finished text file, or any combination thereof;
determining a preferred mode of transcription based on a predetermined number of scores, wherein the predetermined number of scores are scores generated from recognized text files and associated text files corresponding to voice files generated by the speaker; and
setting a transcription mode for the speaker, wherein the transcription mode indicates the method of transcription thereafter used in the transcription system, and wherein the method of transcription is presenting a transcriptionist with a recognized text file for editing, presenting a transcriptionist with a voice file for transcribing, wherein non-speech noises have been removed from the voice file using automated speech recognition, or presenting the transcriptionist with a voice file for transcribing.
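One of the scoring factors named in claim 71, the edit distance between the associated text file and the recognized text file, can be sketched as follows. This is only an illustrative assumption about how such a score might be computed: the word-level tokenization, the lowercasing, the unit costs, the normalization to [0, 1], and the function names are not taken from the source, and the other listed factors (pauses, silence, audio quality, keystrokes, and so on) are not modeled.

```python
from typing import List


def word_edit_distance(reference: List[str], hypothesis: List[str]) -> int:
    """Levenshtein distance between two token lists (each edit costs 1)."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # drop a reference token
                curr[j - 1] + 1,     # add a hypothesis token
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def edit_distance_score(associated_text: str, recognized_text: str) -> float:
    """Map edit distance to [0, 1]; 1.0 means the texts match token for token."""
    ref = associated_text.lower().split()
    hyp = recognized_text.lower().split()
    if not ref and not hyp:
        return 1.0  # two empty texts are treated as identical
    distance = word_edit_distance(ref, hyp)
    # The distance never exceeds the longer length, so the result stays in [0, 1].
    return 1.0 - distance / max(len(ref), len(hyp))
```

For example, comparing the transcriptionist's text "the patient has a fever" with the recognized text "the patient has fever" gives a distance of 1 over 5 tokens, so the score is 0.8.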
72. A transcription system comprising:
a voice server for accumulating a predetermined number of the speaker's voice files, a text server for accumulating a predetermined number of associated text files, wherein the associated text files are transcriptions of the voice files transcribed by a transcriptionist;
a speech recognition engine, wherein a voice file is provided to said speech recognition engine and a recognized text file is generated therefrom;
a means for determining a score based on the recognized text file generated from the voice file and the associated text file of the voice file, wherein the score is based on an edit distance between the associated text file and the recognized text file, the number of pauses in the voice file, the amount of silence in the voice file, the audio quality of the voice file, the amount of correction required to bring the recognized text file into conformity with the finished text file, the amount of out of order text in the recognized text file, the amount of formatting in the associated text file, the confidence of the recognition engine in each of the tokens recognized and the ability of the recognition engine to generate a recognized text file using guided recognition, the time it takes the transcriptionist to edit a recognized text file to bring it into conformity with a finished text file, the number of keystrokes it takes the transcriptionist to edit a recognized text file to bring it into conformity with a finished text file, or any combination thereof;
a means for determining a preferred mode of transcription based on a predetermined number of scores, wherein the predetermined number of scores are scores generated from recognized text files and associated text files corresponding to voice files generated by the speaker; and
a means for setting a transcription mode for the speaker, wherein the transcription mode indicates the method of transcription thereafter used in the transcription system, and wherein the method of transcription is presenting a transcriptionist with a recognized text file for editing, presenting a transcriptionist with a voice file for transcribing, wherein non-speech noises have been removed from the voice file using automated speech recognition, or presenting the transcriptionist with a voice file for transcribing.
Dependent claims: 73, 74
Specification