Machine learning-based prediction of transcriber performance on a segment of audio
First Claim
1. A system configured to calculate an expected accuracy of a transcription by a certain transcriber, comprising:
- a computer configured to:
receive a segment of an audio recording, which comprises speech of a person;
identify, based on the segment, an accent of the person;
identify, based on a transcription of the segment generated using an automatic speech recognition (ASR) system, a topic of the segment;
generate feature values based on data comprising an indication of the accent and an indication of the topic; and
utilize a model to calculate, based on the feature values, a value indicative of an expected accuracy of a transcription of the segment by the certain transcriber;
wherein the model is generated based on training data comprising feature values generated based on segments of previous audio recordings, and values of accuracies of transcriptions, by the certain transcriber, of the segments.
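The steps recited in the claim can be illustrated with a minimal sketch. It is an assumption-laden illustration, not the patented implementation: the accent and topic are taken as already identified (by hypothetical upstream classifiers), the feature values are a simple one-hot encoding, and the model is a plain linear regression fitted by batch gradient descent. All names and the accuracy figures in `HISTORY` are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentInfo:
    accent: str  # assumed output of a hypothetical accent identifier
    topic: str   # assumed output of a hypothetical topic identifier run on the ASR transcript


def feature_values(info, accents, topics):
    """One-hot encode the accent and topic, with a leading bias term."""
    return ([1.0]
            + [1.0 if info.accent == a else 0.0 for a in accents]
            + [1.0 if info.topic == t else 0.0 for t in topics])


def train_model(samples, accents, topics, lr=0.1, steps=2000):
    """Fit linear weights by batch gradient descent on squared error.

    `samples` holds (SegmentInfo, accuracy) pairs for ONE transcriber,
    i.e. accuracies of that transcriber's earlier transcriptions.
    """
    rows = [(feature_values(info, accents, topics), acc) for info, acc in samples]
    weights = [0.0] * len(rows[0][0])
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in rows:
            err = sum(w * xi for w, xi in zip(weights, x)) - y
            for i, xi in enumerate(x):
                grad[i] += err * xi / len(rows)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


def expected_accuracy(weights, info, accents, topics):
    """Return a value indicative of expected accuracy, clamped to [0, 1]."""
    x = feature_values(info, accents, topics)
    raw = sum(w * xi for w, xi in zip(weights, x))
    return min(1.0, max(0.0, raw))


# Hypothetical vocabulary and training history for a single transcriber.
ACCENTS = ["indian", "scottish"]
TOPICS = ["legal", "medical"]
HISTORY = [
    (SegmentInfo("indian", "legal"), 0.95),
    (SegmentInfo("indian", "medical"), 0.85),
    (SegmentInfo("scottish", "legal"), 0.80),
    (SegmentInfo("scottish", "medical"), 0.70),
]
MODEL = train_model(HISTORY, ACCENTS, TOPICS)

if __name__ == "__main__":
    segment = SegmentInfo("scottish", "legal")
    print(round(expected_accuracy(MODEL, segment, ACCENTS, TOPICS), 3))
```

A separate set of weights would be trained per transcriber, since the training data in the claim consists of accuracies of transcriptions by the certain transcriber. A real system would likely use richer features and a different learner; the linear model here is only the simplest stand-in.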
Abstract
When transcribing large audio files, such as in the case of legal depositions, there are often many transcribers to choose from. Embodiments described herein enable calculation of expected accuracy of transcriptions by transcribers, which can be used to guide the selection of transcribers for specific tasks. In one embodiment, a computer receives a segment of an audio recording that includes speech of a person, and identifies an accent of the person and a topic of the segment. The computer generates feature values based on data that includes the accent and the topic, and utilizes a model to calculate, based on the feature values, an expected accuracy of a transcription of the segment by a certain transcriber. The model is generated based on training data that includes segments of previous audio recordings and values of accuracies of transcriptions, by the certain transcriber, of the segments.
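As a rough illustration of how expected accuracies might guide the selection of a transcriber, the sketch below scores each candidate for a segment and picks the highest. It assumes each transcriber already has a trained model; here each model is replaced by a hypothetical lookup table (`table_model`), and the transcriber names and scores are invented for the example.

```python
from typing import Callable, Dict, Tuple

# A per-transcriber model: maps (accent, topic) to an expected accuracy.
Model = Callable[[str, str], float]


def table_model(table: Dict[Tuple[str, str], float], default: float) -> Model:
    """Hypothetical stand-in for a trained model: a lookup with a fallback."""
    def predict(accent: str, topic: str) -> float:
        return table.get((accent, topic), default)
    return predict


def select_transcriber(models: Dict[str, Model], accent: str, topic: str) -> Tuple[str, float]:
    """Return the transcriber with the highest expected accuracy for the segment."""
    scores = {name: model(accent, topic) for name, model in models.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Invented candidates and scores, for illustration only.
MODELS: Dict[str, Model] = {
    "transcriber_a": table_model({("scottish", "legal"): 0.80}, default=0.75),
    "transcriber_b": table_model({("scottish", "legal"): 0.92}, default=0.70),
}

if __name__ == "__main__":
    print(select_transcriber(MODELS, "scottish", "legal"))
```

Taking the maximum is only one possible policy; a deployment might also weigh cost, availability, or turnaround time, none of which are modeled here.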
20 Claims
12. A method for calculating an expected accuracy of a transcription by a certain transcriber, comprising:
receiving a segment of an audio recording, which comprises speech of a person;
identifying, based on the segment, an accent of the person;
identifying, based on a transcription of the segment generated using an automatic speech recognition (ASR) system, a topic of the segment;
generating feature values based on data comprising an indication of the accent and an indication of the topic; and
utilizing a model to calculate, based on the feature values, a value indicative of an expected accuracy of a transcription of the segment by the certain transcriber;
wherein the model is generated based on training data comprising feature values generated based on segments of previous audio recordings, and values of accuracies of transcriptions, by the certain transcriber, of the segments.
20. A non-transitory computer-readable medium having instructions stored thereon that, in response to execution by a system including a processor and memory, cause the system to perform operations comprising:
receiving a segment of an audio recording, which comprises speech of a person;
identifying, based on the segment, an accent of the person;
identifying, based on a transcription of the segment generated using an automatic speech recognition (ASR) system, a topic of the segment;
generating feature values based on data comprising an indication of the accent and an indication of the topic; and
utilizing a model to calculate, based on the feature values, a value indicative of an expected accuracy of a transcription of the segment by a certain transcriber;
wherein the model is generated based on training data comprising feature values generated based on segments of previous audio recordings, and values of accuracies of transcriptions, by the certain transcriber, of the segments.
Specification