Combining Re-Speaking, Partial Agent Transcription and ASR for Improved Accuracy / Human Guided ASR
Abstract
A speech transcription system is described for producing a representative transcription text from one or more different audio signals representing one or more different speakers participating in a speech session. A preliminary transcription module develops a preliminary transcription of the speech session using automatic speech recognition having a preliminary recognition accuracy performance. A speech selection module enables user selection of one or more portions of the preliminary transcription to receive higher accuracy transcription processing. A final transcription module is responsive to the user selection for developing a final transcription output for the speech session having a final recognition accuracy performance for the selected one or more portions which is higher than the preliminary recognition accuracy performance.
26 Claims
1. A speech transcription system for producing a representative transcription text from one or more audio signals representing one or more speakers participating in a speech session, the system comprising:
a preliminary transcription module for developing a preliminary transcription of the speech session using automatic speech recognition having a preliminary recognition accuracy performance;
a speech selection module for user selection of one or more portions of the preliminary transcription to receive higher accuracy transcription processing; and
a final transcription module responsive to the user selection for developing a final transcription output for the speech session having a final recognition accuracy performance for the selected one or more portions which is higher than the preliminary recognition accuracy performance.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
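The three modules of claim 1 can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: `Segment`, `preliminary_transcribe`, `final_transcribe`, and the two recognizer callables (`fast_asr`, `accurate_asr`) are stand-ins. The sketch assumes the session audio is already split into timed chunks and that the user selection arrives as a set of segment indices.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple

# (start_seconds, end_seconds, audio_payload); the payload type is an assumption.
Chunk = Tuple[float, float, str]


@dataclass
class Segment:
    start: float
    end: float
    text: str
    refined: bool = False  # True once higher-accuracy processing was applied


def preliminary_transcribe(chunks: List[Chunk],
                           fast_asr: Callable[[str], str]) -> List[Segment]:
    """Preliminary module: run a fast, lower-accuracy recognizer on every chunk."""
    return [Segment(start, end, fast_asr(audio)) for start, end, audio in chunks]


def final_transcribe(chunks: List[Chunk],
                     preliminary: List[Segment],
                     selected: Set[int],
                     accurate_asr: Callable[[str], str]) -> List[Segment]:
    """Final module: re-process only the user-selected segments.

    `selected` plays the role of the speech selection module's output.
    `accurate_asr` stands in for any higher-accuracy path (a slower
    recognizer, a human transcriber, and so on).
    """
    final = []
    for index, (segment, (_, _, audio)) in enumerate(zip(preliminary, chunks)):
        if index in selected:
            final.append(Segment(segment.start, segment.end,
                                 accurate_asr(audio), refined=True))
        else:
            final.append(segment)  # unselected portions keep the preliminary text
    return final
```

Calling `preliminary_transcribe` first and then `final_transcribe` with the indices the user picked yields a transcript in which only the selected portions paid the cost of the more accurate path.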
11. A speech transcription system for producing a representative transcription text from one or more audio signals representing one or more speakers participating in a speech session, the system comprising:
a keyword transcript module for real time processing of one or more speech signals by a human agent to generate a partial transcript of keywords;
a transcript alignment module for time aligning the partial transcript with the one or more speech signals;
a speech transcription module for performing automatic speech recognition of the one or more speech signals as constrained by the time aligned partial transcript to produce a final transcription output for the speech session containing the keywords.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19
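One way to picture the keyword-constrained recognition of claim 11 is as n-best reranking. The sketch below is an assumption, not the patent's method: the claim does not say how alignment or constraining is done. The names `Keyword`, `Window`, `align_keywords`, `constrained_transcript`, and the fixed `agent_latency` offset are all hypothetical. Note that reranking only prefers hypotheses containing the keywords; it cannot guarantee them if no hypothesis has them.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Keyword:
    word: str
    typed_at: float  # seconds from session start when the agent entered it


@dataclass
class Window:
    start: float
    end: float
    nbest: List[Tuple[str, float]]  # (hypothesis text, ASR score), higher is better


def align_keywords(keywords: List[Keyword], windows: List[Window],
                   agent_latency: float = 1.0) -> Dict[int, List[str]]:
    """Assign each keyword to the audio window in which it was probably spoken.

    Assumes the agent types each keyword about `agent_latency` seconds
    after hearing it; keywords that fall outside every window are dropped.
    """
    aligned: Dict[int, List[str]] = {i: [] for i in range(len(windows))}
    for keyword in keywords:
        spoken_at = keyword.typed_at - agent_latency
        for index, window in enumerate(windows):
            if window.start <= spoken_at < window.end:
                aligned[index].append(keyword.word.lower())
                break
    return aligned


def constrained_transcript(windows: List[Window],
                           aligned: Dict[int, List[str]]) -> str:
    """Pick, per window, the hypothesis with the most aligned keywords.

    Ties on keyword count are broken by the ASR score.
    """
    chosen = []
    for index, window in enumerate(windows):
        required = aligned.get(index, [])

        def rank(item: Tuple[str, float]) -> Tuple[int, float]:
            text, score = item
            words = text.lower().split()
            hits = sum(1 for keyword in required if keyword in words)
            return (hits, score)

        chosen.append(max(window.nbest, key=rank)[0])
    return " ".join(chosen)
```

A production recognizer would more likely apply the keywords during decoding (for example by biasing the language model) rather than after it; reranking is used here only because it is easy to show in a few lines.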
20. A speech transcription system for producing a representative transcription text from one or more audio signals representing one or more speakers participating in a speech session, the system comprising:
a session monitoring module for user monitoring of the one or more audio signals;
a user re-speak module for receiving a user re-speaking of at least a portion of the speech session;
a session ASR module for generating a session recognition result corresponding to the one or more audio signals for the speech session;
a re-speak ASR module for generating a re-speak recognition result corresponding to the user re-speaking;
a session transcription module for combining the session recognition result and the re-speak recognition result to develop a final transcription output for the speech session.
Dependent claims: 21, 22, 23, 24, 25, 26
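The combining step of claim 20 could take many forms; the claim itself does not fix a rule. As a minimal sketch under stated assumptions, the code below merges the two recognition results segment by segment and keeps whichever has the higher confidence. `Recognition`, `combine_results`, and the confidence values are hypothetical, and the re-speak result is assumed to be already mapped to session segment indices.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Recognition:
    text: str
    confidence: float  # assumed to be comparable across both recognizers


def combine_results(session: List[Recognition],
                    respeak: Dict[int, Recognition]) -> List[str]:
    """Merge session ASR output with re-speak ASR output.

    `session` holds one result per segment of the original audio.
    `respeak` maps a segment index to the result for the user's
    re-speaking of that segment; segments never re-spoken are absent.
    """
    final = []
    for index, session_result in enumerate(session):
        respoken = respeak.get(index)
        if respoken is not None and respoken.confidence > session_result.confidence:
            final.append(respoken.text)  # re-speaking was recognized more reliably
        else:
            final.append(session_result.text)
    return final
```

Because re-speaking usually happens in a quiet channel with a single known speaker, its confidence will often win where the session audio is noisy or overlapping, while clean session segments pass through unchanged.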
Specification