Acoustic model training
Abstract
A method, executed by a computer, includes receiving a channel recording corresponding to a conversation, receiving a transcription for the conversation, generating a conversation-specific language model for the conversation using the transcription, and conducting speech recognition on the channel recording using the conversation-specific language model to provide time boundaries and written language corresponding to utterances within the channel recording. The method further includes determining sentence or phrase boundaries for the transcription, aligning written language within the one or more transcriptions with the written language corresponding to the utterances with the channel recording to provide sentence or phrase boundaries for the channel recording, and training a speech recognizer according to the sentence or phrase boundaries for the transcription and the sentence or phrase boundaries for the channel recording. A computer system and computer program product corresponding to the method are also disclosed herein.
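The alignment step described in the abstract can be illustrated with a small sketch. This is not the patented implementation: it assumes the recognizer output is already available as a list of words with start and end times, uses Python's standard `difflib.SequenceMatcher` as a stand-in for whatever aligner is actually used, and splits sentences on terminal punctuation only. The names `Word`, `split_sentences`, and `project_boundaries` are hypothetical and introduced purely for illustration.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Word:
    """One recognized word with its time boundaries (seconds). Hypothetical type."""
    text: str
    start: float
    end: float


def split_sentences(transcript: str) -> list[list[str]]:
    """Split a transcription on ., ? or ! and tokenize each sentence to lowercase words."""
    pieces = re.split(r"(?<=[.?!])\s+", transcript.strip())
    tokenized = [re.findall(r"[a-z0-9']+", p.lower()) for p in pieces]
    return [tokens for tokens in tokenized if tokens]


def project_boundaries(transcript: str, hyp: list[Word]):
    """Return one (start, end) time span per transcript sentence, or None if unmatched.

    Assumption: transcript words are aligned to recognizer words by longest
    matching blocks; words the recognizer got wrong are simply skipped.
    """
    sentences = split_sentences(transcript)
    ref = [w for s in sentences for w in s]
    hyp_words = [w.text.lower() for w in hyp]

    # Map each matched transcript word index to its recognizer word index.
    ref_to_hyp = {}
    matcher = SequenceMatcher(a=ref, b=hyp_words, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            ref_to_hyp[block.a + k] = block.b + k

    spans = []
    pos = 0
    for sentence in sentences:
        matched = [ref_to_hyp[i] for i in range(pos, pos + len(sentence)) if i in ref_to_hyp]
        pos += len(sentence)
        if matched:
            # Sentence boundary in the recording: first matched start, last matched end.
            spans.append((hyp[matched[0]].start, hyp[matched[-1]].end))
        else:
            spans.append(None)
    return spans


if __name__ == "__main__":
    hyp = [
        Word("hello", 0.0, 0.4), Word("how", 0.4, 0.6), Word("are", 0.6, 0.8),
        Word("you", 0.8, 1.0), Word("i", 1.5, 1.6), Word("am", 1.6, 1.8),
        Word("find", 1.8, 2.1), Word("thanks", 2.1, 2.6),  # "find" is a recognition error
    ]
    print(project_boundaries("Hello how are you. I am fine thanks.", hyp))
```

Under these assumptions, each returned span could be used to cut the channel recording into sentence-level audio segments, which are then paired with the corresponding transcript sentences as training data for the speech recognizer.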
20 Claims
1. A method, executed by one or more processors, the method comprising:
conducting speech recognition on a channel recording of a conversation to provide time boundaries and written language corresponding to utterances within the channel recording;
determining sentence or phrase boundaries for a transcription of the conversation;
aligning written language within the one or more transcriptions with the written language corresponding to the utterances with the channel recording to provide sentence or phrase boundaries for the channel recording; and
training a speech recognizer according to the sentence or phrase boundaries for the transcription and the sentence or phrase boundaries for the channel recording.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11

12. A computer system comprising:
one or more computer processors;
one or more computer readable storage media and program instructions stored on the one or more computer readable storage media, the program instructions comprising instructions to perform:
conducting speech recognition on a channel recording of a conversation to provide time boundaries and written language corresponding to utterances within the channel recording;
determining sentence or phrase boundaries for a transcription of the conversation;
aligning written language within the one or more transcriptions with the written language corresponding to the utterances with the channel recording to provide sentence or phrase boundaries for the channel recording; and
training a speech recognizer according to the sentence or phrase boundaries for the transcription and the sentence or phrase boundaries for the channel recording.
Dependent claims: 13, 14, 15, 16, 17, 18, 19

20. A method, executed by one or more processors, the method comprising:
conducting speech recognition on concurrent channel recordings of a conversation using at least one conversation specific language model to provide time boundaries and written language corresponding to utterances within the concurrent channel recordings;
determining sentence or phrase boundaries for a transcription of the conversation;
aligning written language within one or more transcriptions corresponding to the concurrent channel recordings with the written language corresponding to the utterances with the concurrent channel recording to provide sentence or phrase boundaries for the concurrent channel recordings; and
training a speech recognizer according to the sentence or phrase boundaries for the transcription and the sentence or phrase boundaries for the concurrent channel recordings.

Specification