MATERIAL SELECTION FOR LANGUAGE MODEL CUSTOMIZATION IN SPEECH RECOGNITION FOR SPEECH ANALYTICS
Abstract
A method for extracting, from non-speech text, training data for a language model for speech recognition includes: receiving, by a processor, non-speech text; selecting, by the processor, text from the non-speech text; converting, by the processor, the selected text to generate converted text comprising a plurality of phrases consistent with speech transcription text; training, by the processor, a language model using the converted text; and outputting, by the processor, the language model.
18 Claims
1. A method for extracting, from non-speech text, training data for a language model for speech recognition, the method comprising:
    receiving, by a processor, non-speech text;
    selecting, by the processor, text from the non-speech text;
    converting, by the processor, the selected text to generate converted text comprising a plurality of phrases consistent with speech transcription text;
    training, by the processor, a language model using the converted text; and
    outputting, by the processor, the language model.
Dependent claims: 2, 3, 4, 5.
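The steps of claim 1 can be pictured with a minimal sketch. Everything in it is an illustrative assumption rather than the patented implementation: the selection rule (keep lines containing letters), the conversion rule (lowercase, spell out digits one at a time, drop punctuation), the bigram-count "model", and all function names are hypothetical stand-ins.

```python
import re
from collections import Counter

# Hypothetical digit map; a real normalizer would also handle full
# numbers, dates, abbreviations, and similar written-only forms.
DIGITS = {"0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
          "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine"}

def select_text(lines):
    """Placeholder selection step: keep lines that contain at least one letter."""
    return [ln for ln in lines if re.search(r"[A-Za-z]", ln)]

def convert_to_spoken_form(line):
    """Make written text look more like a speech transcript."""
    line = line.lower()
    # Spell out each ASCII digit as a separate word.
    line = re.sub(r"[0-9]", lambda m: " " + DIGITS[m.group()] + " ", line)
    # Replace everything except letters, apostrophes, and spaces.
    line = re.sub(r"[^a-z' ]+", " ", line)
    return " ".join(line.split())

def train_bigram_counts(sentences):
    """Toy 'language model': raw bigram counts with sentence boundary markers."""
    counts = Counter()
    for s in sentences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        counts.update(zip(tokens, tokens[1:]))
    return counts

def build_language_model(non_speech_lines):
    """Receive -> select -> convert -> train -> output, as in claim 1."""
    selected = select_text(non_speech_lines)
    converted = [convert_to_spoken_form(ln) for ln in selected]
    return train_bigram_counts(converted)
```

For example, `convert_to_spoken_form("Call us at 555, Mr. Smith!")` yields `"call us at five five five mr smith"`, which is closer to what a transcriber would write than the original written form.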
6. A method for selecting, from non-speech text, training data for a language model for speech recognition, the method comprising:
    training, by a processor, a non-speech language model based on the non-speech text;
    for each unique sentence of the non-speech text:
        computing and normalizing, by the processor, an out-of-domain score of the unique sentence based on the non-speech language model;
        computing and normalizing, by the processor, an in-domain score of the unique sentence based on a speech transcription language model trained based on generic speech transcription training data;
        comparing, by the processor, the out-of-domain score to the in-domain score; and
        adding, by the processor, the unique sentence to an output set of selected text in response to determining that the in-domain score exceeds the out-of-domain score by a threshold; and
    outputting, by the processor, the output set of selected text.
Dependent claim: 7.
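The score comparison in claim 6 can be sketched as follows. The claim does not say which model family or normalization is used, so this sketch assumes an add-one smoothed unigram model and length normalization (average log-probability per token); `UnigramLM`, `select_sentences`, and the default threshold are hypothetical names and values chosen for illustration.

```python
import math
from collections import Counter

class UnigramLM:
    """Add-one smoothed unigram model; a stand-in for a real n-gram model."""

    def __init__(self, sentences, vocab):
        self.counts = Counter(w for s in sentences for w in s.split())
        self.total = sum(self.counts.values())
        self.vocab_size = len(vocab) + 1  # +1 reserves mass for unseen words

    def score(self, sentence):
        """Length-normalized log-probability (assumed normalization)."""
        words = sentence.split()
        if not words:
            return float("-inf")
        logp = sum(
            math.log((self.counts[w] + 1) / (self.total + self.vocab_size))
            for w in words
        )
        return logp / len(words)

def select_sentences(non_speech, speech_transcripts, threshold=0.0):
    """Keep sentences whose in-domain score beats the out-of-domain score."""
    vocab = {w for s in non_speech + speech_transcripts for w in s.split()}
    out_lm = UnigramLM(non_speech, vocab)         # non-speech language model
    in_lm = UnigramLM(speech_transcripts, vocab)  # speech transcription model
    selected = []
    for sentence in dict.fromkeys(non_speech):    # unique sentences, in order
        in_score = in_lm.score(sentence)
        out_score = out_lm.score(sentence)
        if in_score - out_score > threshold:
            selected.append(sentence)
    return selected
```

With a conversational transcript corpus as the in-domain data, a sentence such as "i want to cancel my account" tends to score higher in-domain than a formal sentence such as "pursuant to section nine the agreement terminates", so only the former is kept.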
8. A method for selecting, from non-speech text, training data for a language model for speech recognition, the method comprising:
    initializing, by a processor, an output set of selected text based on a plurality of sentences sampled from the non-speech text;
    for each unique sentence of the non-speech text:
        computing, by the processor, a first divergence between an in-domain language model trained on generic speech transcript text and a language model trained on the output set;
        computing, by the processor, a second divergence between the in-domain language model and a language model trained on the output set combined with the unique sentence;
        comparing, by the processor, the first divergence and the second divergence; and
        adding, by the processor, the sentence to the output set in response to determining that the second divergence is less than the first divergence; and
    outputting, by the processor, the output set of selected text.
Dependent claim: 9.
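The greedy loop of claim 8 can be sketched in the same spirit. The claim only says "divergence"; this sketch assumes Kullback-Leibler divergence between add-one smoothed unigram distributions over a shared vocabulary. The names `unigram_dist`, `kl_divergence`, `greedy_select`, and the `seed_size` parameter are hypothetical, and skipping sentences already in the seed set is a convenience of the sketch, not something the claim states.

```python
import math
import random
from collections import Counter

def unigram_dist(sentences, vocab):
    """Add-one smoothed unigram distribution over a fixed vocabulary."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts[w] for w in vocab) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}

def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same keys."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)

def greedy_select(non_speech, speech_transcripts, seed_size=1, rng=None):
    """Add a sentence only if it moves the output-set model toward the in-domain model."""
    rng = rng or random.Random(0)
    unique = list(dict.fromkeys(non_speech))
    vocab = sorted({w for s in unique + speech_transcripts for w in s.split()})
    in_domain = unigram_dist(speech_transcripts, vocab)
    # Initialize the output set with sentences sampled from the non-speech text.
    output = rng.sample(unique, min(seed_size, len(unique)))
    for sentence in unique:
        if sentence in output:
            continue  # already part of the seed set
        first = kl_divergence(in_domain, unigram_dist(output, vocab))
        second = kl_divergence(in_domain, unigram_dist(output + [sentence], vocab))
        if second < first:
            output.append(sentence)
    return output
```

Retraining a model for every candidate sentence is quadratic in the worst case; a practical version would more likely update counts incrementally, but the comparison of the two divergences is the same.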
10. A system comprising:
    a processor; and
    memory storing instructions that, when executed by the processor, cause the processor to:
        receive non-speech text;
        select text from the non-speech text;
        convert the selected text to generate converted text comprising a plurality of phrases consistent with speech transcription text;
        train a language model using the converted text; and
        output the language model.
Dependent claims: 11, 12, 13, 14.
15. A system comprising:
    a processor; and
    memory storing instructions that, when executed by the processor, cause the processor to:
        train a non-speech language model based on the non-speech text;
        for each unique sentence of the non-speech text:
            compute and normalize an out-of-domain score of the unique sentence based on the non-speech language model;
            compute and normalize an in-domain score of the unique sentence based on a speech transcription language model trained based on generic speech transcription training data;
            compare the out-of-domain score to the in-domain score; and
            add the unique sentence to an output set of selected text in response to determining that the in-domain score exceeds the out-of-domain score by a threshold; and
        output the output set of selected text.
Dependent claim: 16.
17. A system comprising:
    a processor; and
    memory storing instructions that, when executed by the processor, cause the processor to:
        initialize an output set of selected text based on a plurality of sentences sampled from the non-speech text;
        for each unique sentence of the non-speech text:
            compute a first divergence between an in-domain language model trained on generic speech transcript text and a language model trained on the output set;
            compute a second divergence between the in-domain language model and a language model trained on the output set combined with the unique sentence;
            compare the first divergence and the second divergence; and
            add the sentence to the output set in response to determining that the second divergence is less than the first divergence; and
        output the output set of selected text.
Dependent claim: 18.
Specification