Discriminative training of automatic speech recognition models with natural language processing dictionary for spoken language processing
2 Assignments
0 Petitions
Abstract
Methods and systems for language processing include training one or more automatic speech recognition models using an automatic speech recognition dictionary. A set of N automatic speech recognition hypotheses for an input is determined, based on the one or more automatic speech recognition models, using a processor. A best hypothesis is selected using a discriminative language model and a list of relevant words. Natural language processing is performed on the best hypothesis.
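The selection step in the abstract can be illustrated with a minimal sketch. It assumes the discriminative language model is a linear model over unigram and bigram counts, and that only words on the relevant-word list contribute features. The abstract does not specify the model form, so the function names, weights, scores, and sentences below are all hypothetical.

```python
from collections import Counter

def relevant_ngrams(words, relevant):
    """Count unigrams and bigrams over the relevant words only."""
    kept = [w for w in words if w in relevant]
    feats = Counter(kept)              # unigram counts
    feats.update(zip(kept, kept[1:]))  # bigram counts, keyed by tuple
    return feats

def rerank(nbest, weights, relevant, asr_weight=1.0):
    """Return the hypothesis text with the highest combined score.

    nbest: list of (text, asr_score) pairs; a higher asr_score is better.
    weights: feature -> weight mapping of a linear model (assumed form).
    """
    def score(item):
        text, asr_score = item
        feats = relevant_ngrams(text.lower().split(), relevant)
        lm_score = sum(weights.get(f, 0.0) * c for f, c in feats.items())
        return asr_weight * asr_score + lm_score
    return max(nbest, key=score)[0]

# Hypothetical data for illustration only.
relevant = {"book", "flight", "boston", "austin"}
weights = {"boston": 0.5, ("flight", "boston"): 1.2, "austin": -0.4}
nbest = [
    ("book a flight to austin", -10.0),
    ("book the flight to boston", -10.5),
]
best = rerank(nbest, weights, relevant)
```

In this sketch the words "a", "the", and "to" are ignored when scoring, so the choice between the two hypotheses depends only on the words that matter to the downstream natural language processing.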
17 Citations
18 Claims
1. A method for language processing, comprising:
training one or more automatic speech recognition models using an automatic speech recognition dictionary and speech recognition training data;
determining a set of N automatic speech recognition hypotheses that characterize a spoken input, based on the one or more automatic speech recognition models, using a processor;
selecting a hypothesis from the set of N automatic speech recognition hypotheses using a discriminative language model and a first natural language processing dictionary that excludes words having little discriminatory value according to an error rate of only words other than words having little likely effect on the natural language outcome in each hypothesis; and
performing natural language processing on the selected hypothesis using a second natural language processing dictionary that is different from the automatic speech recognition dictionary and the first natural language processing dictionary.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.
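The "error rate of only words other than words having little likely effect" in claim 1 can be read as a word error rate computed after discarding every word outside the first natural language processing dictionary. The sketch below follows that reading and assumes a reference transcript is available, as it would be during training. The function names and example sentences are hypothetical and are not taken from the patent.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (r != h)))   # substitution or match
        prev = curr
    return prev[-1]

def restricted_wer(reference, hypothesis, nlp_dictionary):
    """Word error rate over dictionary words only (assumed reading of the claim)."""
    ref = [w for w in reference.split() if w in nlp_dictionary]
    hyp = [w for w in hypothesis.split() if w in nlp_dictionary]
    if not ref:
        # No relevant reference words: report the number of spurious relevant words.
        return float(len(hyp))
    return edit_distance(ref, hyp) / len(ref)

# Hypothetical example data.
nlp_dictionary = {"book", "flight", "boston", "austin"}
reference = "i want to book a flight to boston"
hyp_a = "i want a book a flight to boston"   # error on a function word
hyp_b = "i want to book a flight to austin"  # error on a relevant word
wer_a = restricted_wer(reference, hyp_a, nlp_dictionary)
wer_b = restricted_wer(reference, hyp_b, nlp_dictionary)
```

Each hypothesis differs from the reference by one substituted word, so an ordinary word error rate cannot separate them. The restricted rate is zero for `hyp_a` and one third for `hyp_b`, because only the second error touches a dictionary word.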
10. A method for language processing, comprising:
training one or more automatic speech recognition models using an automatic speech recognition dictionary and speech recognition training data;
determining a set of N automatic speech recognition hypotheses that characterize a spoken input, based on the one or more automatic speech recognition models, using a processor, comprising:
  concatenating all words in the natural language processing training data to generate raw natural language processing text;
  tokenizing the raw natural language processing text using the automatic speech recognition dictionary;
  collecting tokenized words that appear more than a threshold number of times; and
  forming a first natural language processing dictionary that excludes words having little discriminatory value using the collected tokenized words;
selecting a hypothesis from the set of N automatic speech recognition hypotheses using a discriminative language model and the first natural language processing dictionary input according to an error rate of only words other than words having little likely effect on the natural language processing outcome in each hypothesis, said selection comprising:
  determining a word error rate for each hypothesis that considers only the first natural language processing dictionary using the discriminative language model; and
  selecting the hypothesis having the lowest word error rate; and
performing natural language processing on the selected hypothesis using a second natural language processing dictionary that is different from the automatic speech recognition dictionary and the first natural language processing dictionary.
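The dictionary-forming and selection steps of claim 10 can be sketched as follows. Two assumptions are made here. First, tokenizing "using the automatic speech recognition dictionary" is simplified to whitespace splitting followed by a membership check. Second, the collected frequent tokens are taken to be the words kept in the first natural language processing dictionary; the claim text leaves open exactly how the collected words are used. The per-hypothesis error estimate is supplied by the caller as a stand-in for the discriminative language model, and all names and data are hypothetical.

```python
from collections import Counter

def build_nlp_dictionary(training_texts, asr_dictionary, threshold):
    """Form a word set from NLP training text by frequency thresholding."""
    # Concatenate all training text into one raw string.
    raw_text = " ".join(training_texts)
    # Simplified tokenization: split on whitespace, keep words the ASR dictionary knows.
    tokens = [w for w in raw_text.lower().split() if w in asr_dictionary]
    counts = Counter(tokens)
    # Collect tokens that appear more than `threshold` times.
    return {w for w, c in counts.items() if c > threshold}

def select_hypothesis(nbest, estimated_error):
    """Pick the hypothesis with the lowest estimated restricted error rate."""
    return min(nbest, key=estimated_error)

# Hypothetical example data.
training_texts = [
    "book flight boston",
    "book hotel boston",
    "cancel flight",
    "flight status bos",
]
asr_dictionary = {"book", "flight", "boston", "hotel", "cancel", "status"}
nlp_dictionary = build_nlp_dictionary(training_texts, asr_dictionary, threshold=1)

# Stand-in error estimates; in the claim these come from the discriminative language model.
estimates = {
    "book a flight to austin": 0.33,
    "book a flight to boston": 0.0,
}
selected = select_hypothesis(list(estimates), estimates.get)
```

With a threshold of 1, the sketch keeps "book", "flight", and "boston", each of which occurs at least twice. The token "bos" is dropped during tokenization because it is not in the assumed automatic speech recognition dictionary.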
11. A system for language processing, comprising:
an automatic speech recognition module comprising a processor configured to train one or more automatic speech recognition models using an automatic speech recognition dictionary and speech recognition training data, to determine a set of N automatic speech recognition hypotheses that characterize a spoken input based on the one or more automatic speech recognition models, and to select a hypothesis from the set of N automatic speech recognition hypotheses using a discriminative language model and first natural language processing dictionary that excludes words having little discriminatory value according to an error rate of only words other than words having little likely effect on the natural language processing outcome in each hypothesis; and
a natural language processing module configured to perform natural language processing on the selected hypothesis using a second natural language processing dictionary that is different from the automatic speech recognition dictionary and the first natural language processing dictionary.
Dependent claims: 12, 13, 14, 15, 16, 17, 18.
Specification