System and method for natural language processing
Abstract
A system and method for improving the accuracy of natural language processing using a plurality of speech recognition engines, a data fusion model that identifies a correct result from among the results of the plurality of speech recognition engines, and a semantic understanding model, separate and distinct from the speech recognition engines, that processes the correct results. A corpus is developed from the correct results, and the corpus is used to train the data fusion model and the semantic understanding model.
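The selection step performed by the data fusion model can be illustrated with a minimal sketch. The abstract does not say how the probability of a result being correct is computed, so the sketch assumes each engine reports a confidence score and that the fused score of a transcript is the sum of the confidences of the engines that produced it. The names `EngineResult` and `fuse`, and the scoring rule itself, are hypothetical and are not taken from the patent.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineResult:
    engine: str        # hypothetical engine name
    text: str          # transcript produced by that engine
    confidence: float  # assumed per-engine score in [0, 1]


def fuse(results):
    """Return (transcript, probability) for the best-supported transcript.

    Assumption: the fused score of a transcript is the sum of the
    confidences of all engines that produced it, normalized over all
    transcripts so the scores can be read as probabilities.
    """
    if not results:
        raise ValueError("need at least one speech recognition result")
    scores = defaultdict(float)
    for r in results:
        # Treat transcripts that differ only in case or padding as equal.
        scores[r.text.strip().lower()] += r.confidence
    total = sum(scores.values())
    best = max(scores, key=scores.get)
    probability = scores[best] / total if total > 0 else 0.0
    return best, probability
```

With three engines, two of which agree, the agreeing transcript is selected even if the dissenting engine reports the single highest confidence, which is the main reason to fuse results rather than trust one engine.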
6 Claims
1. A system for improving accuracy of natural language processing, the system comprising:
a natural language input device;
a plurality of speech recognition engines for automatic speech recognition functions only, the plurality of speech recognition engines being connected to the input device, the plurality of speech recognition engines receive an input from the input device and present a speech recognition result as part of a set of speech recognition results;
a data fusion model to receive the set of speech recognition results and to identify a correct result from the set of speech recognition results, the correct result being identified as a result in the set of speech recognition results that has the highest probability of being a correct result from the plurality of speech recognition engines;
the data fusion model to receive all of the speech recognition results and to determine a correct result from speech recognition results, the determined correct result being selected from a result in the set of speech recognition results that has a low probability of being a correct result and is manually determined to be a normal expression of the input from the input device;
a semantic understanding model, separate and distinct from the plurality of speech recognition engines, to process the identified correct result and the determined correct result;
a corpora created from the processed correct result;
a corpus arranged from a plurality of the corpora; and
the data fusion model being updated by the corpus.
3. A method for natural language processing in a system having a natural language input device, a plurality of speech recognition engines, a data fusion model and a semantic understanding model, the method carried out in a processor having computer executable instructions for performing the steps of:
receiving, at the natural language input device, an input sentence;
processing the input sentence at the plurality of speech recognition engines, each of the plurality of speech recognition engines producing a result that is part of a set of results for all of the speech recognition engines;
recording all of the results from the plurality of speech recognition engines to develop a corpora;
applying the data fusion model to identify a correct result from the set of results, the correct result being identified as a result in the set of speech recognition results that has the highest probability of being a correct result;
applying the data fusion model to determine a correct result from all of the results, the correct result being determined from one or more results from the set of results for the input sentence that has a low probability of being a correct result, determining manually that the input sentence is a normal expression, and adding the input sentence to the developed corpora;
processing the identified correct result and the determined correct result in the semantic understanding model;
collecting the processed correct results in a corpus; and
updating the data fusion model using the corpus.
5. A non-transitory computer readable storage medium comprising a program, which, when executed by one or more processors, performs an operation comprising:
processing an input sentence received by an input device using a plurality of speech recognition engines;
recording all of the results from the plurality of speech recognition engines to develop a corpora;
producing a set of results that includes all results for each speech recognition engine in the plurality of speech recognition engines;
applying a data fusion model to the set of results to identify a correct result from the set of results;
applying the data fusion model to all of the results to determine a correct result from all of the results;
processing the identified correct result in the semantic understanding model, the identified correct result being identified as a result in the set of speech recognition results that has the highest probability of being a correct result;
processing the determined correct result in the semantic understanding model, the determined correct result being determined from a result in the set of speech recognition results that has a low probability of being a correct result, determining manually that the input sentence is a normal expression, and adding the input sentence to the developed corpora; and
updating the data fusion model using the processed correct results.
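The operation recited above can be sketched as a small feedback loop. This is an illustration under stated assumptions, not the claimed implementation: the claims do not define how probability is computed, where the boundary between high and low probability lies, or how the corpus updates the data fusion model. The sketch assumes a weighted vote, a hypothetical cut-off `THRESHOLD`, a callback `is_normal_expression` standing in for the manual determination, and a smoothed per-engine re-weighting. All names are hypothetical.

```python
from collections import defaultdict

THRESHOLD = 0.6  # assumed boundary between "high" and "low" probability


class FusionModel:
    """Toy data fusion model: weighted vote over engine transcripts."""

    def __init__(self, engines):
        self.weights = {name: 1.0 for name in engines}

    def select(self, results):
        """results maps engine name to transcript; returns (text, probability)."""
        votes = defaultdict(float)
        for engine, text in results.items():
            votes[text] += self.weights[engine]
        best = max(votes, key=votes.get)
        return best, votes[best] / sum(votes.values())

    def update(self, corpus):
        """Re-weight each engine by how often it matched an accepted result."""
        for engine in self.weights:
            hits = sum(
                1 for entry in corpus
                if entry["results"][engine] == entry["correct"]
            )
            # Add-one smoothing keeps every weight strictly positive.
            self.weights[engine] = (hits + 1) / (len(corpus) + 2)


def understand(text):
    """Placeholder for the separate semantic understanding model."""
    verb, _, rest = text.partition(" ")
    return {"intent": verb, "slots": rest}


def process(results, model, corpus, is_normal_expression):
    """Fuse one input's results, route low-probability ones to review."""
    text, prob = model.select(results)
    if prob < THRESHOLD and not is_normal_expression(text):
        return None  # rejected in manual review; nothing is added
    # All engine results are recorded alongside the accepted one.
    entry = {
        "results": dict(results),
        "correct": text,
        "meaning": understand(text),
    }
    corpus.append(entry)
    return entry
```

Calling `process` once per input and then `model.update(corpus)` closes the loop: engines that more often agreed with accepted results carry more weight in later votes.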
Specification