Robust information extraction from utterances
First Claim
1. A method for classifying utterances, comprising the steps of:
- providing access to a speech recognition engine;
- receiving, by the speech recognition engine, an utterance as input;
- using an automatic speech recognizer to convert the utterance into a text string;
- generating an action-based concept tagged document of the text string;
- classifying the text string into one or more semantic classes based at least in part on the action-based concept tagged document;
- predicting paraphrased representations of the input with respect to a source language based on the one or more semantic classes; and
- outputting a top candidate of the paraphrased representations in a target language.
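The steps of the claim can be illustrated with a minimal sketch. It starts from the text string an automatic speech recognizer would produce, so the recognizer itself is assumed and not shown. The concept lexicon, the semantic classes, the English and Spanish paraphrases, and the function names (tag_concepts, classify, process) are all hypothetical and chosen only for illustration; the claim does not specify how tagging, classification, or paraphrase ranking is carried out.

```python
# Hypothetical concept lexicon: surface word -> concept tag (assumed for illustration).
CONCEPT_LEXICON = {
    "show": "ACTION:SHOW", "give": "ACTION:SHOW",
    "hurt": "ACTION:HURT", "hurts": "ACTION:HURT",
    "passport": "OBJ:DOCUMENT", "papers": "OBJ:DOCUMENT",
    "arm": "OBJ:BODY_PART", "leg": "OBJ:BODY_PART",
}

# Hypothetical semantic classes: the concepts each class expects, plus one
# canonical paraphrase in the source language and one in the target language.
SEMANTIC_CLASSES = {
    "REQUEST_DOCUMENT": {
        "concepts": {"ACTION:SHOW", "OBJ:DOCUMENT"},
        "source": "Please show me your documents.",
        "target": "Por favor, muéstreme sus documentos.",
    },
    "REPORT_PAIN": {
        "concepts": {"ACTION:HURT", "OBJ:BODY_PART"},
        "source": "A part of my body hurts.",
        "target": "Me duele una parte del cuerpo.",
    },
}


def tag_concepts(text):
    """Return the concept tags found in the text; unknown words are skipped."""
    words = [w.strip(".,?!") for w in text.lower().split()]
    return [CONCEPT_LEXICON[w] for w in words if w in CONCEPT_LEXICON]


def classify(tags):
    """Rank semantic classes by the fraction of their expected concepts seen."""
    seen = set(tags)
    scored = [
        (len(seen & spec["concepts"]) / len(spec["concepts"]), name)
        for name, spec in SEMANTIC_CLASSES.items()
    ]
    return [(name, score) for score, name in sorted(scored, reverse=True) if score > 0]


def process(recognized_text):
    """Tag, classify, and return the top paraphrase in the target language."""
    ranked = classify(tag_concepts(recognized_text))
    if not ranked:
        return None  # no semantic class matched
    top_class = ranked[0][0]
    return SEMANTIC_CLASSES[top_class]["target"]


print(process("uh show me your um passport"))
```

Because classification depends only on the concept tags, filler words and disfluencies in the recognized text do not affect the result in this sketch.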
3 Assignments
0 Petitions
Abstract
The performance of traditional speech recognition systems (as applied to information extraction or translation) decreases significantly with larger domain size and scarce training data, as well as under noisy environmental conditions. This invention mitigates these problems by introducing a novel predictive feature extraction method that combines linguistic and statistical information to represent the information embedded in a noisy source language. The predictive features are combined with text classifiers to map the noisy text to one of a set of semantically or functionally similar groups. The features used by the classifier can be syntactic, semantic, and statistical.
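As a rough illustration of combining statistical and semantic features in a text classifier, the sketch below adds concept features to plain word features and feeds both to a small multinomial Naive Bayes model with add-one smoothing. Naive Bayes is a stand-in chosen here for brevity; the abstract does not name a classifier. The concept map, the training sentences, and the class labels are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical word -> concept map standing in for linguistic knowledge.
CONCEPTS = {
    "fever": "SYMPTOM", "cough": "SYMPTOM", "pain": "SYMPTOM",
    "water": "SUPPLY", "food": "SUPPLY", "blanket": "SUPPLY",
}


def extract_features(text):
    """Combine word features (statistical) with concept features (semantic)."""
    words = text.lower().split()
    feats = ["w=" + w for w in words]
    feats += ["c=" + CONCEPTS[w] for w in words if w in CONCEPTS]
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_counts = Counter()
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, examples):
        for text, label in examples:
            self.class_counts[label] += 1
            for f in extract_features(text):
                self.feat_counts[label][f] += 1
                self.vocab.add(f)

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / total)
            denom = sum(self.feat_counts[label].values()) + len(self.vocab)
            for f in extract_features(text):
                if f in self.vocab:  # features never seen in training are ignored
                    score += math.log((self.feat_counts[label][f] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Deliberately tiny training set, to mimic scarce training data.
model = NaiveBayes()
model.train([
    ("i have a fever", "MEDICAL"),
    ("he has a bad cough", "MEDICAL"),
    ("we need water", "SUPPLIES"),
    ("please bring food", "SUPPLIES"),
])

# "pain" never occurs in training, but its SYMPTOM concept does.
print(model.predict("uh the pain is bad"))
```

The word "pain" is absent from the training sentences, yet the shared SYMPTOM concept still contributes evidence for the MEDICAL class. This is one way semantic features can compensate for scarce training data and noisy input.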
328 Citations
19 Claims
Specification