Grammar confusability metric for speech recognition
Abstract
Architecture for testing an application grammar for the presence of confusable terms. A grammar confusability metric (GCM) is generated for describing a likelihood that a reference term will be confused by the speech recognizer with another term phrase currently allowed by active grammar rules. The GCM is used to flag processing of two phrases in the grammar that have different semantic meaning, but that the speech recognizer could have difficulty distinguishing reliably. A built-in acoustic model is analyzed, and feature vectors are generated that are close to the acoustic properties of the input term. The feature vectors are then sent for recognition. A statistically random sampling method is applied to explore the acoustic properties of feature vectors of the input term phrase spatially and temporally. The feature vectors are perturbed in the neighborhood of the time domain and the Gaussian mixture model to which the feature vectors belong.
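As a rough illustration of the sampling step described in the abstract, the Python sketch below draws feature vectors from per-senone Gaussian mixtures. It perturbs them spatially by sampling around a mixture component's mean, and temporally by holding each senone for a random number of frames. The `Gaussian` class, the toy `MODEL`, the `spread` and `max_repeat` parameters, and both function names are assumptions made for illustration only; the text above does not specify them, and a real acoustic model would be far larger.

```python
import random
from dataclasses import dataclass


@dataclass
class Gaussian:
    """One diagonal-covariance mixture component (hypothetical toy type)."""
    weight: float
    mean: list
    stddev: list


def sample_vector(mixture, rng, spread=1.0):
    """Spatial perturbation: pick a component by weight, then sample near its mean."""
    comp = rng.choices(mixture, weights=[g.weight for g in mixture], k=1)[0]
    return [rng.gauss(m, spread * s) for m, s in zip(comp.mean, comp.stddev)]


def synthesize_frames(senone_ids, model, rng, max_repeat=3):
    """Temporal perturbation: hold each senone for 1..max_repeat frames."""
    frames = []
    for sid in senone_ids:
        for _ in range(rng.randint(1, max_repeat)):
            frames.append(sample_vector(model[sid], rng))
    return frames


# Toy two-dimensional model keyed by senone ID (assumed values, not from the patent).
MODEL = {
    0: [Gaussian(0.7, [0.0, 0.0], [1.0, 1.0]), Gaussian(0.3, [2.0, 2.0], [0.5, 0.5])],
    1: [Gaussian(1.0, [5.0, -1.0], [1.0, 2.0])],
}

frames = synthesize_frames([0, 1, 0], MODEL, random.Random(0))
print(len(frames), "frames synthesized")
```

Passing a seeded `random.Random` keeps each run reproducible, which makes it easier to compare confusability results across grammar revisions.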
19 Claims
1. A computer-implemented system that facilitates speech recognition, comprising:
a vector component for generating feature vectors that approximate acoustical properties of an input term;
a metric component for recognition processing of the feature vectors based on multiple iterations and generating multiple iteration confusability metrics respectively for each of the multiple iterations; and
an aggregation component for aggregating the multiple iteration confusability metrics and generating an overall confusability metric based on the multiple iterations of recognition processing of the feature vectors.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.
10. A computer-implemented method of performing speech recognition employing a computer programmed to perform the method, comprising:
converting an input term into a set of senone IDs;
randomly selecting feature vectors that are representative of distributions of the set of senone IDs;
driving a recognition process using the feature vectors to output a result;
perturbing the feature vectors in at least one of spatially or temporally for neighboring samples; and
aggregating results from multiple iterations of the input term into an overall confusability metric.
Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18.
19. A computer-implemented system, comprising:
computer-implemented means for converting an input term into a set of senone IDs;
computer-implemented means for randomly selecting feature vectors that are representative of distributions of the set of senone IDs;
computer-implemented means for driving a recognition process using the feature vectors to output a result;
computer-implemented means for perturbing the feature vectors in at least one of spatially or temporally for neighboring samples; and
computer-implemented means for aggregating results from multiple iterations of the input term into an overall confusability metric.
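The iterate-and-aggregate pattern recited in the independent claims can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the per-iteration metric is assumed to be a simple 0/1 indicator of misrecognition, the overall metric is assumed to be the mean of those indicators, and `CENTERS`, `synthesize`, and `recognize` are a hypothetical one-dimensional stand-in for a real acoustic model and recognizer.

```python
import random
from collections import Counter

# Hypothetical toy grammar: each phrase sits at a point on a 1-D "acoustic" axis.
CENTERS = {"call bob": 0.0, "call rob": 1.0, "open mail": 10.0}


def synthesize(term, rng):
    """Stand-in for feature-vector generation: one noisy observation per call."""
    return [rng.gauss(CENTERS[term], 1.0)]


def recognize(frames):
    """Stand-in recognizer: return the phrase whose center is nearest the frames."""
    observed = sum(frames) / len(frames)
    return min(CENTERS, key=lambda phrase: abs(CENTERS[phrase] - observed))


def confusability(reference, synthesize, recognize, iterations=100, seed=0):
    """Run many recognition iterations and aggregate them into one metric.

    Assumed per-iteration metric: 1 if the result differs from the reference,
    else 0. Assumed aggregation: the mean over all iterations.
    """
    rng = random.Random(seed)
    confusions = Counter()
    for _ in range(iterations):
        result = recognize(synthesize(reference, rng))
        if result != reference:
            confusions[result] += 1
    overall = sum(confusions.values()) / iterations
    return overall, confusions


for phrase in CENTERS:
    overall, confusions = confusability(phrase, synthesize, recognize, iterations=200)
    print(f"{phrase!r}: overall={overall:.2f}, confused with {dict(confusions)}")
```

Returning the per-phrase `Counter` alongside the overall value shows which competing phrase caused the confusion, which is the information a grammar author would need to act on a flagged term.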
Specification