Method and apparatus for automatic grammar generation from data entries
Abstract
A method of generating an optimized grammar for use in speech recognition from a data set, such as a large list of items, is disclosed. The method includes the steps of obtaining a tree representing items in the data set and generating the grammar using the tree. The tree is a simulated recognition search tree, which can be automatically generated from the data set.
16 Claims
1. A computer-implemented method of generating a speech recognition grammar, the method comprising:
using a processor to automatically generate a simulated recognition search tree representing items in a data set, wherein the simulated recognition search tree representing items in the data set comprises nodes and arcs between nodes, each arc representing a word from the data set which can be recognized, and wherein generating the simulated recognition search tree comprises;
using the processor to build the simulated recognition search tree such that each word to be recognized in the data set is represented by an arc between nodes in the simulated recognition search tree by labeling each arc with its corresponding word to be recognized and by also labeling each arc with a weight which comprises a count of the phrases in the data set which share the word corresponding to the arc and which are represented by a path in the arc;
using the processor to determine whether a phrase belonging to the data set is a complete expression to be accepted by a node in the simulated recognition search tree; and
using the processor to identify the node corresponding to the complete expression as a terminal node of the automatically generated simulated recognition search tree if the phrase is determined to be a complete expression;
using the processor to store, for each terminal node of the automatically generated simulated recognition search tree, semantic markup language (SML), to be returned by a speech recognition engine in response to the complete expression, in association with the terminal node of the automatically generated simulated recognition search tree;
using the processor to identify terminal nodes of the automatically generated simulated recognition search tree with two complete expressions which reach the terminal node, but with different SML to return, as being indicative of collisions;
using the processor to automatically generate the speech recognition grammar using the simulated recognition search tree; and
using the processor to store the speech recognition grammar on a computer storage medium for use in speech recognition.
Dependent claims: 2, 3, 4
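The tree-building and collision-detection steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the names `Node`, `Arc`, `build_tree`, and `find_collisions` are hypothetical, phrases are assumed to be split on whitespace and lower-cased, and SML is treated as an opaque string. Each arc carries its word and a weight counting the entries whose path includes it; a node that stores more than one distinct SML string is reported as a collision.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Outgoing arcs, keyed by the word that labels each arc.
    arcs: dict = field(default_factory=dict)
    # Distinct SML strings stored at this node; non-empty means terminal.
    sml: list = field(default_factory=list)


@dataclass
class Arc:
    word: str
    target: Node
    weight: int = 0  # number of entries whose path includes this arc


def build_tree(entries):
    """Build a word-level prefix tree from (phrase, sml) pairs."""
    root = Node()
    for phrase, sml in entries:
        node = root
        # Assumption: whitespace tokenization and case folding.
        for word in phrase.lower().split():
            arc = node.arcs.get(word)
            if arc is None:
                arc = node.arcs[word] = Arc(word, Node())
            arc.weight += 1
            node = arc.target
        # The node reached by the last word accepts the complete expression.
        if sml not in node.sml:
            node.sml.append(sml)
    return root


def find_collisions(node, prefix=()):
    """Yield (phrase, sml_list) for terminal nodes that hold differing SML."""
    if len(node.sml) > 1:
        yield " ".join(prefix), list(node.sml)
    for word, arc in node.arcs.items():
        yield from find_collisions(arc.target, prefix + (word,))


entries = [
    ("Joe Smith", "<id>1</id>"),
    ("Joe Jones", "<id>2</id>"),
    ("joe smith", "<id>3</id>"),  # same words as the first entry, different SML
]
root = build_tree(entries)
collisions = list(find_collisions(root))
```

With these three entries, the arc for "joe" has weight 3, the arc for "smith" has weight 2, and the node reached by "joe smith" is flagged because it stores two different SML strings.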
5. A computer storage medium having stored thereon computer-executable instructions for implementing speech recognition context free grammar generating steps comprising:
automatically generating a simulated recognition search tree from a data set, including building the simulated recognition search tree such that each word to be recognized in the data set is represented by an arc between nodes in the simulated recognition search tree and a weight of the word to be recognized, building the simulated recognition search tree further comprising labeling each arc with its corresponding word to be recognized and labeling each arc with the weight of the word to be recognized, wherein the weight comprises a count of the phrases in the data set which share the word corresponding to the arc and which are represented by a path including the arc;
generating the speech recognition context free grammar using the tree; and
storing the speech recognition context free grammar on a computer storage medium for use in speech recognition.
Dependent claims: 6, 7, 8, 9, 10, 11
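One way to picture the "generating the speech recognition context free grammar using the tree" step of claim 5 is to turn each non-leaf node into a rule whose alternatives are its outgoing arcs. The sketch below is only an illustration under stated assumptions: the nested-dict tree layout, the function names `build_tree` and `tree_to_rules`, the rule names `N0`, `N1`, ..., and the `/weight/ word` text notation are all hypothetical and are not a format named by the patent or required by any speech engine.

```python
def build_tree(phrases):
    """Build a nested-dict tree.

    Each node is {"arcs": {word: [weight, child]}, "end": bool}.
    """
    root = {"arcs": {}, "end": False}
    for phrase in phrases:
        node = root
        for word in phrase.split():
            arc = node["arcs"].setdefault(word, [0, {"arcs": {}, "end": False}])
            arc[0] += 1  # one more phrase passes through this arc
            node = arc[1]
        node["end"] = True  # a complete expression ends here
    return root


def tree_to_rules(root):
    """Return one rule string per non-leaf node, in visiting order."""
    rules = []
    counter = 0

    def visit(node):
        nonlocal counter
        name = f"N{counter}"
        counter += 1
        index = len(rules)
        rules.append(None)  # reserve the slot so parents precede children
        alternatives = []
        if node["end"]:
            # A phrase may stop here even though longer phrases continue.
            alternatives.append("<empty>")
        for word, (weight, child) in sorted(node["arcs"].items()):
            if child["arcs"]:
                alternatives.append(f"/{weight}/ {word} {visit(child)}")
            else:
                alternatives.append(f"/{weight}/ {word}")
        rules[index] = f"{name} -> " + " | ".join(alternatives)
        return name

    visit(root)
    return rules


rules = tree_to_rules(build_tree(["call joe smith", "call joe jones", "call ann"]))
for rule in rules:
    print(rule)
# N0 -> /3/ call N1
# N1 -> /1/ ann | /2/ joe N2
# N2 -> /1/ jones | /1/ smith
```

Because shared prefixes map to a single arc, the word "call" appears once in the output with weight 3 rather than once per phrase, which is the sense in which a tree-derived grammar is more compact than a flat list of phrases.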
12. A grammar generation system, embodied in a computer, for generating a speech recognition context free grammar for use by a speech recognition engine, the system comprising:
a processing unit and computer storage medium, the computer storage medium having stored thereon computer-executable instructions which configure the processing unit to implement grammar generation system components comprising;
a search tree module configured to automatically generate a simulated recognition search tree representing items in a data set, including building the simulated recognition search tree such that each word to be recognized in the data set is represented by an arc between nodes in the simulated recognition search tree and a weight of the word to be recognized which is a count of phrases in the data set which share the word corresponding to the arc and which are represented by a path including the arc; and
a grammar producing module configured to generate the speech recognition context free grammar using the generated simulated recognition search tree and to provide the speech recognition context free grammar as an output for use by a speech recognition engine in performing speech recognition.
Dependent claims: 13, 14, 15, 16
Specification