Search optimization system and method for continuous speech recognition
Abstract
A system and method for continuous speech recognition (CSR) are optimized to reduce processing time for connected word grammars bounded by semantically null words. The savings, which reduce processing time during both the forward and backward passes of the search as well as during rescoring, are achieved by performing only the minimal amount of computation required to produce an exact N-best list of semantically meaningful words (an N-best list of salient words). This departs from standard Spoken Language System modeling, in which any notion of meaning is handled by the Natural Language Understanding (NLU) component. Expanding the task of the recognizer component from a simple acoustic match to allow semantic information to be fed to the recognizer yields significant processing time savings and makes it possible to run an increased number of speech recognition channels in parallel for improved performance, which may enhance users' perception of value and quality of service.
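To make the term "N-best list of salient words" concrete, the following is a minimal sketch of the target output only, not of the optimized search the abstract describes. It assumes hypotheses arrive as (word list, log score) pairs and that a set of semantically null words is known; the name `salient_n_best`, the example vocabulary, and the scores are all hypothetical.

```python
# Hypothetical set of semantically null (filler) words; a real system
# would take these from the grammar, not from a hard-coded set.
NULL_WORDS = {"uh", "please", "in", "the", "city", "of"}


def salient_n_best(hypotheses, n, null_words=NULL_WORDS):
    """Return the n best distinct salient-word sequences with their best score.

    hypotheses: iterable of (list_of_words, log_score); higher score is better.
    """
    best = {}
    for words, score in hypotheses:
        # Drop null words so that hypotheses differing only in filler collide.
        key = tuple(w for w in words if w not in null_words)
        if key not in best or score > best[key]:
            best[key] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


hyps = [
    (["uh", "ottawa", "please"], -10.0),
    (["ottawa"], -12.0),
    (["in", "the", "city", "of", "oshawa"], -11.0),
    (["oshawa", "please"], -15.0),
]
result = salient_n_best(hyps, 2)
```

This sketch filters after the fact, which is exactly the redundant work the described method aims to avoid: hypotheses that differ only in their null words are scored separately here and merged at the end.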
9 Claims
1. A method for continuous speech recognition, comprising:
receiving an input signal derived from a spoken utterance of a set of words;
providing a search network indicative of a plurality of recognizable words, the search network comprising a plurality of interconnected words, a given interconnection being established at least in part based on a connected word grammar, providing semantic information;
detecting in the search network connected word grammars bounded by semantically null words;
generating a modified search network by performing a process comprising:
collapsing each list of semantically null words into a unique single-input single-output search network portion, and identifying stop nodes in the search network, the modified search network being indicative of a plurality of non-semantically-null words; and
processing the input signal at least in part based on the modified search network to derive a list of N-best salient words that potentially match at least one word of the spoken utterance.
Dependent Claims (2, 3, 4)
during the forward pass of the search, detecting forward stop nodes in the modified search network and signaling the search to stop forward scoring along a path currently being followed; and
during the backward pass of the search, detecting backward stop nodes in the modified search network and signaling the search to stop backward scoring along a path currently being followed.
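The forward and backward stop-node behaviour described above can be sketched as a graph traversal that refuses to expand past designated nodes. This is an illustration under stated assumptions: where stop nodes are placed is not defined in this section, so the placement below, the toy graph, and the names `traverse_until_stop` and `reverse_graph` are all hypothetical.

```python
def traverse_until_stop(graph, start, stop_nodes):
    """Visit nodes reachable from start, never expanding beyond a stop node."""
    visited = []
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        visited.append(node)
        if node in stop_nodes:
            # Signal: stop scoring along the path currently being followed.
            continue
        for successor in graph.get(node, []):
            if successor not in seen:
                seen.add(successor)
                stack.append(successor)
    return visited


def reverse_graph(graph):
    """Build the reversed graph used for the backward pass."""
    reversed_graph = {}
    for source, targets in graph.items():
        for target in targets:
            reversed_graph.setdefault(target, []).append(source)
    return reversed_graph


# Toy network: a null-word prefix, two salient words, a null-word suffix.
forward_graph = {
    "start": ["prefix"],
    "prefix": ["ottawa", "oshawa"],
    "ottawa": ["suffix"],
    "oshawa": ["suffix"],
    "suffix": ["end"],
}

# Assumed placement: forward scoring halts at the suffix portion,
# backward scoring halts at the prefix portion.
forward_visited = traverse_until_stop(forward_graph, "start", {"suffix"})
backward_visited = traverse_until_stop(reverse_graph(forward_graph), "end", {"prefix"})
```

In this toy network the forward pass never reaches `end` and the backward pass never reaches `start`, which is the kind of saving the stop nodes are meant to provide.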
4. The method of claim 3, wherein scoring comprises Viterbi scoring.
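For readers unfamiliar with Viterbi scoring, the following is a generic textbook sketch over a tiny hidden Markov model, computed in the log domain. It is not taken from the patent: the two states, the observation symbols, and every probability are invented for illustration.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_score, best_state_path) for the observation sequence."""
    # scores[s] holds the best log score of any path ending in state s.
    scores = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            # Keep only the best predecessor: the core of Viterbi scoring.
            prev = max(states, key=lambda p: scores[p] + log_trans[p][s])
            new_scores[s] = scores[prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace the best path back from the best final state.
    last = max(states, key=lambda s: scores[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return scores[last], path


def to_log(table):
    """Convert a nested dict of probabilities to log probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}


states = ["sil", "speech"]
log_start = {"sil": math.log(0.8), "speech": math.log(0.2)}
log_trans = to_log({"sil": {"sil": 0.6, "speech": 0.4},
                    "speech": {"sil": 0.3, "speech": 0.7}})
log_emit = to_log({"sil": {"low": 0.9, "high": 0.1},
                   "speech": {"low": 0.2, "high": 0.8}})

best_score, best_path = viterbi(["low", "high", "high"], states,
                                log_start, log_trans, log_emit)
```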
5. Software on a machine readable medium for performing a method for continuous speech recognition, the method comprising the steps of:
providing an input signal derived from a spoken utterance of a set of words;
providing a search network indicative of a plurality of recognizable words, the search network comprising a plurality of interconnected words, a given interconnection being established at least in part based on a connected word grammar, providing semantic information;
detecting in the search network connected word grammars bounded by semantically null words;
generating a modified search network by performing a process comprising:
collapsing each list of semantically null words into a unique single-input single-output search network portion, and identifying stop nodes in the search network, the modified search network being indicative of a plurality of non-semantically-null words; and
processing the input signal at least in part based on the modified search network to derive a list of N-best salient words that potentially match at least one word of the spoken utterance.
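One possible reading of the collapsing step is sketched below. It assumes a connected word grammar can be modelled as an ordered list of slots, each slot holding parallel alternative words, with every word in one slot connected to every word in the next. The slot model, the `NULL`/`WORD` node tags, and the function names are hypothetical and are not drawn from the specification.

```python
def collapse_null_slots(slots, null_words):
    """Replace each slot made up only of null words by one single-input,
    single-output node; leave slots containing salient words expanded."""
    modified = []
    for index, alternatives in enumerate(slots):
        if alternatives and all(word in null_words for word in alternatives):
            # The slot index keeps each collapsed portion unique.
            modified.append([("NULL", index, tuple(alternatives))])
        else:
            modified.append([("WORD", word) for word in alternatives])
    return modified


def count_arcs(slots):
    """Count arcs when adjacent slots are fully interconnected."""
    return sum(len(left) * len(right) for left, right in zip(slots, slots[1:]))


# Hypothetical grammar: filler prefix, salient city names, filler suffix.
null_words = {"uh", "um", "please", "thanks"}
slots = [
    ["uh", "um", "please"],
    ["ottawa", "oshawa"],
    ["please", "thanks"],
]

collapsed = collapse_null_slots(slots, null_words)
arcs_before = count_arcs(slots)
arcs_after = count_arcs(collapsed)
```

In this toy grammar the number of interconnections drops from 10 to 4, while the salient words remain individually represented in the modified network.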
6. A system for continuous speech recognition, comprising:
an input for receiving an input signal derived from a spoken utterance of a set of words;
means for providing a search network indicative of a plurality of recognizable words, the search network comprising a plurality of interconnected words, a given interconnection being established at least in part based on a connected word grammar, means for providing semantic information;
means for detecting in the search network connected word grammars bounded by semantically null words;
means for generating a modified search network by performing a process comprising:
collapsing each list of semantically null words into a unique single-input single-output search network portion, and identifying stop nodes in the search network, the modified search network being indicative of a plurality of non-semantically-null words; and
means for processing the input signal at least in part based on the modified search network to derive a list of N-best salient words that potentially match at least one word of the spoken utterance.
Dependent Claims (7, 8, 9)
during the forward pass of the search, the second processing means being operative for detecting forward stop nodes in the modified search network and signaling the search to stop forward scoring along a path currently being followed; and
during the backward pass of the search, the second processing means being operative for detecting backward stop nodes in the modified search network and signaling the search to stop backward scoring along a path currently being followed.
9. The system of claim 8, wherein scoring comprises Viterbi scoring.
Specification