Enhancement to Viterbi speech processing algorithm for hybrid speech models that conserves memory
Abstract
The present invention discloses a method for semantically processing speech for speech recognition purposes. The method can reduce the amount of memory required for a Viterbi search of an N-gram language model having a value of N greater than two, and having at least one embedded grammar that appears in multiple contexts, to approximately the memory size of a bigram model search space with respect to the embedded grammar. The method also reduces CPU requirements. These reductions can be accomplished by representing the embedded grammar as a recursive transition network (RTN), where only one instance of the recursive transition network is used for all of the contexts. Apart from the embedded grammars, a Hidden Markov Model (HMM) strategy can be used for the search space.
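The memory argument in the abstract is simple arithmetic: a naive N-gram (N > 2) expansion duplicates the embedded grammar once per context, while the RTN representation stores one instance plus a small grammar node per context. A minimal sketch, using hypothetical sizes chosen only to illustrate the scaling (none of these figures come from the patent):

```python
# Hypothetical sizes, for illustration only; real figures depend on
# the grammar and the acoustic model.
STATES_PER_GRAMMAR = 5_000   # states in one expanded embedded grammar
BYTES_PER_STATE = 64         # rough per-state bookkeeping cost
CONTEXTS = 40                # distinct N-gram contexts using the grammar

# Naive expansion: one full grammar copy per context.
naive_bytes = CONTEXTS * STATES_PER_GRAMMAR * BYTES_PER_STATE

# Shared-RTN representation: one RTN instance, plus one lightweight
# grammar node (holding only a grammar identifier) per context.
GRAMMAR_NODE_BYTES = 64
shared_bytes = (STATES_PER_GRAMMAR * BYTES_PER_STATE
                + CONTEXTS * GRAMMAR_NODE_BYTES)

print(naive_bytes // 1024)   # 12500 (KiB)
print(shared_bytes // 1024)  # 315 (KiB)
```

The shared representation grows with the number of contexts only by the cost of the lightweight grammar nodes, which is why the footprint stays near that of a bigram search space.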
18 Claims
1. A speech processing method comprising:
generating, with at least one computer system, a search space from an N-gram language model with N greater than two, wherein the search space comprises a plurality of nodes including at least one grammar node that represents, within the search space, an embedded grammar that is utilized in a plurality of contexts; and
associating a grammar identifier that is uniquely associated with the embedded grammar with the at least one grammar node, wherein the same grammar identifier is used for each of the plurality of contexts, said grammar identifier referencing a recursive transition network corresponding to the embedded grammar.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10.
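The data structure claim 1 describes can be sketched as a registry keyed by grammar identifier: each context adds a lightweight grammar node carrying only the identifier, while the RTN itself is stored exactly once. The class and method names below are illustrative, not from the patent:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GrammarNode:
    history: tuple   # N-gram word history (the context)
    grammar_id: str  # unique per embedded grammar, shared across contexts

class SearchSpace:
    """Toy search-space builder: one RTN per embedded grammar,
    referenced by identifier from every context in which it occurs."""

    def __init__(self):
        self.nodes = []
        self.rtns = {}   # grammar_id -> RTN, stored once

    def register_grammar(self, grammar_id, rtn):
        self.rtns[grammar_id] = rtn

    def add_grammar_context(self, history, grammar_id):
        # A new lightweight node per context, but no new RTN copy.
        self.nodes.append(GrammarNode(history, grammar_id))

space = SearchSpace()
space.register_grammar("city_names", rtn={"states": ["s0", "s1"]})  # placeholder RTN
space.add_grammar_context(("fly", "to"), "city_names")
space.add_grammar_context(("weather", "in"), "city_names")
print(len(space.rtns))   # 1 -- a single RTN serves both contexts
```

Because both grammar nodes carry the same identifier, any later decoding step can resolve them to the single registered RTN instance.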
11. A speech recognition decoder comprising:
at least one processor programmed to:
generate a finite state machine search space from an N-gram language model with N greater than two, wherein the N-gram language model includes at least one embedded grammar, wherein said finite state machine search space includes statistical language model (SLM) nodes and grammar nodes, each grammar node representing a state associated with one of the at least one embedded grammar;
process the SLM nodes using a Hidden Markov Model (HMM) based strategy; and
process the grammar nodes using a Recursive Transition Network (RTN) based strategy, wherein the finite state machine search space includes a single grammar node for each of the embedded grammars regardless of a number of contexts in which each of the embedded grammars is represented in the N-gram language model.
Dependent claims: 12, 13, 14, 15, 16, 17.
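The hybrid processing claim 11 describes amounts to a per-node dispatch during decoding: SLM nodes take the HMM-based Viterbi path, while grammar nodes are routed to the single shared RTN for their identifier. A minimal sketch with stand-in update functions (all names here are illustrative assumptions, not the patent's implementation):

```python
class SLMNode:
    """Statistical-language-model node (toy stand-in)."""

class GrammarNode:
    """Grammar node: carries only the identifier of its embedded grammar."""
    def __init__(self, grammar_id):
        self.grammar_id = grammar_id

def hmm_viterbi_step(node, frame):
    # Stand-in for the HMM-based Viterbi update on an SLM node.
    return ("hmm", node)

def rtn_advance(rtn, node, frame):
    # Stand-in for advancing the single shared recursive transition network.
    return ("rtn", rtn, node)

def process_node(node, frame, rtns):
    # Grammar nodes route to the shared RTN for their identifier;
    # every other (SLM) node takes the HMM-based path.
    if isinstance(node, GrammarNode):
        return rtn_advance(rtns[node.grammar_id], node, frame)
    return hmm_viterbi_step(node, frame)

shared_rtns = {"city_names": "RTN<city_names>"}   # one instance per grammar
tag, rtn, node = process_node(GrammarNode("city_names"), frame=0, rtns=shared_rtns)
print(tag)  # rtn
```

The dispatch keeps the two strategies cleanly separated, which is what lets the search space hold a single grammar node per embedded grammar regardless of context count.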
18. A method for semantically processing speech for speech recognition, the method comprising:
representing, with at least one computer system, an embedded grammar as a recursive transition network, wherein the embedded grammar is used in a plurality of contexts in an N-gram language model with N greater than two, wherein a single instance of the recursive transition network is used for the plurality of contexts.
Specification