Large vocabulary connected speech recognition system and method of language representation using evolutional grammar to represent context free grammars
Abstract
A method of recognizing speech input selectively creates and maintains grammar representations of the speech input in essentially real time. Speech input frames are received by a speech recognition system. Grammar representations are created for each speech frame and a probability score is derived for the representations indicating the probability of the accuracy of the representations to the speech input. Representations having a probability score below a predetermined threshold are not maintained. Those grammar representations having probability scores above the predetermined threshold are maintained. As more speech frames are received by the system, additional grammar representations are created and the probability scores are updated. When the entire speech input has been received, the chain of grammar representations having the highest probability score is identified as the speech input.
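The frame-by-frame create, score, prune, and chain cycle described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function name `recognize`, the per-frame dictionaries of candidate probabilities (a stand-in for real acoustic scoring), and the use of accumulated log-probabilities compared against a single fixed threshold are all assumptions made for illustration.

```python
import math


def recognize(frames, threshold):
    """Return the best surviving (chain, score) pair, or None.

    frames: list of dicts mapping a candidate label to its probability
            for that frame (hypothetical stand-in for acoustic scoring).
    threshold: log-probability below which a representation is dropped.
    """
    # Each hypothesis is (chain of labels, accumulated log-probability).
    active = [((), 0.0)]
    for candidates in frames:
        extended = []
        for chain, score in active:
            for label, prob in candidates.items():
                new_score = score + math.log(prob)
                # Representations scoring below the threshold are not maintained.
                if new_score >= threshold:
                    extended.append((chain + (label,), new_score))
        active = extended
    if not active:
        return None
    # Once all frames are consumed, the highest-scoring chain is the result.
    return max(active, key=lambda hyp: hyp[1])


frames = [{"call": 0.6, "tall": 0.4}, {"home": 0.7, "hum": 0.3}]
best = recognize(frames, threshold=math.log(0.2))
print(best[0])
```

In this toy input, the chains ending in "hum" fall below the threshold after the second frame and are discarded, so only chains ending in "home" compete for the final result.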
12 Claims
1. A speech recognition system for recognizing one or more speech inputs, said recognition system having no initial grammar, the system comprising:
means for creating a grammar start state to initialize a grammar represented by a grammar network, which is comprised of arcs interconnecting nodes including a start node, predetermined initial word scores being assigned to the nodes, said predetermined initial word scores including a first predetermined initial word score assigned to the start node, and a second different predetermined initial word score assigned to at least one of the other nodes,
means for dynamically creating word representations from the grammar start state for the speech inputs as the inputs are received, each of the word representations being represented by a respective one of the arcs in the grammar network,
means for maintaining a score for each word representation created,
means for propagating word scores meeting a threshold level through the grammar network,
means for updating the word scores at each node other than the start node to maintain only active word representations having word scores above the threshold level,
means for chaining word scores together which exceed the threshold level, and
means for determining the chain of word scores which represents the speech input.
Dependent claims: 2, 3, 4, 5, 6
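The grammar network recited in claim 1 (a start node, arcs created on demand, and different initial scores for the start node and the other nodes) can be pictured with the following Python sketch. It is a reading aid only, not the claimed means. The class and method names, the log-domain initial scores of 0.0 for the start node and negative infinity for every other node, and the keep-the-maximum update rule are assumptions that the claim does not specify.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    name: str
    score: float  # word score currently held at this node


@dataclass(eq=False)
class Arc:
    word: str      # the word representation carried by this arc
    source: Node
    target: Node
    score: float = float("-inf")


@dataclass(eq=False)
class GrammarNetwork:
    start: Node
    nodes: list = field(default_factory=list)
    arcs: list = field(default_factory=list)

    @classmethod
    def create(cls, start_score=0.0):
        # The network begins with only a start state: no initial grammar.
        start = Node("start", start_score)
        return cls(start=start, nodes=[start])

    def add_word(self, word, source, initial_score=float("-inf")):
        # Dynamically create a word representation as a new arc; its end
        # node receives an initial score different from the start node's.
        target = Node(f"n{len(self.nodes)}", initial_score)
        arc = Arc(word, source, target)
        self.nodes.append(target)
        self.arcs.append(arc)
        return arc

    def score_word(self, arc, log_prob, threshold):
        # Propagate the source node's score along the arc.
        arc.score = arc.source.score + log_prob
        if arc.score >= threshold:
            arc.target.score = max(arc.target.score, arc.score)
            return True
        # Below threshold: the word representation is no longer maintained.
        self.arcs.remove(arc)
        self.nodes.remove(arc.target)
        return False


net = GrammarNetwork.create()
call = net.add_word("call", net.start)
net.score_word(call, -0.5, threshold=-2.0)
home = net.add_word("home", call.target)
net.score_word(home, -3.0, threshold=-2.0)
print([a.word for a in net.arcs])
```

In the usage lines, "call" survives with a score of -0.5, while "home" accumulates -3.5, falls below the threshold of -2.0, and is removed together with its end node.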
7. A method of recognizing speech input signals comprising the steps of:
a) creating a grammar start state to initialize a grammar represented by a grammar network, which is comprised of arcs interconnecting nodes including a start node, predetermined initial word scores being assigned to the nodes, said predetermined initial word scores including a first predetermined initial word score assigned to the start node, and a second different predetermined initial word score assigned to at least one of the other nodes,
b) dynamically creating word representations for said speech signals from the grammar start state, each of the word representations being represented by a respective one of the arcs in the grammar network,
c) computing a word score for each word representation,
d) comparing said word scores to a threshold level,
e) chaining together those word representations having word scores above the threshold level to form phrase strings,
f) determining which word scores are below the threshold value,
g) destroying those word representations having word scores below the threshold level,
h) updating the word score for each active word representation,
i) computing, at a respective one of the nodes, phrase scores for each word string comprising word representations having word scores above the threshold level,
j) repeating steps b)-i) until said entire speech signal has been inputted, and
k) identifying the phrase string having the highest phrase score as the recognized speech input.
Dependent claims: 8, 9, 10, 11
12. A method of recognizing speech input by a speech recognition system comprising the steps of:
receiving a sequence of speech frames;
creating a grammar representation for each speech frame, each grammar representation having a source node and an end node, the creating step including the step of assigning predetermined initial probability scores to respective source nodes of grammar representations of the speech frames, a predetermined initial probability score assigned to the source node of the grammar representation of the first frame in the sequence being different from that assigned to the source node of the grammar representation of at least another frame in the sequence;
deriving a probability score for each grammar representation at the end node thereof indicating the probability of the accuracy of the representation to the speech input;
selectively maintaining grammar representations having a probability score above a predetermined threshold; and
chaining together representations having the highest probability score for each speech frame.
Specification