Network and language models for use in a speech recognition system
Abstract
A language model structure for use in a speech recognition system employs a tree-structured network model. The language model is structured such that identifiers associated with each word and contained therein are arranged such that each node of the network model with which the language model is associated spans a continuous range of identifiers. A method of transferring tokens through a tree-structured network in a speech recognition process is also provided.
46 Citations
11 Claims
1. A language model structure for use in a speech recognition system employing a tree-structured network model, the language model comprising identifiers with associated language model probabilities, the language model being structured such that identifiers associated with each word and contained therein are arranged such that each node of the network model with which the language model is associated spans a continuous range of identifiers and associated language model probabilities in the language model structure.
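As an illustration of the arrangement recited in claim 1, the following is a minimal sketch and not the patented implementation. It assumes a hypothetical phone-prefix tree (`build_tree`), numbers the words depth-first (`assign_ids`) so that every node covers one contiguous range of identifiers, and stores the probabilities in a list indexed by identifier. The lexicon, the probability values and all names are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    children: dict = field(default_factory=dict)  # phone -> Node
    words: list = field(default_factory=list)     # words ending at this node
    lo: int = -1                                   # first identifier under this node
    hi: int = -1                                   # last identifier under this node


def build_tree(lexicon):
    """Build a phone-prefix tree from {word: [phones]} (hypothetical input)."""
    root = Node()
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            node = node.children.setdefault(phone, Node())
        node.words.append(word)
    return root


def assign_ids(node, word_ids, next_id=0):
    """Number words depth-first; each node then spans identifiers lo..hi."""
    node.lo = next_id
    for word in node.words:
        word_ids[word] = next_id
        next_id += 1
    for phone in sorted(node.children):
        next_id = assign_ids(node.children[phone], word_ids, next_id)
    node.hi = next_id - 1
    return next_id


lexicon = {  # illustrative pronunciations only
    "cat": ["k", "ae", "t"],
    "cab": ["k", "ae", "b"],
    "can": ["k", "ae", "n"],
    "dog": ["d", "ao", "g"],
}
root = build_tree(lexicon)
word_ids = {}
assign_ids(root, word_ids)

# Probabilities stored in identifier order (made-up unigram values).
unigram = {"cat": 0.4, "cab": 0.1, "can": 0.3, "dog": 0.2}
probs = [0.0] * len(word_ids)
for word, wid in word_ids.items():
    probs[wid] = unigram[word]

# The node reached by "k", "ae" covers one contiguous slice of probs.
node = root.children["k"].children["ae"]
best_under_node = max(probs[node.lo:node.hi + 1])
```

In this sketch the node reached by "k", "ae" spans identifiers 1 to 3, so every probability reachable through that node can be read as a single slice rather than gathered word by word.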
3. A tree-structured network for use in a speech recognition system, the tree-structured network comprising:
a first tree-structured section representing the first phone of each word having two or more phones;
a second tree-structured section representing within word phones, wherein within word phones includes any phone between the first phone and the last phone of a word;
a third tree-structured section representing the last or only phone of each word;
a fourth tree-structured section representing inter-word silences; and
a number of null nodes for joining each tree-structured section to the following tree-structured section.
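To make the division of phones in claim 3 concrete, the sketch below assigns the phones of a single word to the first, within-word and last sections. It is only an illustration under stated assumptions: the function name `split_into_sections`, the dictionary keys and the example pronunciations are hypothetical, and the inter-word silence section and the null nodes are not modelled.

```python
def split_into_sections(phones):
    """Assign each phone of one word to a section in the sense of claim 3.

    A one-phone word contributes only to the "last or only phone" section,
    because the first section covers words with two or more phones.
    """
    if not phones:
        raise ValueError("a word needs at least one phone")
    if len(phones) == 1:
        return {"first": [], "within": [], "last": list(phones)}
    return {
        "first": [phones[0]],          # first phone of a multi-phone word
        "within": list(phones[1:-1]),  # phones between the first and the last
        "last": [phones[-1]],          # last phone of the word
    }


three_phone = split_into_sections(["k", "ae", "t"])
two_phone = split_into_sections(["t", "uw"])
one_phone = split_into_sections(["ah"])
```

A two-phone word has an empty within-word part, which is one reason the sections need null nodes to link one section directly to the next.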
5. A method of transferring tokens through a tree-structured network in a speech recognition process, each token including a likelihood which indicates the probability of a respective path through the network representing a respective word to be recognised, and wherein each token further includes a history of previously recognised words, the method comprising the steps of:
i) combining tokens at each state of the network to form a set of tokens, the set including a main token having the highest likelihood and one or more relative tokens;
ii) converting the likelihood of each relative token into a relative likelihood that is set relative to the likelihood of the main token;
iii) for each set of tokens, merging tokens having the same history;
iv) transferring the set of tokens to subsequent nodes in the network;
v) updating the likelihood of at least the main token of each set of tokens; and
vi) repeating steps i) to v) at each respective node.
Dependent claim 6:
i) assigning an identifier to each set of tokens, the identifier representing the word histories of each of the tokens in the set of tokens;
ii) comparing the identifiers of different sets of tokens; and
iii) merging sets of tokens having the same identifiers.
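The token handling recited in claim 5, and the set identifiers of the three steps immediately above, can be sketched as follows. This is a minimal illustration under assumptions, not the claimed method itself: likelihoods are treated as log likelihoods, tokens with the same history are merged before the main token is chosen (which gives the same set as merging afterwards), and the names `Token`, `combine`, `update` and `set_identifier` are hypothetical. How two sets with equal identifiers are actually merged is not specified here, so the sketch only shows the comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    log_like: float  # absolute for the main token, offset from main for relatives
    history: tuple   # previously recognised words


def combine(tokens):
    """Merge same-history tokens, pick the main token, make the rest relative."""
    best = {}
    for tok in tokens:
        kept = best.get(tok.history)
        if kept is None or tok.log_like > kept.log_like:
            best[tok.history] = tok  # keep the most likely token per history
    ranked = sorted(best.values(), key=lambda t: t.log_like, reverse=True)
    main = ranked[0]
    relatives = [Token(t.log_like - main.log_like, t.history) for t in ranked[1:]]
    return main, relatives


def update(main, relatives, delta):
    """Add a log-likelihood increment to the main token only.

    Relative tokens keep their offsets, so their absolute values follow.
    """
    return Token(main.log_like + delta, main.history), relatives


def set_identifier(main, relatives):
    """Identifier built from the word histories present in a token set."""
    return frozenset([main.history] + [t.history for t in relatives])


incoming = [
    Token(-10.0, ("the",)),
    Token(-12.5, ("a",)),
    Token(-11.0, ("the",)),  # same history as the first token, less likely
]
main, relatives = combine(incoming)
main, relatives = update(main, relatives, -1.5)

other_main, other_relatives = combine([Token(-20.0, ("a",)), Token(-19.0, ("the",))])
same_histories = set_identifier(main, relatives) == set_identifier(other_main, other_relatives)
```

Storing offsets means a state can propagate a whole set by adjusting one number, the main token's likelihood, instead of touching every token.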
7. A speech recognition system having a network of nodes comprising:
a set of first-phone nodes representing the first phones of words;
a tree-structured section representing within word phones, wherein within word phones includes any phone between the first phone and the last phone of a word; and
a number of null nodes for joining the set of first-phone nodes to the tree-structured section.
Dependent claim 8:
a set of last-phone nodes representing the last phones of words; and
additional null nodes for connecting the tree-section to the set of last-phone nodes.
9. The speech recognition system of claim 8 wherein the network of nodes further comprises:
a set of inter-word silence nodes; and
a set of null nodes for connecting nodes in the set of last-phone nodes to the inter-word silence nodes.
10. The speech recognition system of claim 9 wherein the network of nodes further comprises:
a set of null nodes for connecting the inter-word silence nodes to the nodes in the set of first-phone nodes.
11. The speech recognition system of claim 7 wherein each first-phone node represents a tri-phone.
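The way claims 7 to 10 chain the node sets together through null nodes can be pictured with a coarse, section-level graph. The sketch below is only an assumption-laden illustration: each set of nodes is collapsed into a single graph vertex, and the names `SECTION_ORDER`, `build_section_graph` and `walk` are hypothetical. A real network would hold many phone nodes per section.

```python
SECTION_ORDER = ["first_phone", "within_word", "last_phone", "silence"]


def build_section_graph():
    """Link each section to the next through a dedicated null vertex.

    The final link, from silence back to first_phone, closes the loop so
    that one word can follow another.
    """
    edges = {}
    for i, section in enumerate(SECTION_ORDER):
        nxt = SECTION_ORDER[(i + 1) % len(SECTION_ORDER)]
        null = f"null_{section}_to_{nxt}"
        edges[section] = [null]
        edges[null] = [nxt]
    return edges


def walk(edges, start, steps):
    """Follow the first outgoing edge from each vertex for a number of steps."""
    path = [start]
    node = start
    for _ in range(steps):
        node = edges[node][0]
        path.append(node)
    return path


edges = build_section_graph()
path = walk(edges, "first_phone", 8)
```

Eight steps from `first_phone` pass through all four sections and their null vertices and arrive back at `first_phone`, mirroring the loop from inter-word silence to the first phone of the next word.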
Specification