Word hypothesizer for continuous speech decoding using stressed-vowel centered bidirectional tree searches
Abstract
A word hypothesis module for speech decoding consists of four submodules: vowel center detection, bidirectional tree searches around each vowel center, forward-backward pruning, and additional short-word hypotheses. By detecting the strong-energy vowel centers, a vowel-centered lexicon tree can be placed at each vowel center and searches can be performed in both the left and right directions, using only simple phone models for a fast acoustic match. A stage-wise forward-backward technique computes the word-beginning and word-ending likelihood scores over the generated half-word lattice for further pruning of the lattice. To avoid missing short words with weak-energy vowel centers, a separate lexicon tree is compiled for these words and tree searches are performed between each pair of adjacent vowel centers. Integrating the word hypothesizer with a top-down Viterbi beam search in continuous speech decoding provides two-pass decoding, which significantly reduces computation time.
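As a rough illustration of the first and last submodules named in the abstract, the sketch below picks vowel-center candidates as local maxima of a per-frame energy contour and then lists the spans between adjacent centers, where the short-word tree searches would run. The abstract does not say how the detector works, so the peak-picking rule, the function names, and the `threshold` and `min_gap` parameters are all assumptions made for this example.

```python
def detect_vowel_centers(energy, threshold, min_gap):
    """Return frame indices of local energy maxima at or above `threshold`.

    Assumption: a vowel center is approximated by a local peak in a
    per-frame energy contour. When two peaks are closer than `min_gap`
    frames, only the stronger one is kept.
    """
    centers = []
    for t in range(1, len(energy) - 1):
        is_peak = energy[t - 1] < energy[t] >= energy[t + 1]
        if not is_peak or energy[t] < threshold:
            continue
        if centers and t - centers[-1] < min_gap:
            # Too close to the previous center: keep whichever is stronger.
            if energy[t] > energy[centers[-1]]:
                centers[-1] = t
            continue
        centers.append(t)
    return centers


def short_word_spans(centers):
    """Pair each vowel center with the next one.

    These (start, end) frame spans are where a short-word search
    between adjacent vowel centers would be performed.
    """
    return list(zip(centers, centers[1:]))


# Toy energy contour (made-up values, one per frame).
energy = [0.1, 0.5, 2.0, 0.6, 0.2, 0.3, 1.8, 0.4, 0.1, 1.5, 0.2]
centers = detect_vowel_centers(energy, threshold=1.0, min_gap=3)
spans = short_word_spans(centers)
print(centers)  # [2, 6, 9]
print(spans)    # [(2, 6), (6, 9)]
```

A real detector would presumably work on the feature vectors rather than raw energy alone; the sketch only shows how detected centers anchor both the bidirectional searches and the gaps searched for short words.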
300 Citations
30 Claims
1. A high-speed continuous speech-decoding system decoding a speech sentence by utilizing two-pass decoding, said system comprising:
- means for converting a speech utterance into a sequence of feature vectors;
- means for detecting stressed vowel centers in said sequence of feature vectors;
- means for generating a word lattice based on the detected stressed vowel centers; and
- means for performing a time-synchronous Viterbi beam search using the word lattice from said generating means to constrain said Viterbi beam search.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
16. In a high-speed continuous speech-decoding system for decoding speech sentences into decoded word strings, said system including means for converting a speech utterance into a sequence of feature vectors, the improvement therein comprising:
- means for detecting stressed vowel centers in said sequence of feature vectors; and
- means for generating a word lattice based on the detected stressed vowel centers.
Dependent claims: 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29
30. An improved high-speed continuous speech-decoding system for decoding speech sentences into decoded word strings, said system including means for converting a speech utterance into feature vectors, a word hypothesizer for generating a word lattice, and means for performing a Viterbi beam search, the improvement therein comprising:
said word hypothesizer including means for representing a lexicon having a plurality of lexicon entries containing primary stressed vowels for said speech-decoding system in a vowel-centered tree which is rooted in the primary stressed-vowels of said lexicon entries.
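The vowel-centered tree of claim 30 can be pictured with a small data-structure sketch. It assumes each lexicon entry is a phone sequence plus the index of its primary stressed vowel, and stores, per stressed vowel, one trie over the phones to the left (read outward, so in reverse order) and one over the phones to the right. The class and function names, the two-trie layout, and the toy ARPAbet-like pronunciations are illustrative assumptions, not the patent's own representation, and no acoustic scoring is modeled.

```python
class TrieNode:
    """One node of a half-word trie; `words` holds entries ending here."""

    def __init__(self):
        self.children = {}
        self.words = set()


def _insert(root, phones, word):
    node = root
    for phone in phones:
        node = node.children.setdefault(phone, TrieNode())
    node.words.add(word)


def _lookup(root, phones):
    node = root
    for phone in phones:
        node = node.children.get(phone)
        if node is None:
            return set()
    return node.words


def build_vowel_centered_tree(lexicon):
    """Map each primary stressed vowel to a (left_trie, right_trie) pair.

    `lexicon` maps word -> (phones, index of the primary stressed vowel).
    The left trie is keyed by the phones before the vowel, nearest first.
    """
    tree = {}
    for word, (phones, v) in lexicon.items():
        left, right = tree.setdefault(phones[v], (TrieNode(), TrieNode()))
        _insert(left, reversed(phones[:v]), word)
        _insert(right, phones[v + 1:], word)
    return tree


def match(tree, vowel, left_phones, right_phones):
    """Words whose left and right halves both match around `vowel`."""
    if vowel not in tree:
        return set()
    left, right = tree[vowel]
    return _lookup(left, left_phones) & _lookup(right, right_phones)


# Hypothetical toy lexicon; the index marks the primary stressed vowel.
lexicon = {
    "seven": (["s", "eh", "v", "ah", "n"], 1),
    "set": (["s", "eh", "t"], 1),
    "ten": (["t", "eh", "n"], 1),
    "about": (["ah", "b", "aw", "t"], 2),
}
tree = build_vowel_centered_tree(lexicon)
print(match(tree, "eh", ["s"], ["v", "ah", "n"]))  # {'seven'}
print(match(tree, "aw", ["b", "ah"], ["t"]))       # {'about'}
```

Splitting each word at its stressed vowel is what lets the left and right searches proceed independently from a detected vowel center; a full word is hypothesized only when both halves agree, as the set intersection in `match` shows.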
Specification