Speech recognition system utilizing both a long-term strategic and a short-term strategic scoring operation in a transition network thereof
First Claim
1. A speech recognition system comprising:
means for acoustically analyzing an input speech signal for obtaining feature parameters of the input speech signal;
means for extracting a plurality of candidate phonetic segments from the feature parameters of the input speech signal, the candidate phonetic segments including acoustic feature labels of a silence, buzz, unvoiced sound and power dips;
means for causing the candidate phonetic segments to be passed through transition networks constructed for respective words to be recognized to effect a word verification operation to obtain candidate words;
means for determining a word score Q(k) of the candidate word by using the following equation, which includes a first score represented by a first term and a second score represented by a second term:
##EQU7## where μ denotes a weighting coefficient (0 ≤ μ ≤ 1), S_l denotes a similarity of a phonetic segment in a frame, L denotes a number of phonetic segments, S1_l is a similarity of a first-order phonetic segment in a frame in which the phonetic segment takes a maximum value, and Smax_l denotes a maximum similarity of a corresponding phonetic segment;
means for selecting that candidate word having the largest candidate score Q(k) to be a result of speech recognition; and
outputting the selected candidate word as said result of speech recognition.
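The last two elements of the claim (weighting two scores by μ and selecting the highest-scoring candidate) can be illustrated with a minimal sketch. The equation itself is not reproduced in this text, so the sketch takes the values of the first and second terms as already computed and assumes, for illustration only, that they are combined as μ times the first score plus (1 − μ) times the second. The function names `combine_scores` and `select_candidate` and the example words are hypothetical and do not come from the patent.

```python
from typing import Dict


def combine_scores(first_score: float, second_score: float, mu: float) -> float:
    """Combine two term values into a word score Q(k).

    Assumption (not taken from the patent text): the terms are mixed as
    mu * first_score + (1 - mu) * second_score, with 0 <= mu <= 1.
    """
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in the range 0 <= mu <= 1")
    return mu * first_score + (1.0 - mu) * second_score


def select_candidate(word_scores: Dict[str, float]) -> str:
    """Return the candidate word with the largest score Q(k)."""
    if not word_scores:
        raise ValueError("at least one candidate word is required")
    return max(word_scores, key=lambda word: word_scores[word])


if __name__ == "__main__":
    # Hypothetical candidates, each with (first term, second term) values.
    terms = {"ichi": (0.70, 0.50), "ni": (0.65, 0.90)}
    scores = {w: combine_scores(a, b, mu=0.5) for w, (a, b) in terms.items()}
    print(select_candidate(scores))
```

With μ = 1 only the first score contributes, and with μ = 0 only the second, so under this assumed form the coefficient sets the balance between the two scores.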
Abstract
A plurality of candidate phonetic segments extracted from the input speech signal are passed through transition networks prepared for the respective words to obtain a score. The score is a weighted average of long-term strategic scores, which take into consideration the statistical distribution of the similarities or distances of the phonetic segments, and short-term strategic scores, which take into consideration the environment of the phonetic segments.
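As a rough illustration of passing candidate phonetic segments through a per-word transition network, the sketch below models a network as a simple deterministic state machine over segment labels. This is an assumed simplification, not the structure described in the specification: the state numbering, the segment labels ("Q" for silence, "K", "A"), and the function name `verify_word` are all hypothetical.

```python
from typing import Dict, Sequence, Tuple

# A network maps (current state, segment label) to the next state.
Network = Dict[Tuple[int, str], int]


def verify_word(segments: Sequence[str], network: Network,
                start: int, final: int) -> bool:
    """Return True if the segment sequence drives the network from the
    start state to the final state without hitting a missing transition."""
    state = start
    for label in segments:
        next_state = network.get((state, label))
        if next_state is None:
            return False  # no transition for this label: word rejected
        state = next_state
    return state == final


# Hypothetical network for a made-up word: optional leading silence "Q",
# then "K", then one or more "A" segments.
toy_network: Network = {
    (0, "Q"): 0,
    (0, "K"): 1,
    (1, "A"): 2,
    (2, "A"): 2,
}

if __name__ == "__main__":
    print(verify_word(["Q", "K", "A", "A"], toy_network, start=0, final=2))
```

A word whose network accepts the segment sequence becomes a candidate word; scoring would then be applied only to those candidates.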
48 Citations
5 Claims
Specification