METHOD AND APPARATUS FOR RECOGNIZING SPEECH
Abstract
Disclosed herein are an apparatus and method for recognizing speech. The apparatus includes a frame-based speech recognition unit, a segment division unit, a segment feature extraction unit, a segment speech recognition performance unit, and a combination and synchronization unit. The frame-based speech recognition unit extracts frame speech feature vectors from a speech signal, and performs speech recognition on frames of the speech signal using the frame speech feature vectors and a frame-based probability model. The segment division unit divides the speech signal into segments. The segment feature extraction unit extracts segment speech feature vectors around a boundary between the segments. The segment speech recognition performance unit performs speech recognition on the segments of the speech signal using the segment speech feature vectors and a segment-based probability model. The combination and synchronization unit combines results of the speech recognition for the frames with results of the speech recognition for the segments.
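The abstract does not state how the combination and synchronization unit merges the two sets of results. As a minimal sketch only, assuming both recognizers emit a log-probability per candidate word and that the merge is a weighted log-linear interpolation (an assumption, not something the text confirms), the combination step might look like the following. The names `combine_scores`, `best_hypothesis`, and `segment_weight` and the example scores are hypothetical.

```python
def combine_scores(frame_scores, segment_scores, segment_weight=0.5):
    """Merge per-hypothesis log-probabilities from two recognizers.

    Assumed rule: weighted log-linear interpolation. Only hypotheses
    scored by both recognizers are kept.
    """
    combined = {}
    for hyp in frame_scores:
        if hyp in segment_scores:
            combined[hyp] = ((1.0 - segment_weight) * frame_scores[hyp]
                             + segment_weight * segment_scores[hyp])
    return combined


def best_hypothesis(scores):
    """Return the hypothesis with the highest score."""
    return max(scores, key=scores.get)


# Hypothetical log-probabilities for two candidate words.
frame_scores = {"cat": -12.0, "cap": -11.5}    # frame-based model
segment_scores = {"cat": -4.0, "cap": -6.0}    # segment-based model

combined = combine_scores(frame_scores, segment_scores)
print(best_hypothesis(frame_scores), best_hypothesis(combined))
```

In this toy input the frame-based scores alone slightly prefer one candidate, while the combined scores prefer the other, which illustrates why a second, segment-level view of the signal can change the final decision.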
16 Claims
1. A method of recognizing speech, comprising:
extracting frame speech feature vectors from a speech signal;
performing speech recognition on frames of the speech signal using the frame speech feature vectors and a frame-based probability model;
dividing the speech signal into segments each of which is longer than each of the frames in terms of time;
extracting segment speech feature vectors around a boundary between the segments;
performing speech recognition on the segments of the speech signal using the segment speech feature vectors and a segment-based probability model; and
combining results of the speech recognition for the frames with results of the speech recognition for the segments.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
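The relationship between frames, segments, and boundary features in claim 1 can be illustrated with a small sketch. It assumes fixed-length frames, segments made of a fixed number of frames, mean energy as the frame feature, and a boundary feature built from the frame features within a few frames of each boundary. None of these choices comes from the claim text; all function names and parameters are hypothetical.

```python
def split_frames(signal, frame_len, hop):
    """Cut the signal into overlapping fixed-length frames."""
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def frame_energy(frame):
    """Toy frame feature: mean squared amplitude."""
    return sum(x * x for x in frame) / len(frame)


def segment_boundaries(num_frames, frames_per_segment):
    """Frame indices where one segment ends and the next begins."""
    return list(range(frames_per_segment, num_frames, frames_per_segment))


def boundary_features(frame_feats, boundaries, context):
    """Collect frame features within `context` frames of each boundary."""
    feats = []
    for b in boundaries:
        lo = max(0, b - context)
        hi = min(len(frame_feats), b + context)
        feats.append(frame_feats[lo:hi])
    return feats


signal = [float(n) for n in range(16)]           # stand-in for audio samples
frames = split_frames(signal, frame_len=4, hop=2)
energies = [frame_energy(f) for f in frames]
boundaries = segment_boundaries(len(frames), frames_per_segment=3)
seg_feats = boundary_features(energies, boundaries, context=1)
print(len(frames), boundaries, seg_feats)
```

Each segment spans several frames, so it is longer than any single frame, and each segment feature vector is drawn from the frames on either side of a boundary.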
9. An apparatus for recognizing speech, comprising:
a frame-based speech recognition unit for extracting frame speech feature vectors from a speech signal, and performing speech recognition on frames of the speech signal using the frame speech feature vectors and a frame-based probability model;
a segment division unit for dividing the speech signal into segments each of which is longer than each of the frames in terms of time;
a segment feature extraction unit for extracting segment speech feature vectors around a boundary between the segments;
a segment speech recognition performance unit for performing speech recognition on the segments of the speech signal using the segment speech feature vectors and a segment-based probability model; and
a combination and synchronization unit for combining results of the speech recognition obtained by the frame-based speech recognition unit with results of the speech recognition obtained by the segment speech recognition performance unit.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
Specification