METHOD AND APPARATUS FOR DETECTING SPEECH ENDPOINT USING WEIGHTED FINITE STATE TRANSDUCER
Abstract
Disclosed are an apparatus and a method for detecting a speech endpoint using a WFST. The apparatus in accordance with an embodiment of the present invention includes: a speech decision portion configured to receive frame units of feature vector converted from a speech signal and to analyze and classify the received feature vector into a speech class or a noise class; a frame level WFST configured to receive the speech class and the noise class and to convert the speech class and the noise class to a WFST format; a speech level WFST configured to detect a speech endpoint by analyzing a relationship between the speech class and noise class and a preset state; a WFST combination portion configured to combine the frame level WFST with the speech level WFST; and an optimization portion configured to optimize the combined WFST having the frame level WFST and the speech level WFST combined therein to have a minimum route.
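As an illustration of the speech decision portion only, the following is a minimal sketch of per-frame classification into a speech class or a noise class. The decision rule (mean squared value of the feature vector against a fixed threshold), the function name `classify_frames`, the labels "S" and "N", and the threshold value are all assumptions made for the example; the text above does not state how the classification is performed.

```python
from typing import List, Sequence


def classify_frames(frames: Sequence[Sequence[float]],
                    threshold: float = 0.5) -> List[str]:
    """Label each frame's feature vector "S" (speech) or "N" (noise).

    Hypothetical rule: compare the mean squared value of the vector
    with a fixed threshold. A real system may use any classifier.
    """
    labels = []
    for vec in frames:
        energy = sum(x * x for x in vec) / len(vec)
        labels.append("S" if energy > threshold else "N")
    return labels


# Four toy frames with two features each (made-up values).
frames = [[0.1, 0.0], [1.0, 1.2], [0.9, 1.1], [0.05, 0.1]]
labels = classify_frames(frames)
```

The resulting label sequence is what a frame level WFST would then take as its input.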
12 Claims
1. An apparatus for detecting a speech endpoint, comprising:
a speech decision portion configured to receive frame units of feature vector converted from a speech signal and to analyze and classify the received feature vector into a speech class or a noise class;
a frame level WFST configured to receive the speech class and the noise class and to convert the speech class and the noise class to a WFST format;
a speech level WFST configured to detect a speech endpoint by analyzing a relationship between the speech class and noise class and a preset state;
a WFST combination portion configured to combine the frame level WFST with the speech level WFST; and
an optimization portion configured to optimize the combined WFST having the frame level WFST and the speech level WFST combined therein to have a minimum route.
(Dependent claims: 2, 3, 4, 5, 6, 7)
8. A method for detecting a speech endpoint by receiving frame units of feature vector converted from a speech signal and detecting a speech endpoint by use of an apparatus for detecting a speech endpoint, the apparatus for detecting a speech endpoint executing:
analyzing and classifying the feature vector into a speech class and a noise class;
creating a frame level WFST by converting the speech class and the noise class to a WFST format after receiving the speech class and the noise class;
creating a speech level WFST detecting a speech endpoint by analyzing a relationship between the speech class and noise class and a preset state;
obtaining a combined WFST by combining the frame level WFST with the speech level WFST; and
optimizing the combined WFST.
(Dependent claims: 9, 10, 11, 12)
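The combining step recited above can be illustrated with a small, self-contained sketch of epsilon-free WFST composition in the tropical semiring (weights are added along matched arcs). Everything concrete here is hypothetical and chosen only for the example: the `Arc` and `Wfst` types, the state names ("noise", "speech", "n1", "end"), the symbols "S", "N", "-" and "EOU", the zero weights, and the rule that two consecutive noise frames after speech mark the endpoint. The optimizing step is not shown; this toy machine is already deterministic.

```python
from collections import namedtuple

Arc = namedtuple("Arc", "src ilabel olabel weight dst")


class Wfst:
    def __init__(self, start, finals, arcs):
        self.start = start
        self.finals = set(finals)
        self.arcs = list(arcs)

    def arcs_from(self, state):
        return [a for a in self.arcs if a.src == state]


def compose(a, b):
    """Epsilon-free composition: match a's output labels to b's input labels."""
    start = (a.start, b.start)
    arcs, finals, seen, stack = [], set(), {start}, [start]
    while stack:
        state = stack.pop()
        qa, qb = state
        if qa in a.finals and qb in b.finals:
            finals.add(state)
        for x in a.arcs_from(qa):
            for y in b.arcs_from(qb):
                if x.olabel == y.ilabel:
                    dst = (x.dst, y.dst)
                    arcs.append(Arc(state, x.ilabel, y.olabel,
                                    x.weight + y.weight, dst))
                    if dst not in seen:
                        seen.add(dst)
                        stack.append(dst)
    return Wfst(start, finals, arcs)


# Frame level: maps a per-frame decision (1 or 0) to a class symbol.
frame_level = Wfst(0, [0], [Arc(0, 1, "S", 0.0, 0), Arc(0, 0, "N", 0.0, 0)])

# Speech level: emits "EOU" after two consecutive noise frames following speech.
speech_level = Wfst("noise", ["end"], [
    Arc("noise", "N", "-", 0.0, "noise"),
    Arc("noise", "S", "-", 0.0, "speech"),
    Arc("speech", "S", "-", 0.0, "speech"),
    Arc("speech", "N", "-", 0.0, "n1"),
    Arc("n1", "S", "-", 0.0, "speech"),
    Arc("n1", "N", "EOU", 0.0, "end"),
])

combined = compose(frame_level, speech_level)


def find_endpoint(wfst, inputs):
    """Return the index of the frame on which "EOU" is emitted, or None."""
    state = wfst.start
    for i, sym in enumerate(inputs):
        arc = next((a for a in wfst.arcs_from(state) if a.ilabel == sym), None)
        if arc is None:
            return None
        if arc.olabel == "EOU":
            return i
        state = arc.dst
    return None


endpoint = find_endpoint(combined, [0, 1, 1, 0, 1, 0, 0, 1])
```

In this sketch the single isolated noise frame at index 3 does not trigger an endpoint, because the hypothetical speech level machine returns to the "speech" state when speech resumes.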
Specification