Start/end point detection for word recognition
First Claim
1. A method of detecting start and end points of words in a signal indicative of speech, with a detected start point indicating a beginning of a word and, at a same time, an end of a nonspeech interval, and a detected end point indicating an end of the word and, at a same time, a beginning of the nonspeech interval, comprising the steps of:
- dividing the signal indicative of speech into blocks,
- forming a current feature vector from at least two current features, a first of which is a function of a signal energy, and an at least second of which is a function of a quadratic difference between a linear predictive coding (LPC) cepstrum coefficient of a current block and an average LPC cepstrum coefficient,
- determining an average feature vector from a predefined number I of blocks containing a nonspeech interval, and updating said average feature vector on an occurrence of each new nonspeech interval, and
- using the current feature vector and the average feature vector to determine a check quantity (U) which, compared with a threshold value, provides information as to whether a nonspeech interval or word is present, thus detecting the start and end points.
Abstract
Word recognition requires precise and robust detection of the start and end points of words, even in very noisy surroundings. The method uses a feature with noise-resistant properties: for the feature vector, a function of the signal energy is formed as the first feature and a function of the quadratic difference of an LPC (linear predictive coding) cepstrum coefficient as the second feature. A check quantity, or the maximum of a set of distribution functions, is then calculated and compared with a threshold to detect the start and end points.
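The two features named in the abstract can be sketched as follows. This is a minimal illustration, not the patent's implementation: the abstract only says "a function of" the energy and of the quadratic cepstral difference, so the log energy, the summation over all coefficients, the LPC order, and every function name here are assumptions. The LPC cepstrum is obtained with the standard Levinson-Durbin recursion followed by the usual LPC-to-cepstrum recursion.

```python
import math


def autocorr(block, max_lag):
    """Autocorrelation r[0..max_lag] of one block of samples."""
    n = len(block)
    return [sum(block[t] * block[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]


def lpc_coefficients(r, order):
    """Levinson-Durbin recursion.

    Returns a[0..order] with a[0] == 1 for A(z) = sum_k a[k] z^-k.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate block (e.g. all zeros): leave the rest at zero
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def lpc_cepstrum(a):
    """Cepstrum c[1..order] of the all-pole model 1/A(z)."""
    order = len(a) - 1
    c = [0.0] * (order + 1)
    for n in range(1, order + 1):
        acc = sum(k * c[k] * a[n - k] for k in range(1, n))
        c[n] = -a[n] - acc / n
    return c[1:]


def block_features(block, mean_cepstrum, order=8):
    """Return ([log_energy, quadratic_difference], cepstrum) for one block.

    Assumption: feature 1 is the log of the block energy, feature 2 is the
    squared difference to the average cepstrum, summed over all coefficients.
    """
    energy = sum(x * x for x in block)
    log_energy = math.log(energy + 1e-10)  # small offset avoids log(0)
    cep = lpc_cepstrum(lpc_coefficients(autocorr(block, order), order))
    quad_diff = sum((ci - mi) ** 2 for ci, mi in zip(cep, mean_cepstrum))
    return [log_energy, quad_diff], cep
```

The returned cepstrum lets a caller maintain the average LPC cepstrum over nonspeech blocks and pass it back in as `mean_cepstrum` for later blocks.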
62 Citations
15 Claims
1. A method of detecting start and end points of words in a signal indicative of speech, with a detected start point indicating a beginning of a word and, at a same time, an end of a nonspeech interval, and a detected end point indicating an end of the word and, at a same time, a beginning of the nonspeech interval, comprising the steps of:
- dividing the signal indicative of speech into blocks,
- forming a current feature vector from at least two current features, a first of which is a function of a signal energy, and an at least second of which is a function of a quadratic difference between a linear predictive coding (LPC) cepstrum coefficient of a current block and an average LPC cepstrum coefficient,
- determining an average feature vector from a predefined number I of blocks containing a nonspeech interval, and updating said average feature vector on an occurrence of each new nonspeech interval, and
- using the current feature vector and the average feature vector to determine a check quantity (U) which, compared with a threshold value, provides information as to whether a nonspeech interval or word is present, thus detecting the start and end points.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
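The decision logic of claim 1 can be sketched as below. The claim does not give a formula for the check quantity U in this excerpt, so the variance-normalized squared distance used here is an assumption, as are the function name `detect_endpoints`, the default threshold, the size of the nonspeech history, and the premise that the first `num_init` blocks (the claim's predefined number I) are nonspeech.

```python
from collections import deque


def detect_endpoints(features, num_init=4, threshold=9.0, history=16):
    """Return a list of (start_block, end_block) index pairs.

    features: per-block feature vectors, e.g. [energy_feature, cepstral_feature].
    Assumption: the first num_init blocks contain only nonspeech.
    """
    # Recent nonspeech blocks; the average feature vector is recomputed from
    # this window, so it is updated whenever a new nonspeech block arrives.
    pause = deque(features[:num_init], maxlen=history)
    words, start = [], None
    for idx in range(num_init, len(features)):
        vec = features[idx]
        dims = range(len(vec))
        mean = [sum(p[d] for p in pause) / len(pause) for d in dims]
        var = [sum((p[d] - mean[d]) ** 2 for p in pause) / len(pause) + 1e-6
               for d in dims]
        # Hypothetical check quantity U: variance-normalized squared distance
        # between the current and the average feature vector.
        u = sum((vec[d] - mean[d]) ** 2 / var[d] for d in dims)
        if u > threshold:
            if start is None:
                start = idx  # start point: the nonspeech interval ends here
        else:
            if start is not None:
                words.append((start, idx - 1))  # end point: last word block
                start = None
            pause.append(vec)  # nonspeech block updates the average
    if start is not None:
        words.append((start, len(features) - 1))
    return words
```

A practical detector would likely add hysteresis or a minimum word duration so that a single noisy block does not open or close a word; that is left out to keep the sketch close to the wording of the claim.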
12. A method of detecting start and end points of words in a signal indicative of speech, comprising the steps of:
- dividing the signal indicative of speech into blocks,
- forming a current feature vector from at least two current features, a first of which is a function of signal energy, and an at least second of which is a function of a linear predictive coding (LPC) cepstrum coefficient,
- determining distribution functions by means of the functions of the current features, and
- determining for each block a maximum one of said distribution functions and comparing said maximum one of said distribution functions to a threshold as a measure of whether a nonspeech interval or word occurs between the detected start and end points.
Dependent claims: 13, 14
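The per-block decision of claim 12 can be sketched as below. The form of the distribution functions is not given in this excerpt, so the sketch assumes one empirical distribution function per feature, estimated from reference nonspeech blocks; the names `empirical_distribution` and `classify_blocks` and the default threshold are likewise hypothetical.

```python
from bisect import bisect_left


def empirical_distribution(samples):
    """Return F(x) = fraction of reference samples strictly below x."""
    ordered = sorted(samples)

    def distribution(x):
        return bisect_left(ordered, x) / len(ordered)

    return distribution


def classify_blocks(features, noise_features, threshold=0.95):
    """Return one boolean per block: True if the block is judged to be a word.

    noise_features: feature vectors from blocks known to be nonspeech
    (an assumption of this sketch).
    """
    dims = range(len(noise_features[0]))
    dists = [empirical_distribution([f[d] for f in noise_features])
             for d in dims]
    decisions = []
    for vec in features:
        # Maximum over the per-feature distribution functions for this block.
        peak = max(dists[d](vec[d]) for d in dims)
        decisions.append(peak > threshold)
    return decisions
```

Taking the maximum means a block is flagged as soon as any single feature lies far outside its nonspeech range, which is one plausible reading of "a maximum one of said distribution functions".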
15. Program module for detecting the start/end points of words in a signal indicative of speech, comprising:
- input/output (I/O) means, responsive to the signal indicative of speech for providing said signal indicative of speech; and
- a signal processor, responsive to said signal indicative of speech from said I/O means, for forming a current feature vector for detecting both a start point and an end point, and for forming at least a second feature with noise-resistant properties for said feature vector in which the current feature vector, an average feature vector and a check quantity (U) are formed for detecting both said start point and said end point and for forming a start/end point signal, wherein said I/O means is responsive to said start/end point signal for providing said start/end point signal as an output of said program module.
Specification