Speech detection using stochastic confidence measures on the frequency spectrum

US 6,327,564 B1
Filed: 03/05/1999
Issued: 12/04/2001
Est. Priority Date: 03/05/1999
Status: Expired due to Fees

Abstract

An accurate and reliable method is provided for detecting speech from an input speech signal. A probabilistic approach is used to classify each frame of the speech signal as speech or non-speech. The speech detection method is based on a frequency spectrum extracted from each frame, such that the value for each frequency band is considered to be a random variable and each frame is considered to be an occurrence of these random variables. Using the frequency spectra from a non-speech part of the speech signal, a known set of random variables is constructed. Next, each unknown frame is evaluated as to whether or not it belongs to this known set of random variables. To do so, a unique random variable (preferably a chi-square value) is formed from the set of random variables associated with the unknown frame. The unique variable is normalized with respect to the known set of random variables and then classified as either speech or non-speech using the “Test of Hypothesis”. Thus, each frame that belongs to the known set of random variables is classified as non-speech and each frame that does not belong to the known set of random variables is classified as speech.
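The flow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: it assumes the per-band energies M(n, f) have already been computed for every frame, it assumes the chi-square value is the sum of squared normalized band energies, and the function names, sample data, and threshold value are all hypothetical.

```python
import statistics


def fit_noise_model(noise_frames):
    """Estimate a per-band mean and standard deviation from non-speech frames.

    noise_frames: list of frames, each a list of band energies M(n, f).
    Assumes every band varies across the noise frames (non-zero deviation).
    """
    bands = list(zip(*noise_frames))  # transpose: one tuple per frequency band
    means = [statistics.fmean(band) for band in bands]
    stds = [statistics.pstdev(band) for band in bands]
    return means, stds


def normalize_frame(frame, means, stds):
    """Normalize each band energy against the noise model."""
    return [(m - mu) / sd for m, mu, sd in zip(frame, means, stds)]


def chi_square_value(normalized):
    """Assumed form of the chi-square value: sum of squared normalized bands."""
    return sum(v * v for v in normalized)


def is_non_speech(frame, means, stds, threshold):
    """A frame is treated as non-speech when its value stays under the threshold."""
    x = chi_square_value(normalize_frame(frame, means, stds))
    return x <= threshold


# Hypothetical data: four noise frames with three frequency bands each.
noise = [[1.0, 2.0, 3.0], [1.2, 1.8, 3.1], [0.8, 2.2, 2.9], [1.0, 2.0, 3.0]]
means, stds = fit_noise_model(noise)

# 11.34 is roughly the 99% point of a chi-square distribution with 3 degrees
# of freedom; it is used here purely as an illustrative threshold.
quiet = is_non_speech([1.1, 1.9, 3.0], means, stds, threshold=11.34)
loud = is_non_speech([5.0, 6.0, 7.0], means, stds, threshold=11.34)
```

In this toy data the first test frame sits close to the noise statistics and the second lies far outside them, so only the second would be labelled speech.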

10 Claims

1. A method for detecting speech from an input speech signal, comprising the steps of:
- sampling the input speech signal over a plurality of frames, each of the frames having a plurality of samples;
  
  determining an energy content value, M(f), for each of a plurality of frequency bands in a first frame of the input speech signal;
  
  normalizing each of the energy content values for the first frame with respect to energy content values from a non-speech part of the input speech signal;
  
  determining a chi-square value for each of the normalized energy content values associated with the first frame; and
  
  comparing the chi-square value to a threshold value, thereby determining if the first frame correlates to the non-speech part of the input speech signal.
2. The method of claim 1 wherein the step of comparing the chi-square value further comprises using a predefined confidence interval to determine the threshold value.
3. The method of claim 1 wherein the threshold value is provided by $X_\alpha = \sqrt{2}\,\mathrm{erfinv}(1 - 2\alpha)$.
4. The method of claim 1 wherein the step of normalizing each of the energy content values further comprises the steps of:
5. The method of claim 4 wherein the step of normalizing each of the energy content values is according to $M_{Norm}(n,f) = \frac{M(n,f) - \mu_N(f)}{\sigma_N(f)}$.
6. The method of claim 5 further comprises the step of using the first frame to verify the validity of the noise model.
7. The method of claim 6 wherein the step of using the unknown frame further comprises using an over-estimation measure according to $D = \sum_f M_{Norm}(n,f)$.
8. The method of claim 1 further comprises the step of normalizing the chi-square value, X, for the unknown frame, prior to comparing the chi-square value to the threshold value, whereby the normalizing is according to $X_{Norm} = \frac{X - F}{\sqrt{2F}}$, where F is the degrees of freedom for the chi-square distribution.
9. The method of claim 1 further comprises the steps of:
- determining chi-square values for each of the frames associated with the non-speech part of the input speech signal;
  
  determining a mean value, $\mu_x$, and a variance value, $\sigma_x$, for the chi-square values associated with the non-speech part of the input speech signal; and
  
  normalizing the chi-square value for the first frame using the mean value and the variance value of the chi-square values, prior to comparing the chi-square value of the first frame to the threshold value.
10. The method of claim 9 wherein the step of normalizing the chi-square value is according to $X_{Norm}(n) = \frac{X(n) - \mu_x}{\sigma_x}$.
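
The threshold and normalization formulas recited in the dependent claims might be expressed as below. This is a hedged sketch using only the Python standard library: the function names are hypothetical, the identity that √2·erfinv(1 − 2α) equals the (1 − α) quantile of the standard normal distribution is used because the standard library has no `erfinv`, and σ_x is treated here as a standard deviation, which is an assumption about the intended scaling.

```python
import math
import statistics


def threshold_from_alpha(alpha):
    """X_alpha = sqrt(2) * erfinv(1 - 2*alpha).

    Computed as the (1 - alpha) quantile of the standard normal distribution,
    which is mathematically the same quantity.
    """
    return statistics.NormalDist().inv_cdf(1.0 - alpha)


def normalize_by_dof(x, dof):
    """X_Norm = (X - F) / sqrt(2F), with F the chi-square degrees of freedom."""
    return (x - dof) / math.sqrt(2.0 * dof)


def normalize_by_noise_stats(x, noise_chi_values):
    """X_Norm(n) = (X(n) - mu_x) / sigma_x, with statistics from noise frames.

    Assumption: sigma_x is taken as the population standard deviation.
    """
    mu_x = statistics.fmean(noise_chi_values)
    sigma_x = statistics.pstdev(noise_chi_values)
    return (x - mu_x) / sigma_x


def over_estimation(normalized_frame):
    """D = sum over f of M_Norm(n, f), the over-estimation measure."""
    return sum(normalized_frame)


# Illustrative use with hypothetical numbers: 8 frequency bands, alpha = 0.05.
threshold = threshold_from_alpha(0.05)
x_norm = normalize_by_dof(16.0, 8)
frame_is_speech = x_norm > threshold
```

A one-sided comparison is assumed in the last line; the claims only state that the chi-square value is compared to the threshold.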

Current Assignee
Panasonic Corporation Of North America (Panasonic Holdings Corporation)
Original Assignee
Matsushita Electric Corporation Of America (Panasonic Holdings Corporation)
Inventors
Junqua, Jean-Claude, Gelin, Philippe
Primary Examiner(s)
Korzuch, William
Assistant Examiner(s)
Chawan, Vijay B

Application Number
US09/263,292
Time in Patent Office
1,005 Days
Field of Search
704/233, 704/225, 704/215, 704/256, 704/258, 704/226, 704/227, 704/228, 704/234, 704/240
US Class Current
704/233
CPC Class Codes
G10L 25/78 Detection of presence or ab...
