Method and apparatus to detect and delimit foreground speech

US 6,134,524 A
Filed: 10/24/1997
Issued: 10/17/2000
Est. Priority Date: 10/24/1997
Status: Expired due to Term

Abstract

The present invention provides improved foreground-speech signal endpointing by computing a spectral stationarity statistic. This statistic is used by a finite state machine to endpoint speech. Endpointing using the spectral stationarity statistic is less susceptible to background noise than endpointing using conventional measures. The present invention uses frame-synchronous quantile estimation to generate a mask signal for signal-to-noise ratio normalization.
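
The abstract does not say how the frame-synchronous quantile estimation is carried out. As a rough illustration only, the sketch below tracks a running quantile with a standard stochastic-approximation update, one update per frame. The function name `track_quantile`, the `step` parameter, and the update rule itself are assumptions chosen for this example and are not taken from the patent.

```python
def track_quantile(frames, p, step=0.05):
    """Return a per-frame running estimate of the p-quantile of `frames`.

    Assumed update rule (not from the patent): nudge the estimate up by
    step * p when the frame exceeds it, and down by step * (1 - p) otherwise.
    The estimate settles where a fraction p of frames falls at or below it.
    """
    estimate = frames[0]          # start from the first frame
    history = []
    for x in frames:
        if x > estimate:
            estimate += step * p
        else:
            estimate -= step * (1.0 - p)
        history.append(estimate)
    return history


# Hypothetical channel-energy values cycling through 0..9.
frames = [float(i % 10) for i in range(2000)]
high = track_quantile(frames, p=0.9)   # follows the louder frames
low = track_quantile(frames, p=0.1)    # follows the quieter frames
```

A high and a low estimate of this kind correspond to the "high quantile estimation and a low quantile estimation" named in claims 7, 19 and 24; how the patent combines them into a mask signal is not reproduced here.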

24 Claims

1. A method for processing data in a voice recognition system capable of receiving foreground speech in the presence of background noise, comprising the steps, performed by a processor, of:
- extracting a channel signal;
  
  generating a mask signal from the channel signal;
  
  masking the extracted channel signal with the mask signal; and
  
  taking a sample standard deviation of the masked channel signal over a temporal window; and
  
  generating foreground speech endpoints using the sample standard deviation determined during said taking step.
- - 2. The method of claim 1, wherein the extracting step extracts a channel energy signal.
  - 3. The method of claim 2, further comprising the step of:
    - performing a background normalization on the sample standard deviation.
  - 4. The method of claim 3, wherein the step of performing background normalization comprises the substeps of:
    - filtering the masked channel energy signal to produce an estimated background signal; and
      
      subtracting the estimated background signal from the masked channel energy signal.
  - 5. The method of claim 4, wherein the step of filtering comprises the substeps of:
    - filtering the masked signal using a previous background estimator;
      
      filtering the masked signal using an advanced background estimator; and
      
      selecting the minimum of the filtered masked signals as the estimated background signal.
  - 6. The method of claim 2, wherein generating the mask signal includes the substeps of:
    - storing a previous mask signal; and
      
      generating the mask signal from the channel signal and the stored previous mask signal.
  - 7. The method of claim 2, further comprising the step of:
    - computing a high quantile estimation and a low quantile estimation.
  - 8. The method of claim 7, wherein the step of generating the mask signal includes the substep of:
    - equalizing the separations between the computed high quantile estimate and the extracted channel energy signal and between the computed low quantile estimate and the extracted channel energy signal.
  - 9. The method of claim 2, wherein the step of masking the extracted channel energy signal includes the substep of:
    - adding the generated mask signal to the extracted channel energy signal.
  - 10. The method of claim 2, further comprising the step of:
    - smoothing the masked channel energy signal.
  - 11. The method of claim 10, further comprising the step of:
    - taking a square root of the variance.
  - 12. The method of claim 2, wherein the step of taking the sample standard deviation comprises the substeps of:
    - storing a plurality of previously taken masked signal values in a buffer;
      
      replacing a least current of the plurality of masked signal values with the current masked signal value; and
      
      computing the sample variance between the plurality of masked signal values stored in the buffer.
  - 13. The method of claim 2, further comprising the step of:
    - transforming the extracted channel energy signal.
  - 14. The method of claim 13, wherein the transforming step includes taking a generalized logarithm (root) of the extracted channel energy signal.
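
The steps of claims 1, 9 and 12 can be sketched roughly as follows: the mask is added to the channel energy, a sample standard deviation is taken over a fixed-size buffer in which the oldest value is replaced by the current one, and endpoints are read off the resulting statistic. This is a minimal sketch under stated assumptions, not the patented implementation. The function names, the window length, the fixed threshold, and the simple two-state endpoint logic are all hypothetical.

```python
import math
from collections import deque


def sliding_sample_std(values, window):
    """Sample standard deviation of the most recent `window` values, per frame."""
    buf = deque(maxlen=window)    # appending to a full deque drops the oldest value
    out = []
    for v in values:
        buf.append(v)
        n = len(buf)
        if n < 2:
            out.append(0.0)       # a single value has no sample variance
            continue
        mean = sum(buf) / n
        variance = sum((b - mean) ** 2 for b in buf) / (n - 1)
        out.append(math.sqrt(variance))
    return out


def find_endpoints(stat, threshold):
    """Assumed two-state machine: return (start, end) frame pairs where stat > threshold."""
    segments = []
    start = None
    for i, s in enumerate(stat):
        if start is None and s > threshold:
            start = i                      # enter the "speech" state
        elif start is not None and s <= threshold:
            segments.append((start, i))    # return to the "background" state
            start = None
    if start is not None:
        segments.append((start, len(stat)))
    return segments


def endpoint_frames(energy, mask, window=5, threshold=1.0):
    """Mask by addition (claim 9), take the windowed deviation, then endpoint."""
    masked = [e + m for e, m in zip(energy, mask)]
    return find_endpoints(sliding_sample_std(masked, window), threshold)


# Hypothetical input: flat background with a fluctuating burst in the middle.
energy = [1.0] * 10 + [1.0, 9.0, 2.0, 8.0, 1.0, 9.0, 2.0, 8.0] + [1.0] * 10
mask = [0.5] * len(energy)
segments = endpoint_frames(energy, mask)
```

Because the statistic measures how much the masked signal fluctuates rather than how loud it is, a steady background yields a deviation near zero regardless of its level, which is the stationarity idea described in the abstract.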

15. An apparatus in a voice recognition system capable of receiving foreground speech in the presence of background noise, comprising:
- means for extracting a channel signal;
  
  means for generating a mask signal from the channel signal;
  
  means for masking the extracted channel signal using the generated mask signal; and
  
  means for taking a sample standard deviation of the masked channel signal over a temporal window, and
  
  means for generating foreground speech endpoints using the sample standard deviation determined by said means for taking.
- - 16. The apparatus of claim 15, wherein the extracting means extracts a channel energy signal.
  - 17. The apparatus of claim 15, further comprising:
    - means for performing a background normalization on the sample standard deviation.
  - 18. The apparatus of claim 15, further comprising:
    - a smoothing filter.
  - 19. The apparatus of claim 15, further comprising:
    - means for computing a high quantile estimate and a low quantile estimate.
  - 20. The apparatus of claim 15, further comprising:
    - means for generating a background estimate signal; and
      
      means for subtracting the background estimate signal from the sample standard deviation.
  - 21. The apparatus of claim 15, wherein the means for generating a background estimate signal comprises:
    - a previous background estimator;
      
      an advance background estimator; and
      
      a minimizer to output the minimum of the previous background estimator and the advance background estimator as the background estimate signal.
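
Claims 20 and 21 describe a background estimate formed as the minimum of a previous estimator and an advance estimator, which is then subtracted from the sample standard deviation. The claims do not define the two estimators. The sketch below assumes, purely for illustration, that the previous estimator is a moving average over the frames before the current one and the advance estimator is a moving average over the frames after it. The name `normalize_background` and the `span` parameter are hypothetical.

```python
def normalize_background(stat, span=3):
    """Subtract an estimated background level from each frame of `stat`.

    Assumed estimators (not from the patent):
      previous = mean of up to `span` frames before the current frame
      advance  = mean of up to `span` frames after the current frame
    The background estimate is the minimum of the two.
    """
    out = []
    for i in range(len(stat)):
        before = stat[max(0, i - span):i] or [stat[i]]   # fall back at the edges
        after = stat[i + 1:i + 1 + span] or [stat[i]]
        previous = sum(before) / len(before)
        advance = sum(after) / len(after)
        background = min(previous, advance)              # the "minimizer" of claim 21
        out.append(stat[i] - background)
    return out


# Hypothetical statistic: quiet, then a raised stretch, then quiet again.
stat = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 1.0, 1.0, 1.0]
normalized = normalize_background(stat)
```

Taking the minimum means that at an onset the quiet frames before it set the background level, and at an offset the quiet frames after it do, so under these assumptions the edges of the raised stretch are not suppressed by the stretch itself.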

22. A computer program product comprising:
- a computer usable medium having computer readable code embodied therein for processing data in a voice recognition system, the computer usable medium comprising:
  
  an extracting module configured to extract a channel energy signal;
  
  a mask generating module configured to generate a mask signal from the channel energy signal;
  
  a masking module configured to mask the extracted channel energy signal with the generated mask signal; and
  
  a standard deviation module configured to take a sample standard deviation of the masked extracted channel energy signal over a temporal window, and
  
  an end point generating module configured to generate foreground speech endpoints using the sample standard deviation determined by said standard deviation module.
- - 23. The computer program product of claim 22, further comprising:
    - a background normalization module configured to perform background normalization on the sample standard deviation.
  - 24. The computer program product of claim 22, further comprising:
    - a computing module configured to compute a high quantile estimation and a low quantile estimation.

Current Assignee
Avaya Incorporated
Original Assignee
Nortel Networks Corporation
Inventors
Boies, Daniel; Peters, Stephen Douglas
Primary Examiner(s)
ARMSTRONG, ANGELA A

Application Number

US08/950,417
Time in Patent Office

1,089 Days
Field of Search

704/233, 704/200, 704/201, 704/248, 704/253, 704/231, 704/226, 704/227, 704/228
US Class Current

704/233
CPC Class Codes

G10L 25/87 Detection of discrete point...
