Method of speech detection

US 5,572,623 A
Filed: 10/21/1993
Issued: 11/05/1996
Est. Priority Date: 10/21/1992
Status: Expired due to Term

Abstract

A method for detecting the start and end of speech from a noisy signal including the steps of:

detecting a voiced frame;

searching for noise frames preceding this voiced frame;

constructing an autoregressive model of the noise and a mean noise spectrum;

bleaching the frames preceding the voicing;

searching for the actual start of speech in the bleached frames;

removing the noise from the voiced frames and parameterizing them; and

searching for the actual end of speech.
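
The noise-modelling and bleaching steps above can be illustrated with a short sketch. It assumes the autoregressive model is fitted by the Levinson-Durbin recursion on the autocorrelation of the noise frames, and that "bleaching" means passing a frame through the resulting prediction-error (whitening) filter. The abstract does not name either choice, and the function names, the model order and the test signal are hypothetical.

```python
import numpy as np

def autocorr(x, order):
    """Biased autocorrelation estimates r[0..order] of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    return np.array([np.dot(x[: n - k], x[k:]) / n for k in range(order + 1)])

def levinson_durbin(r):
    """Solve the Yule-Walker equations; returns (a, err) with a[0] == 1."""
    order = len(r) - 1
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def noise_model(noise_frames, order=2):
    """AR coefficients and mean power spectrum of a 2-D array of noise frames."""
    frames = np.asarray(noise_frames, dtype=float)
    a, _ = levinson_durbin(autocorr(frames.ravel(), order))
    mean_spectrum = np.mean(np.abs(np.fft.rfft(frames, axis=1)) ** 2, axis=0)
    return a, mean_spectrum

def bleach(signal, a):
    """Apply the FIR prediction-error filter A(z) built from the AR model."""
    return np.convolve(signal, a)[: len(signal)]

# Hypothetical coloured noise: x[n] = 0.9 * x[n-1] + w[n].
rng = np.random.default_rng(0)
w = rng.standard_normal(2048)
x = np.zeros_like(w)
for n in range(1, len(x)):
    x[n] = 0.9 * x[n - 1] + w[n]

a, mean_spectrum = noise_model(x.reshape(8, 256), order=2)
white = bleach(x, a)
```

After bleaching, the background noise is approximately white, so a frame in which speech has started stands out against a flat residual, which is the property the start-of-speech search depends on.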

11 Claims

1. A method of detecting speech in noisy signals, comprising the steps of:
- sampling plural speech frames including plural noise frames, at least one voiced frame and additional plural noise frames after said at least one voiced frame;
  
  identifying said at least one voiced frame;
  
  identifying said plural noise frames preceding said at least one voiced frame;
  
  constructing an autoregressive model of noise and a mean noise spectrum based on said plural noise frames preceding said at least one voiced frame;
  
  bleaching said plural noise frames preceding said at least one voiced frame by using a rejector filter;
  
  removing noise by spectral noise removal from said plural noise frames preceding said at least one voiced frame;
  
  finding an actual start of speech in the bleached plural noise frames;
  
  extracting acoustic vectors used by a voice recognition system from the plural noise-removed frames lying between the actual start of speech and a first of said at least one voiced frame;
  
  removing noise from and parameterizing said at least one voiced frame;
  
  finding an actual end of speech; and
  
  removing noise and parameterizing frames lying between a last of said at least one voiced frame and the actual end of speech.
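
The spectral noise removal step of claim 1 can be sketched as follows. Claim 1 does not specify the gain rule; this example assumes a Wiener-style gain computed from the mean noise spectrum, with a small spectral floor to limit musical noise. The function names, the `floor` parameter and its value are hypothetical.

```python
import numpy as np

def mean_noise_spectrum(noise_frames):
    """Average power spectrum over a 2-D array of noise-only frames."""
    frames = np.asarray(noise_frames, dtype=float)
    return np.mean(np.abs(np.fft.rfft(frames, axis=1)) ** 2, axis=0)

def remove_noise(frame, noise_psd, floor=0.05):
    """Attenuate each frequency bin by a Wiener-style gain (assumed rule)."""
    frame = np.asarray(frame, dtype=float)
    spec = np.fft.rfft(frame)
    power = np.abs(spec) ** 2
    # Estimated clean power: observed power minus mean noise power, clipped at 0.
    clean_power = np.maximum(power - noise_psd, 0.0)
    gain = np.maximum(clean_power / np.maximum(power, 1e-12), floor)
    return np.fft.irfft(gain * spec, n=len(frame))

# Hypothetical usage: a tone buried in white noise.
rng = np.random.default_rng(1)
n = 256
clean = 5.0 * np.sin(2 * np.pi * 10 * np.arange(n) / n)
noise_psd = mean_noise_spectrum(rng.standard_normal((20, n)))
denoised = remove_noise(clean + rng.standard_normal(n), noise_psd)
```

Bins dominated by noise are pushed down to the floor, while bins where the signal is well above the mean noise spectrum pass almost unchanged.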
- - 2. The method as claimed in claim 1, wherein the step of bleaching comprises:
    - using a rejector filter constructed in said constructing step.
  - 3. The method as claimed in claim 2, further comprising the steps of:
    - reinitializing processing parameters after the last of said at least one voiced frame has been parameterized.
  - 4. The method as claimed in claim 1, wherein the step of sampling comprises:
    - sampling frames of signals to be processed; and
      
      processing the detected frames by Fourier transforms, wherein, when two Fourier transforms are consecutive in time, the two Fourier transforms are calculated over three consecutive frames with an overlap of one frame.
  - 5. The method as claimed in claim 1, wherein the step of identifying said at least one voiced frame comprises:
    - calculating a pitch for each of the sampled plural speech frames; and
      
      determining, for each of the sampled plural speech frames, if a voicing is present in a frame based on the calculated value of a pitch corresponding to said each frame.
  - 6. The method as claimed in claim 5, wherein the step of identifying said at least one voiced frame comprises:
    - identifying said at least one voiced frame after having determined that at least three voiced frames are in series without a hole bigger than a maximum hole size.
  - 7. The method as claimed in claim 5, wherein the step of calculating the pitch of one of said sampled plural speech frames comprises:
    - calculating a correlation of a signal of said one frame with a delayed form of the signal of said one frame.
  - 8. The method as claimed in claim 1, further comprising the step of:
    - detecting unvoiced sounds by thresholding.
  - 9. The method as claimed in claim 1, further comprising the step of:
    - detecting unvoiced speech based on a distance between a vocal kernel and a fricative block, and a size of said fricative block.
  - 10. The method as claimed in claim 1, wherein the steps of removing noise from said plural noise frames preceding said at least one voiced frame comprises:
    - obtaining a mean noise spectrum of the plural noise frames preceding said at least one voiced frame by Wiener filtering; and
      
      removing noise based on the obtained mean noise spectrum.
  - 11. The method as claimed in claim 10, further comprising the step of:
    - applying a smooth correlogram to the mean noise spectrum.
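
The voicing test of claims 5 to 7 can be sketched as below: a pitch is estimated by correlating each frame with a delayed copy of itself, a frame counts as voiced when the correlation peak is strong enough, and voicing is confirmed only after three voiced frames occur in series with no hole larger than a maximum size. This is a minimal sketch under stated assumptions: a "hole" is read as a run of unvoiced frames inside the series, and the threshold, pitch range, sampling rate and function names are hypothetical values not given in the claims.

```python
import numpy as np

def pitch_score(frame, fs, fmin=60.0, fmax=400.0):
    """Return (pitch_hz, strength) from the normalized autocorrelation peak."""
    x = np.asarray(frame, dtype=float)
    x = x - x.mean()
    energy = np.dot(x, x)
    if energy <= 0.0:
        return 0.0, 0.0
    lag_min = max(1, int(fs / fmax))
    lag_max = min(int(fs / fmin), len(x) - 1)
    lags = np.arange(lag_min, lag_max + 1)
    # Correlate the frame with a delayed copy of itself for each candidate lag.
    corr = np.array([np.dot(x[:-k], x[k:]) for k in lags]) / energy
    best = int(np.argmax(corr))
    return fs / lags[best], float(corr[best])

def is_voiced(frame, fs, threshold=0.5):
    """Voicing decision from the strength of the pitch peak (assumed rule)."""
    return pitch_score(frame, fs)[1] >= threshold

def first_confirmed_voiced(flags, min_voiced=3, max_hole=1):
    """Index of the first frame of a series with min_voiced voiced frames and
    no hole longer than max_hole unvoiced frames; None if there is none."""
    start, count, hole = None, 0, 0
    for i, voiced in enumerate(flags):
        if voiced:
            if start is None:
                start, count = i, 0
            count += 1
            hole = 0
            if count >= min_voiced:
                return start
        elif start is not None:
            hole += 1
            if hole > max_hole:
                start, count, hole = None, 0, 0
    return None

# Hypothetical usage: a 200 Hz tone sampled at 8 kHz in a 256-sample frame.
fs = 8000
tone = np.sin(2 * np.pi * 200 * np.arange(256) / fs)
pitch_hz, strength = pitch_score(tone, fs)
```

Requiring several voiced frames in series guards against isolated frames of periodic noise being taken for speech, while the hole tolerance keeps a brief dropout from resetting the count.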

Current Assignee: Sextant Avionique
Original Assignee: Sextant Avionique
Inventors: Pastor, Dominique
Primary Examiner(s): MacDonald, Allen R.
Assistant Examiner(s): ONKA, THOMAS
Application Number: US08/139,740
Time in Patent Office: 1,111 Days
Field of Search: 381/34, 381/47, 381/38, 381/43, 381/46, 381/122, 367/1, 342/15, 324/76.47, 324/76.55, 984/260, 395/2.42, 395/2.17, 395/2.34, 395/2.57, 395/2.62
US Class Current: 704/233
CPC Class Codes:
G10L 2025/932   Decision in previous or fol...
G10L 2025/937   Signal energy in various fr...
G10L 21/0216   characterised by the method...
G10L 25/27   characterised by the analys...
G10L 25/87   Detection of discrete point...