Method and apparatus for detecting voice activity in a speech signal

US 6,188,981 B1
Filed: 09/18/1998
Issued: 02/13/2001
Est. Priority Date: 09/18/1998
Status: Expired due to Term

Abstract

A method and apparatus for generating frame voicing decisions for an incoming speech signal having periods of active voice and non-active voice for a speech encoder in a speech communications system. A predetermined set of parameters is extracted from the incoming speech signal, including a pitch gain and a pitch lag. A frame voicing decision is made for each frame of the incoming speech signal according to values calculated from the extracted parameters. The predetermined set of parameters further includes a frame full band energy, and a set of spectral parameters called Line Spectral Frequencies (LSF).
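
The abstract does not say which values are calculated from the pitch gain and pitch lag. Claim 1 names two: a standard deviation of the pitch lag over consecutive subframes and a long-term average of the pitch gain. The Python sketch below illustrates those two statistics and one possible way to combine them. The names (`VoicingState`, `pitch_lag_std`, `update_mean_pitch_gain`, `frame_voicing_decision`), the exponential form of the long-term average, the smoothing factor, and both thresholds are assumptions made for illustration and are not taken from the patent.

```python
import math
from dataclasses import dataclass


@dataclass
class VoicingState:
    """Running state carried across frames (hypothetical layout)."""
    mean_pitch_gain: float = 0.0
    frames_seen: int = 0


def pitch_lag_std(lags):
    """Population standard deviation of pitch lags over consecutive subframes."""
    if not lags:
        raise ValueError("at least one subframe lag is required")
    mean = sum(lags) / len(lags)
    return math.sqrt(sum((lag - mean) ** 2 for lag in lags) / len(lags))


def update_mean_pitch_gain(state, frame_gain, alpha=0.9):
    """Update the long-term average pitch gain.

    An exponential moving average is assumed here; the claims only say
    "long term average" without fixing its form.
    """
    if state.frames_seen == 0:
        state.mean_pitch_gain = frame_gain
    else:
        state.mean_pitch_gain = (
            alpha * state.mean_pitch_gain + (1.0 - alpha) * frame_gain
        )
    state.frames_seen += 1
    return state.mean_pitch_gain


def frame_voicing_decision(state, lags, frame_gain,
                           max_lag_std=1.5, min_mean_gain=0.5):
    """Return True (active voice) when the lag is stable and the gain is high.

    Both thresholds and the AND combination are illustrative placeholders.
    """
    std = pitch_lag_std(lags)
    mean_gain = update_mean_pitch_gain(state, frame_gain)
    return std < max_lag_std and mean_gain > min_mean_gain
```

The intuition behind this choice is that voiced speech tends to show a steady pitch lag and a high pitch gain, while noise tends to show an erratic lag and a low gain. A real detector would tune the thresholds and would likely weigh these statistics together with the other extracted parameters.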

13 Claims

1. In a speech communication system, a method for generating a frame voicing decision, the steps of the method comprising:
- extracting a set of parameters, including pitch gain and pitch lag, from an incoming speech signal, for each frame;
- calculating a standard deviation of the pitch lag from the extracted parameters over a consecutive number of subframes;
- calculating a long term average of the pitch gain from the extracted parameters; and
- making a frame voicing decision according to the results of said calculation step.

2. The method according to claim 1, wherein the extracted set of parameters further comprises a full band energy and line spectral frequencies (LSF).
3. The method according to claim 2, further comprising the steps of:
4. The method according to claim 3, further comprising the steps of:
- calculating a spectral difference SD₁ using a normalized Itakura-Saito measure;
- calculating a spectral difference SD₂ using a mean square error method;
- calculating a spectral difference SD₃ using a mean square error method; and
- calculating a long-term mean of SD₂.

5. The method according to claim 4, wherein the frame voicing decision is made based on the calculated values.
6. The method according to claim 5, further comprising the step of smoothing the frame voicing decision.
7. The method according to claim 6, further comprising the step of performing an initialization for a predetermined number of initial frames, such that the voicing decision is set to active voice or non-active voice.
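
Claims 6 and 7 add smoothing of the frame voicing decision and a forced decision during a predetermined number of initial frames, without saying how either is done. As a rough illustration, the Python sketch below uses a hangover counter, a common smoothing scheme in voice activity detectors that is swapped in here as an assumption. The function name `smooth_decisions` and the parameters `init_frames`, `init_value`, and `hangover` are hypothetical.

```python
def smooth_decisions(raw, init_frames=2, init_value=True, hangover=2):
    """Smooth a sequence of raw per-frame voicing decisions.

    raw         -- iterable of booleans (True = active voice)
    init_frames -- number of initial frames whose decision is forced
    init_value  -- decision forced during initialization
    hangover    -- frames kept active after the last raw active frame

    The hangover scheme is an assumed stand-in for the unspecified
    smoothing step; it is not the method described in the patent.
    """
    smoothed = []
    frames_left = 0  # remaining hangover frames
    for index, active in enumerate(raw):
        if index < init_frames:
            # Initialization: ignore the raw decision entirely.
            decision = init_value
            frames_left = 0
        elif active:
            decision = True
            frames_left = hangover
        elif frames_left > 0:
            # Raw decision is inactive, but keep the frame active so that
            # short dips inside a speech burst are not clipped.
            decision = True
            frames_left -= 1
        else:
            decision = False
        smoothed.append(decision)
    return smoothed
```

With `init_frames=2` and `hangover=2`, a single raw active frame stays active for two further frames before the output returns to non-active voice.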

8. A Voice Activity Detector (VAD) for making a voicing decision on an incoming speech signal frame, the VAD comprising:
- an extractor for extracting a set of parameters, including pitch gain and pitch lag, from the incoming speech signal for each frame;
- a calculator unit for calculating a standard deviation of the pitch lag from the extracted parameters over a consecutive number of subframes and a long term mean pitch gain from the extracted parameters; and
- a decision unit for making a frame voicing decision according to the results from the calculator unit.

9. The VAD according to claim 8, wherein the extractor also extracts the parameters full band energy and line spectral frequencies (LSF).
10. The VAD according to claim 9, wherein the calculator unit further calculates:
11. The VAD according to claim 10, wherein the calculator unit further calculates:
- a spectral difference SD₁ using a normalized Itakura-Saito measure;
- a spectral difference SD₂ using a mean square error method;
- a spectral difference SD₃ using a mean square error method; and
- a long-term mean of SD₂.

12. The VAD according to claim 11, wherein the decision unit makes a frame voicing decision according to the values calculated by the calculator unit.
13. The VAD according to claim 12, wherein the voicing decision is smoothed.
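
Claims 4 and 11 name three spectral differences and a long-term mean but do not state which vectors are compared or how the Itakura-Saito measure is normalized. The Python sketch below shows only the generic building blocks: a mean square error between two equal-length vectors (for example two LSF vectors), the plain (unnormalized) Itakura-Saito divergence between two power spectra, and a cumulative running mean. The names `mean_square_error`, `itakura_saito`, and `RunningMean` are hypothetical, and the choice of a cumulative mean for "long-term mean" is an assumption.

```python
import math


def mean_square_error(a, b):
    """Mean square error between two equal-length vectors."""
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def itakura_saito(p, q):
    """Plain Itakura-Saito divergence between two power spectra.

    Computes the mean over bins of p/q - ln(p/q) - 1. All bins must be
    strictly positive. The normalization mentioned in the claims is not
    specified there and is not reproduced here.
    """
    if len(p) != len(q) or not p:
        raise ValueError("spectra must be non-empty and of equal length")
    return sum(pk / qk - math.log(pk / qk) - 1.0 for pk, qk in zip(p, q)) / len(p)


class RunningMean:
    """Cumulative mean of a stream of values (assumed form of a long-term mean)."""

    def __init__(self):
        self.count = 0
        self.value = 0.0

    def update(self, x):
        # Incremental update avoids storing the whole history.
        self.count += 1
        self.value += (x - self.value) / self.count
        return self.value
```

The Itakura-Saito divergence is zero only when the two spectra are identical and is asymmetric in its arguments, so the order in which the current and reference spectra are passed matters.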

Current Assignee: HTC Corporation
Original Assignee: Conexant Systems Incorporated (Synaptics Incorporated)
Inventors: Benyassine, Adil; Shlomot, Eyal
Primary Examiner(s): Smits, Talivaldis I.
Assistant Examiner(s): Nolan, Daniel A.
Application Number: US09/156,416
Time in Patent Office: 879 Days
Field of Search: 704/233, 704/219, 704/246, 704/214, 704/240, 704/243, 704/231, 704/207, 709/247
US Class Current: 704/233
CPC Class Codes: G10L 25/78 Detection of presence or ab...
