Voice detection apparatus

US 5,103,481 A
Filed: 04/10/1990
Issued: 04/07/1992
Est. Priority Date: 04/10/1989
Status: Expired due to Term

Abstract

Speech presence versus silence is decided by a discriminator which can use a certain combination of parameter values: signal power, prediction error power, prediction error power deviation, and zero crossings.

19 Claims

1. A voice detection apparatus comprising:
- signal power calculation means for receiving an input voice signal that comprises a plurality of frames and has voiced and silent intervals and for calculating a signal power of the input voice signal for each of the frames;
  
  zero crossing counting means for counting a number of polarity inversions of the input voice signal for each of the frames;
  
  adaptive prediction filter means for obtaining a prediction error signal of the input voice signal for each of the frames;
  
  error signal power calculation means for calculating an error signal power of the prediction error signal for each of the frames;
  
  power comparing means for comparing the signal power of the input voice signal and the error signal power of the prediction error signal and for obtaining a power ratio responsive to the comparing; and
  
  discriminating means for discriminating the voiced and silent intervals based on the signal power, the counted number of polarity inversions and the power ratio, said discriminating means including:
  
  first means for discriminating the voiced and silent intervals of the input voice signal based on the counted number of polarity inversions, and
  
  second means for determining an absolute value of a difference of the power ratios between the frames, and for discriminating whether a frame is a voiced interval or a silent interval depending on a comparison of the absolute value with a first threshold value and whether a previous frame is a voiced interval or a silent interval when the signal power of the input voice signal is less than a second threshold value.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The voice detection apparatus as claimed in claim 1, further comprising:
    - means for sampling the input voice signal, and wherein said signal power calculation means includes means for calculating the signal power of the input voice signal based on ##EQU2## where SP denotes the signal power, n denotes a number of the samples, X_i denotes sectioning the input voice signal at predetermined time intervals and N denotes a number of the frames obtained from the sectioning of the input voice signal at the predetermined time intervals.
  - 3. The voice detection apparatus as claimed in claim 1, wherein said error signal power calculation means includes means for calculating the signal power of the prediction error signal.
  - 4. The voice detection apparatus as claimed in claim 1, wherein said zero crossing counting means comprises:
    - high pass filter means for filtering the input voice signal and for providing a first output signal having a polarity;
      
      polarity detection means for detecting the polarity of the first output signal and for providing a second output signal;
      
      delay means for delaying the second output signal and for providing a third output signal;
      
      polarity inversion detection means for detecting a polarity inversion of the first output signal based on the second output signal and the third output signal, and for providing a fourth output signal; and
      
      counter means for counting a number of polarity inversions based on the fourth output signal, said counter means being reset for every frame of the input voice signal.
  - 5. The voice detection apparatus as claimed in claim 1, wherein said adaptive prediction filter comprises a linear prediction filter.
  - 6. The voice detection apparatus as claimed in claim 5, which further comprises:
    - linear prediction analyzer means for obtaining a prediction coefficient for use by said linear prediction filter based on the input voice signal.
  - 7. The voice detection apparatus as claimed in claim 5, which further comprises:
    - linear prediction analyzer means for analyzing data of a previous frame to obtain a prediction coefficient based on the input voice signal.
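
The per-frame quantities recited in claims 1 and 4 can be sketched in a few lines of Python. This is an illustration only, not the patented implementation: the function names are hypothetical, the mean-square normalisation of the signal power is an assumption (the formula of claim 2 is not reproduced in this listing), and the high pass filter of claim 4 is omitted.

```python
def split_frames(samples, frame_len):
    """Section the signal into consecutive, non-overlapping frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def signal_power(frame):
    """Mean squared amplitude of one frame (assumed normalisation)."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossings(frame):
    """Count polarity inversions within one frame.

    Follows the structure of claim 4: detect the polarity of each sample,
    delay that polarity by one sample, and count the positions where the
    current and delayed polarities differ. The count starts from zero for
    every frame, which plays the role of the per-frame counter reset.
    """
    polarity = [x >= 0 for x in frame]   # polarity detection
    delayed = polarity[:-1]              # one-sample delay
    current = polarity[1:]
    return sum(1 for p, q in zip(delayed, current) if p != q)


samples = [1.0, -1.0, 1.0, -1.0, 0.5, 0.5, 0.5, 0.5]
for frame in split_frames(samples, 4):
    print(signal_power(frame), zero_crossings(frame))
```

The first frame alternates in sign on every sample, so it yields a high zero crossing count, while the constant second frame yields none.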

8. A voice detection apparatus comprising:
- signal power calculation means for receiving an input voice signal that comprises a plurality of frames and has voiced and silent intervals and for calculating a signal power of the input voice signal for each of the frames;
  
  zero crossing counting means for counting a number of polarity inversions of the input voice signal for each of the frames;
  
  prediction gain deviation calculation means for calculating a prediction gain and a prediction gain deviation between frames based on the input voice signal and the signal power calculated in said signal power calculation means; and
  
  discriminating means for discriminating the voiced and the silent intervals based on the signal power, the counted number of polarity inversions and the prediction gain and the prediction gain deviation, said discriminating means including:
  
  first means for discriminating the voiced and silent intervals of the input voice signal based on when the signal power is greater than or equal to a first threshold value and the counted number of polarity inversions falls outside a predetermined range of a second threshold value, and
  
  second means for discriminating the voiced and silent intervals of the voice signal based on a comparison of the prediction gain deviation and a third threshold value when the signal power is less than the first threshold value and the counted number of polarity inversions falls within the predetermined range of the second threshold value.
- Dependent claims (9, 10, 11, 17, 18, 19):
  - 9. The voice detection apparatus as claimed in claim 8, wherein said second means includes means for detecting a frame as a voiced interval when the prediction gain deviation is greater than or equal to the third threshold value and a previous frame is a silent interval and when the prediction gain is less than the third threshold value and the previous frame is a voiced interval, and for detecting the present frame as a silent interval when the prediction gain deviation is greater than or equal to the third threshold value and the previous frame is a voiced interval and when the prediction gain is less than the third threshold value and the previous frame is a silent interval.
  - 10. The voice detection apparatus as claimed in claim 8, wherein said prediction gain deviation calculation means includes:
    - adaptive predictor means for calculating a prediction error for each of the frames.
  - 11. The voice detection apparatus as claimed in claim 10, wherein said prediction gain deviation calculation means includes means for calculating the prediction gain based on $G = -10 \log_{10}\left[\sum E^2 / P\right]$, where G denotes the prediction gain, P denotes the signal power and E denotes the prediction error.
  - 17. The voice detection apparatus as claimed in claim 10, wherein said prediction gain deviation detection means comprises a linear prediction filter.
  - 18. The voice detection apparatus as claimed in claim 17, which further comprises:
    - linear prediction analyzer means for obtaining a prediction coefficient for use by said linear prediction filter based on the input voice signal.
  - 19. The voice detection apparatus as claimed in claim 17, which further comprises:
    - linear prediction analyzer means for analyzing data of the previous frame and for obtaining a prediction coefficient for use by said linear prediction filter based on the input voice signal.
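
The prediction gain of claim 11 and the decision rule of claims 8 and 9 can be sketched as follows. Several points are assumptions rather than statements from the claims: all function and parameter names are hypothetical, P is taken as the sum of squares over the frame so that the ratio is dimensionless, the predictor coefficients are supplied by the caller, and claim 9 is read as comparing the prediction gain deviation with the third threshold in all four cases. Claim 8 only specifies the cases where both tests agree; treating a frame as voiced when either the power test or the zero crossing test fires is one possible reading.

```python
import math


def prediction_error(frame, coeffs):
    """Residual e[n] = x[n] - sum_k coeffs[k] * x[n-1-k], with zero history."""
    errors = []
    for n, x in enumerate(frame):
        predicted = sum(a * frame[n - 1 - k]
                        for k, a in enumerate(coeffs) if n - 1 - k >= 0)
        errors.append(x - predicted)
    return errors


def prediction_gain(frame, coeffs, eps=1e-12):
    """G = -10 log10(sum(E^2) / P), with P = sum(x^2) (assumption)."""
    p = sum(x * x for x in frame)
    e = sum(err * err for err in prediction_error(frame, coeffs))
    return -10.0 * math.log10((e + eps) / (p + eps))


def classify(power, crossings, gain_dev, prev_voiced,
             power_th, zc_range, dev_th):
    """Return True for a voiced frame, False for a silent frame."""
    zc_low, zc_high = zc_range
    zc_outside = not (zc_low <= crossings <= zc_high)
    # First means (assumed reading): strong power or unusual zero crossing
    # count marks the frame as voiced.
    if power >= power_th or zc_outside:
        return True
    # Second means, claim 9: a large deviation flips the previous decision,
    # a small deviation keeps it.
    large_deviation = gain_dev >= dev_th
    return large_deviation != prev_voiced


gain = prediction_gain([1.0, 1.0, 1.0, 1.0], [1.0])
print(round(gain, 2))
print(classify(0.1, 5, 4.0, False, power_th=1.0, zc_range=(2, 8), dev_th=3.0))
```

In this reading the second means acts as a toggle: a jump in prediction gain between frames is taken as the boundary between a silent and a voiced interval.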

12. A voice detection apparatus for detecting voiced and silent intervals of an input voice signal that comprises a plurality of frames and has voiced and silent intervals, said voice detection apparatus comprising:
- prediction gain detection means for receiving the input voice signal and for detecting a prediction gain for a frame of the input voice signal;
  
  prediction gain deviation detection means for receiving the input voice signal and for detecting a prediction gain deviation between frames; and
  
  discriminating means for performing a first comparison of the prediction gain with a first threshold value and a second comparison of the prediction gain deviation with a second threshold value and for discriminating whether one of the frames of the input voice signal is a voiced interval or a silent interval based on the first and second comparisons.
- Dependent claims (13, 14, 15, 16):
  - 13. The voice detection apparatus as claimed in claim 12, wherein said discriminating means includes:
    - means for discriminating whether or not the frame of the input voice signal is a voiced interval or a silent interval based on the prediction gain and when the frame is first discriminated as a silent interval using the prediction gain deviation.
  - 14. The voice detection apparatus as claimed in claim 12, wherein said discriminating means includes means for discriminating whether or not a frame of the input voice signal is a voiced interval or a silent interval based on the prediction gain deviation when the frame is first discriminated as a silent interval using the prediction gain.
  - 15. The voice detection apparatus as claimed in claim 12, wherein the input voice signal has a signal power, and the voice detection apparatus further comprises:
    - signal power calculation means for receiving the input voice signal and for calculating the signal power of the input voice signal;
      
      zero crossing means for receiving the input signal and for counting a number of polarity inversions of the input voice signal; and
      
      said discriminating means includes means for discriminating whether or not the frame is a voiced interval or a silent interval based on the signal power and the counted number of polarity inversions when the signal power and the counted number of polarity inversions are less than or equal to corresponding third and fourth threshold values.
  - 16. The voice detection apparatus as claimed in claim 15, wherein said discriminating means includes:
    - means for discriminating whether or not the frame is a voiced interval or a silent interval only when at least one of the signal power and the number of polarity inversions is greater than the corresponding third and fourth threshold values.
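
The two comparisons of claim 12, combined in the order described in claim 14, can be sketched as a two-stage test. The names below are hypothetical, and the direction of each comparison is an assumption: a high prediction gain, or a large change in gain from the previous frame, is treated as evidence of a voiced interval. The claims state only that the comparisons are made, not which side of each threshold indicates voice.

```python
def discriminate(gain, gain_dev, gain_th, dev_th):
    """Return True for a voiced frame, False for a silent frame."""
    # First comparison: prediction gain against the first threshold.
    if gain >= gain_th:
        return True
    # Claim 14: a frame first judged silent from the gain is re-examined
    # using the prediction gain deviation.
    return gain_dev >= dev_th


def run(gains, gain_th, dev_th):
    """Classify a sequence of per-frame prediction gains."""
    decisions = []
    previous = None
    for g in gains:
        # Deviation taken as the absolute frame-to-frame difference (assumption).
        deviation = 0.0 if previous is None else abs(g - previous)
        decisions.append(discriminate(g, deviation, gain_th, dev_th))
        previous = g
    return decisions


print(run([1.0, 1.2, 6.0, 5.8, 1.0], gain_th=4.0, dev_th=3.0))
# prints [False, False, True, True, True]
```

The last frame has a low gain but is still marked voiced because the gain changed sharply from the previous frame, which shows how the second comparison can override the first.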


Current Assignee
Fujitsu Limited
Original Assignee
Fujitsu Limited
Inventors
Tomita, Yoshihiro; Abiru, Kenichi; Iseda, Kohei; Unagami, Shigeyuki
Primary Examiner(s)
KEMENY, EMANUEL

Application Number

US07/507,658
Time in Patent Office

728 Days
Field of Search

381/41-47, 381/71, 381/94, 364/513.5
US Class Current

704/249
CPC Class Codes

G10L 25/78 Detection of presence or ab...
