Speech detecting device and speech detecting method

US 6,490,554 B2
Filed: 03/28/2002
Issued: 12/03/2002
Est. Priority Date: 11/24/1999
Status: Expired due to Term

Abstract

The invention relates to a voice activity detecting device and a voice activity detecting method. An object of the invention is to adapt to the various characteristics of noise that may be superimposed on an aural signal, and thereby to discriminate reliably between an active voice segment and a non-active voice segment. For this purpose, the voice activity detecting device comprises: a speech-segment inferring section 11 for determining the probability that each of active voice frames given in order of time sequence belongs to the active voice segment, based on the statistical characteristic of the aural signal; a quality monitoring section 12 for monitoring the quality of the aural signal for each active voice frame; and a speech-segment determining section 13 for weighting the determined probability with the above quality to obtain, for each active voice frame, the accuracy that the active voice frame belongs to the active voice segment.

33 Claims

1. A voice activity detecting device comprising:
- a speech-segment inferring section for determining, for each of active voice frames as an aural signal given in order of time sequence, a probability that the active voice frame belongs to an active voice segment, the determining being made based on a statistical characteristic of the aural signal;
  
  a quality monitoring section for monitoring quality of the aural signal for each of the active voice frames; and
  
  a speech-segment determining section for determining, for each of the active voice frames as an aural signal given in order of time sequence, an accuracy that the active voice frame belongs to an active voice segment by weighting the probability determined by said speech-segment inferring section with the quality monitored by said quality monitoring section.
- View Dependent Claims (4, 7, 10, 13, 16, 19, 22, 25, 28)
- 4. The voice activity detecting device according to claim 1, wherein
- 7. The voice activity detecting device according to claim 1, wherein said quality monitoring section determines assessed noise-power for each of the active voice frames to obtain the quality of the aural signal as a monotone nonincreasing function of the assessed noise-power.
- 10. The voice activity detecting device according to claim 1, wherein said quality monitoring section determines, for each of the active voice frames, assessed noise-power and an assessed value of an SN ratio to obtain the quality of the aural signal as a monotone nonincreasing function and a monotone nondecreasing function, respectively.
- 13. The voice activity detecting device according to claim 1, wherein said quality monitoring section determines a standardized random variable for each of the active voice frames to obtain the quality of the aural signal as a monotone decreasing function of the standardized random variable.
- 16. The voice activity detecting device according to claim 1, wherein said quality monitoring section determines, for each of the active voice frames, a standardized random variable and an assessed value of an SN ratio to obtain the quality of the aural signal as a monotone nonincreasing function and a monotone nondecreasing function, respectively.
- 19. The voice activity detecting device according to claim 7, wherein said quality monitoring section determines a peak value of instantaneous values of the aural signal contained in each of the active voice frames;
  
  and calculates amplitude normalized by a standard deviation of the probability density function by applying, to a probability density function approximating to amplitude distribution of the aural signal, the number of the instantaneous values and a probability at which the peak value appears; and
  
  determines a standardized random variable as a ratio of the amplitude to the peak value.
- 22. The voice activity detecting device according to claim 10, wherein said quality monitoring section determines a peak value of instantaneous values of the aural signal contained in each of the active voice frames;
  
  and calculates amplitude normalized by a standard deviation of the probability density function by applying, to a probability density function approximating to amplitude distribution of the aural signal, the number of the instantaneous values and a probability at which the peak value appears; and
  
  determines a standardized random variable as a ratio of the amplitude to the peak value.
- 25. The voice activity detecting device according to claim 1, wherein said quality monitoring section integrates the monitored quality of the aural signal in sequence to apply the resultant as normal quality.
- 28. The voice activity detecting device according to claim 1, wherein said quality monitoring section integrates the monitored quality of the aural signal in sequence to apply as quality a value which is obtained as a monotone increasing function or a monotone nondecreasing function of the resultant.
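
Claims 7 and 10 constrain only the direction in which quality may move with noise power and SN ratio; they do not give formulas. As a minimal sketch, assuming simple rational mappings chosen here purely for illustration (the function names, `ref_power`, and the product used to combine the two measures are all hypothetical and not taken from the patent):

```python
def quality_from_noise_power(noise_power: float, ref_power: float = 1.0) -> float:
    """Illustrative quality that is monotone nonincreasing in noise power.

    Maps noise_power in [0, inf) to (0, 1]. ref_power is an assumed scale.
    """
    if noise_power < 0.0 or ref_power <= 0.0:
        raise ValueError("noise_power must be >= 0 and ref_power must be > 0")
    return 1.0 / (1.0 + noise_power / ref_power)


def quality_from_snr(snr_linear: float) -> float:
    """Illustrative quality that is monotone nondecreasing in a linear SN ratio."""
    if snr_linear < 0.0:
        raise ValueError("snr_linear must be >= 0")
    return snr_linear / (1.0 + snr_linear)


def combined_quality(noise_power: float, snr_linear: float,
                     ref_power: float = 1.0) -> float:
    """Assumed combination: a product of two nonnegative factors keeps
    each factor's monotonic direction."""
    return quality_from_noise_power(noise_power, ref_power) * quality_from_snr(snr_linear)
```

Any other pair of functions with the same monotonic behaviour would fit the claim language equally well.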

2. A voice activity detecting device comprising:
- a speech-segment determining section for determining, for each of active voice frames as an aural signal given in order of time sequence, an accuracy that the active voice frame belongs to an active voice segment, the determining being made based on a statistical characteristic of the aural signal; and
  
  a quality monitoring section for monitoring quality of the aural signal for each of the active voice frames, and wherein said speech-segment determining section weights a sequence of instantaneous values of the aural signal contained in each of the active voice frames by a weighting given as a monotone decreasing function or a monotone nonincreasing function of the quality monitored by said quality monitoring section.
- View Dependent Claims (5, 8, 11, 14, 17, 20, 23, 26, 29)
- 5. The voice activity detecting device according to claim 2, wherein
- 8. The voice activity detecting device according to claim 2, wherein said quality monitoring section determines assessed noise-power for each of the active voice frames to obtain the quality of the aural signal as a monotone nonincreasing function of the assessed noise-power.
- 11. The voice activity detecting device according to claim 2, wherein said quality monitoring section determines, for each of the active voice frames, assessed noise-power and an assessed value of an SN ratio to obtain the quality of the aural signal as a monotone nonincreasing function and a monotone nondecreasing function, respectively.
- 14. The voice activity detecting device according to claim 2, wherein said quality monitoring section determines a standardized random variable for each of the active voice frames to obtain the quality of the aural signal as a monotone decreasing function of the standardized random variable.
- 17. The voice activity detecting device according to claim 2, wherein said quality monitoring section determines, for each of the active voice frames, a standardized random variable and an assessed value of an SN ratio to obtain the quality of the aural signal as a monotone nonincreasing function and a monotone nondecreasing function, respectively.
- 20. The voice activity detecting device according to claim 8, wherein said quality monitoring section determines a peak value of instantaneous values of the aural signal contained in each of the active voice frames;
  
  and calculates amplitude normalized by a standard deviation of the probability density function by applying, to a probability density function approximating to amplitude distribution of the aural signal, the number of the instantaneous values and a probability at which the peak value appears; and
  
  determines a standardized random variable as a ratio of the amplitude to the peak value.
- 23. The voice activity detecting device according to claim 11, wherein said quality monitoring section determines a peak value of instantaneous values of the aural signal contained in each of the active voice frames;
  
  and calculates amplitude normalized by a standard deviation of the probability density function by applying, to a probability density function approximating to amplitude distribution of the aural signal, the number of the instantaneous values and a probability at which the peak value appears; and
  
  determines a standardized random variable as a ratio of the amplitude to the peak value.
- 26. The voice activity detecting device according to claim 2, wherein said quality monitoring section integrates the monitored quality of the aural signal in sequence to apply the resultant as normal quality.
- 29. The voice activity detecting device according to claim 2, wherein said quality monitoring section integrates the monitored quality of the aural signal in sequence to apply as quality a value which is obtained as a monotone increasing function or a monotone nondecreasing function of the resultant.
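
The peak-value claims (19 to 24) describe the standardized random variable only in words, so the following is one possible reading rather than the patent's method. It assumes a zero-mean Gaussian as the approximating probability density function, assumes the frame peak is the magnitude exceeded with probability 1/N for a frame of N samples, and uses a hypothetical decreasing map for the quality. All function names are illustrative.

```python
from statistics import NormalDist
from typing import Sequence


def normalized_peak_amplitude(num_samples: int) -> float:
    """Amplitude, in units of the standard deviation, whose magnitude is
    exceeded with probability 1/num_samples under a standard Gaussian
    (an assumed model for the amplitude distribution)."""
    if num_samples < 2:
        raise ValueError("need at least two samples per frame")
    tail = 1.0 / num_samples
    # Two-sided tail: P(|X| > a) = tail  <=>  a = inv_cdf(1 - tail / 2)
    return NormalDist().inv_cdf(1.0 - tail / 2.0)


def standardized_random_variable(frame: Sequence[float]) -> float:
    """Ratio of the normalized amplitude to the observed peak value."""
    peak = max(abs(sample) for sample in frame)
    if peak == 0.0:
        raise ValueError("frame is all zeros; the ratio is undefined")
    return normalized_peak_amplitude(len(frame)) / peak


def quality_from_standardized_variable(z: float) -> float:
    """Hypothetical quality, strictly decreasing in z for z >= 0."""
    return 1.0 / (1.0 + z)
```

A different assumed density (for example a Laplacian) would change only `normalized_peak_amplitude`.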

3. A voice activity detecting device comprising:
- a speech-segment determining section for determining an accuracy that individual active voice frames belong to an active voice segment by performing companding processing for each of the active voice frames given in order of time sequence and by analyzing, based on a statistical characteristic of an aural signal, a sequence of instantaneous values of the aural signal obtained in the companding processing; and
  
  a quality monitoring section for monitoring quality of the aural signal for each of the active voice frames, and wherein said speech-segment determining section applies a companding characteristic to the companding processing for each of the active voice frames, the companding characteristic being given as a monotone decreasing function of the quality monitored by said quality monitoring section.
- View Dependent Claims (6, 9, 12, 15, 18, 21, 24, 27, 30)
- 6. The voice activity detecting device according to claim 3, wherein
- 9. The voice activity detecting device according to claim 3, wherein said quality monitoring section determines assessed noise-power for each of the active voice frames to obtain the quality of the aural signal as a monotone nonincreasing function of the assessed noise-power.
- 12. The voice activity detecting device according to claim 3, wherein said quality monitoring section determines, for each of the active voice frames, assessed noise-power and an assessed value of an SN ratio to obtain the quality of the aural signal as a monotone nonincreasing function and a monotone nondecreasing function, respectively.
- 15. The voice activity detecting device according to claim 3, wherein said quality monitoring section determines a standardized random variable for each of the active voice frames to obtain the quality of the aural signal as a monotone decreasing function of the standardized random variable.
- 18. The voice activity detecting device according to claim 3, wherein said quality monitoring section determines, for each of the active voice frames, a standardized random variable and an assessed value of an SN ratio to obtain the quality of the aural signal as a monotone nonincreasing function and a monotone nondecreasing function, respectively.
- 21. The voice activity detecting device according to claim 9, wherein said quality monitoring section determines a peak value of instantaneous values of the aural signal contained in each of the active voice frames;
  
  and calculates amplitude normalized by a standard deviation of the probability density function by applying, to a probability density function approximating to amplitude distribution of the aural signal, the number of the instantaneous values and a probability at which the peak value appears; and
  
  determines a standardized random variable as a ratio of the amplitude to the peak value.
- 24. The voice activity detecting device according to claim 12, wherein said quality monitoring section determines a peak value of instantaneous values of the aural signal contained in each of the active voice frames;
  
  and calculates amplitude normalized by a standard deviation of the probability density function by applying, to a probability density function approximating to amplitude distribution of the aural signal, the number of the instantaneous values and a probability at which the peak value appears; and
  
  determines a standardized random variable as a ratio of the amplitude to the peak value.
- 27. The voice activity detecting device according to claim 3, wherein said quality monitoring section integrates the monitored quality of the aural signal in sequence to apply the resultant as normal quality.
- 30. The voice activity detecting device according to claim 3, wherein said quality monitoring section integrates the monitored quality of the aural signal in sequence to apply as quality a value which is obtained as a monotone increasing function or a monotone nondecreasing function of the resultant.
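
The integration claims (25 to 30) say only that the per-frame quality is integrated in sequence, optionally followed by a monotone nondecreasing function. A minimal sketch, assuming a leaky integrator (exponential smoothing) as the integrator and a clamp to [0, 1] as the follow-up function; `alpha` and both function names are hypothetical:

```python
from typing import Iterable, List


def integrate_quality(qualities: Iterable[float], alpha: float = 0.9) -> List[float]:
    """Smooth a per-frame quality sequence with a leaky integrator.

    alpha in [0, 1) is an assumed forgetting factor; the first frame
    initialises the integrator state.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    smoothed: List[float] = []
    state = None
    for q in qualities:
        state = q if state is None else alpha * state + (1.0 - alpha) * q
        smoothed.append(state)
    return smoothed


def clamp_quality(value: float) -> float:
    """A monotone nondecreasing post-map: clamp the integrated value to [0, 1]."""
    return min(1.0, max(0.0, value))
```

Smoothing of this kind keeps a single noisy frame from swinging the quality estimate, at the cost of slower reaction to real changes.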

31. A voice activity detecting method comprising the steps of:
- determining, for each of active voice frames as an aural signal given in order of time sequence, a probability that the active voice frame belongs to an active voice segment, the determining being made based on a statistical characteristic of the aural signal;
  
  monitoring quality of the aural signal for each of the active voice frames; and
  
  determining, for each of the active voice frames as an aural signal given in order of time sequence, an accuracy that the active voice frame belongs to an active voice segment by weighting the determined probability with the monitored quality.
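
The three steps of claim 31 can be sketched as a per-frame loop. This is an illustration under stated assumptions, not the patented implementation: the probability and quality estimators are passed in as hypothetical callables, and "weighting" is read as a plain product of the two values.

```python
from typing import Callable, Iterable, List, Sequence

Frame = Sequence[float]


def speech_accuracy(
    frames: Iterable[Frame],
    infer_probability: Callable[[Frame], float],
    monitor_quality: Callable[[Frame], float],
) -> List[float]:
    """Return, for each frame in time order, the probability weighted by quality."""
    accuracies: List[float] = []
    for frame in frames:
        probability = infer_probability(frame)   # step 1: statistical inference
        quality = monitor_quality(frame)         # step 2: quality monitoring
        accuracies.append(quality * probability)  # step 3: assumed multiplicative weighting
    return accuracies


# Usage with stand-in estimators (constants chosen only for illustration).
frames = [[0.0, 0.1], [0.4, -0.6]]
print(speech_accuracy(frames, lambda f: 0.8, lambda f: 0.5))
```

Passing the estimators as parameters keeps the weighting step independent of how probability and quality are actually computed.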

32. A voice activity detecting method comprising the steps of:
- determining, for each of the active voice frames as an aural signal given in order of time sequence, an accuracy that the active voice frame belongs to an active voice segment, the determining being made based on a statistical characteristic of the aural signal;
  
  monitoring quality of the aural signals for each of the active voice frames; and
  
  weighting a sequence of instantaneous values of the aural signal contained in each of the active voice frames by a weighting given as a monotone decreasing function or a monotone nonincreasing function of the monitored quality.
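
In claim 32 the weighting is applied to the samples themselves, and the weight falls as quality rises. A minimal sketch, assuming quality is normalised to [0, 1] and using a hypothetical linear weight `1 - 0.5 * quality` (the claim fixes only the monotonic direction, not the formula):

```python
from typing import List, Sequence


def sample_weight(quality: float) -> float:
    """Hypothetical weight, monotone decreasing in quality over [0, 1]."""
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality is assumed to lie in [0, 1]")
    return 1.0 - 0.5 * quality


def weight_frame(frame: Sequence[float], quality: float) -> List[float]:
    """Scale every instantaneous value in the frame by the quality-dependent weight."""
    w = sample_weight(quality)
    return [w * sample for sample in frame]
```

The weighted frame would then be passed to whatever statistical analysis produces the accuracy.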

33. A voice activity detecting method comprising the steps of:
- determining an accuracy that individual active voice frames belong to an active voice segment by performing companding processing for each of the active voice frames as an aural signal given in order of time sequence and by analyzing a sequence of instantaneous values of an aural signal obtained in the companding processing, the determining being made based on a statistical characteristic of the aural signal;
  
  monitoring quality of the aural signal for each of the active voice frames; and
  
  applying a companding characteristic to the companding processing for each of the active voice frames, the companding characteristic being given as a monotone decreasing function of the monitored quality.
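
Claim 33 does not name a companding law. As a sketch only, the example below assumes mu-law compression and treats the parameter mu as the "companding characteristic", made a decreasing function of a quality normalised to [0, 1]. The names `mu_for_quality` and `compand`, and the range 1 to 255 for mu, are hypothetical.

```python
import math


def mu_for_quality(quality: float, mu_min: float = 1.0, mu_max: float = 255.0) -> float:
    """Assumed characteristic: mu decreases linearly as quality rises from 0 to 1."""
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality is assumed to lie in [0, 1]")
    return mu_min + (mu_max - mu_min) * (1.0 - quality)


def compand(sample: float, quality: float) -> float:
    """Mu-law compress one sample in [-1, 1] with a quality-dependent mu."""
    if not -1.0 <= sample <= 1.0:
        raise ValueError("sample is assumed to be normalised to [-1, 1]")
    mu = mu_for_quality(quality)
    return math.copysign(math.log1p(mu * abs(sample)) / math.log1p(mu), sample)
```

Under this reading, low-quality frames are compressed more strongly, which boosts small amplitudes relative to large ones before the statistical analysis.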

Current Assignee: Fujitsu Connected Technologies Limited
Original Assignee: Fujitsu Limited
Inventors: Ota, Yasuji; Endo, Kaori
Primary Examiner(s): Banks-Harold, Marsha D.
Assistant Examiner(s): Lerner, Martin
Application Number: US10/112,470
Publication Number: US 20020138255A1
Time in Patent Office: 250 Days
Field of Search: 704/206, 704/208, 704/210, 704/213, 704/214, 704/215, 704/226, 704/228
US Class Current: 704/215
CPC Class Codes:
G10L 25/69 for evaluating synthetic or...
G10L 25/78 Detection of presence or ab...
