Unvoiced/voiced decision for speech processing

US 10,347,275 B2
Filed: 07/19/2018
Issued: 07/09/2019
Est. Priority Date: 09/09/2013
Status: Active Grant

Abstract

A method for speech processing includes determining a first unvoicing parameter for a first subframe of a speech signal, and determining a smoothed unvoicing parameter for the first subframe according to a second unvoicing parameter of a second subframe prior to the first subframe of the speech signal. The first unvoicing parameter is determined according to a periodicity parameter and a spectral tilt parameter. The method further includes computing a difference between the first unvoicing parameter for the first subframe and the smoothed unvoicing parameter for the first subframe and determining a classification of the first subframe using the computed difference as a decision parameter. The classification indicates whether the first subframe is an unvoiced speech signal or not an unvoiced speech signal. Bandwidth extension is performed on the speech signal for the first subframe according to the classification of the first subframe.
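
The quantities named in the abstract can be written compactly as below. The product form is taken from claim 1 and the weighted-sum form of the smoothing from claim 5. The symbols P_c, \bar{P}_c, P_diff, the weight \alpha, and the subframe index k are notation assumed here for illustration and do not appear in the source.

```latex
\begin{aligned}
P_c(k) &= \bigl(1 - P_{\text{voicing}}(k)\bigr)\,\bigl(1 - P_{\text{tilt}}(k)\bigr) \\
\bar{P}_c(k) &= \alpha\,\bar{P}_c(k-1) + (1 - \alpha)\,P_c(k) \\
P_{\text{diff}}(k) &= P_c(k) - \bar{P}_c(k)
\end{aligned}
```

Here k is the first (current) subframe and k − 1 is the prior second subframe. P_diff(k) is the decision parameter used for the classification.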

18 Claims

1. A method for speech processing, the method comprising:
- determining by a processor, a first unvoicing parameter for a first subframe of a speech signal, wherein the first unvoicing parameter is determined using a product of (1 − P_voicing) and (1 − P_tilt), wherein P_voicing is a periodicity parameter and P_tilt is a spectral tilt parameter;
- determining by a processor a smoothed first unvoicing parameter for the first subframe according to a smoothed second unvoicing parameter for a second subframe prior to the first subframe of the speech signal;
- computing a difference between the first unvoicing parameter for the first subframe and the smoothed first unvoicing parameter for the first subframe;
- determining a classification of the first subframe using the computed difference as a decision parameter, the classification indicating whether the first subframe is an unvoiced speech signal or not an unvoiced speech signal; and
- performing bandwidth extension on the speech signal for the first subframe, wherein a parameter for performing the bandwidth extension when the classification indicates the first subframe is an unvoiced speech signal is different from a parameter for performing the bandwidth extension when the classification indicates the first subframe is not an unvoiced speech signal.
- Dependent claims (2, 3, 4, 5, 6, 7):
  - 2. The method of claim 1, wherein the second subframe is adjacent to the first subframe.
  - 3. The method of claim 1, wherein determining the classification of the first subframe comprises determining a classification of the first subframe by comparing the computed difference with a threshold.
  - 4. The method of claim 1, wherein when the computed difference is greater than 0.1, the first subframe is classified as an unvoiced speech signal;
    - wherein when the computed difference is less than 0.05, the first subframe is classified as not an unvoiced speech signal; and
    - wherein when the computed difference is not less than 0.05 and not greater than 0.1, the classification of the first subframe is the same as the second subframe.
  - 5. The method of claim 1, wherein the smoothed first unvoicing parameter for the first subframe is a weighted sum of the first unvoicing parameter for the first subframe and the smoothed second unvoicing parameter for the second subframe.
  - 6. The method of claim 5, wherein a weighting factor of the smoothed second unvoicing parameter for the second subframe is 0.9, and a weighting factor of the first unvoicing parameter for the first subframe is 0.1 when the smoothed second unvoicing parameter for the second subframe is greater than the first unvoicing parameter for the first subframe;
    - and wherein the weighting factor of the smoothed second unvoicing parameter for the second subframe is 0.99, and the weighting factor of the first unvoicing parameter for the first subframe is 0.01 when the smoothed second unvoicing parameter for the second subframe is not greater than the first unvoicing parameter for the first subframe.
  - 7. The method of claim 1, wherein performing bandwidth extension comprises:
    - controlling an energy of the first subframe in accordance with the classification of the first subframe.
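
The decision logic recited in claims 1 and 4 to 6 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation. The class name `UnvoicedClassifier`, the method name `update`, and the zero/`False` initial state are assumptions made for the example. How P_voicing and P_tilt are measured is not stated in the claims, so they are taken here as given inputs.

```python
from dataclasses import dataclass


@dataclass
class UnvoicedClassifier:
    """Per-subframe unvoiced / not-unvoiced decision (illustrative sketch)."""

    # State carried over from the previous (second) subframe.
    # Both initial values are assumptions; the claims do not specify them.
    smoothed: float = 0.0
    unvoiced: bool = False

    def update(self, p_voicing: float, p_tilt: float) -> bool:
        # Claim 1: unvoicing parameter as a product of (1 - P_voicing) and (1 - P_tilt).
        p_c = (1.0 - p_voicing) * (1.0 - p_tilt)

        # Claim 6: weight of the previous smoothed value is 0.9 when it exceeds
        # the current parameter, and 0.99 otherwise.
        w_prev = 0.9 if self.smoothed > p_c else 0.99

        # Claim 5: smoothed value is a weighted sum of current and previous smoothed.
        self.smoothed = w_prev * self.smoothed + (1.0 - w_prev) * p_c

        # Claim 1: the difference is the decision parameter.
        diff = p_c - self.smoothed

        # Claim 4: two thresholds; between them the previous decision is kept.
        if diff > 0.1:
            self.unvoiced = True
        elif diff < 0.05:
            self.unvoiced = False
        return self.unvoiced


# Usage: (P_voicing, P_tilt) pairs for four consecutive subframes (made-up values).
clf = UnvoicedClassifier()
decisions = [clf.update(pv, pt) for pv, pt in [(0.9, 0.8), (0.1, 0.0), (0.6, 0.8), (0.9, 0.8)]]
print(decisions)  # [False, True, True, False]
```

In the third subframe the difference falls between 0.05 and 0.1, so the previous decision is retained, which keeps the classification from toggling on small fluctuations.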

8. An audio access device comprising a network interface and a CODEC with a decoder, wherein the decoder receives an encoded audio signal via the network interface, and is configured to:
- determine a first unvoicing parameter for a first subframe of a speech signal, wherein the first unvoicing parameter is determined using a product of (1 − P_voicing) and (1 − P_tilt), wherein P_voicing is a periodicity parameter and P_tilt is a spectral tilt parameter;
- determine a smoothed first unvoicing parameter for the first subframe according to a smoothed second unvoicing parameter for a second subframe prior to the first subframe of the speech signal;
- compute a difference between the first unvoicing parameter for the first subframe and the smoothed first unvoicing parameter for the first subframe;
- determine a classification of the first subframe using the computed difference as a decision parameter, the classification indicates whether the first subframe is an unvoiced speech signal or not an unvoiced speech signal; and
- perform bandwidth extension on the speech signal, wherein a parameter for performing the bandwidth extension when the classification indicates the first subframe is an unvoiced speech signal is different from a parameter for performing the bandwidth extension when the classification indicates the first subframe is not an unvoiced speech signal.
- Dependent claims (9, 10):
  - 9. The audio access device of claim 8, wherein the decoder is a digital signal processor.
  - 10. The audio access device of claim 8, wherein the CODEC is implemented by software running on a processor.

11. A speech processing apparatus comprising:
- a processor; and
- a non-transitory computer-readable storage medium storing computer instructions, that when executed by the processor, cause the processor to:
  - determine a first unvoicing parameter for a first subframe of a speech signal, wherein the first unvoicing parameter is determined using a product of (1 − P_voicing) and (1 − P_tilt), wherein P_voicing is a periodicity parameter and P_tilt is a spectral tilt parameter;
  - determine a smoothed first unvoicing parameter for the first subframe according to a smoothed second unvoicing parameter for a second subframe prior to the first subframe of the speech signal;
  - compute a difference between the first unvoicing parameter for the first subframe and the smoothed first unvoicing parameter for the first subframe;
  - determine a classification of the first subframe using the computed difference as a decision parameter, the classification indicates whether the first subframe is an unvoiced speech signal or not an unvoiced speech signal; and
  - perform bandwidth extension on the speech signal for the first subframe, wherein a parameter for performing the bandwidth extension when the classification indicates the first subframe is an unvoiced speech signal is different from a parameter for performing the bandwidth extension when the classification indicates the first subframe is not an unvoiced speech signal.
- Dependent claims (12, 13, 14, 15, 16):
  - 12. The apparatus of claim 11, wherein the second subframe is adjacent to the first subframe.
  - 13. The apparatus of claim 11, wherein when the computed difference is greater than 0.1, the first subframe is classified as an unvoiced speech signal;
    - wherein when the computed difference is less than 0.05, the first subframe is classified as not an unvoiced speech signal; and
    - wherein when the computed difference is not less than 0.05 and not greater than 0.1, the classification of the first subframe is the same as the second subframe.
  - 14. The apparatus of claim 11, wherein the smoothed first unvoicing parameter for the first subframe is a weighted sum of the first unvoicing parameter for the first subframe and the smoothed second unvoicing parameter for the second subframe.
  - 15. The apparatus of claim 14, wherein a weighting factor of the smoothed second unvoicing parameter for the second subframe is 0.9, and a weighting factor of the first unvoicing parameter for the first subframe is 0.1 when the smoothed second unvoicing parameter for the second subframe is greater than the first unvoicing parameter for the first subframe;
    - and wherein the weighting factor of the smoothed second unvoicing parameter for the second subframe is 0.99, and the weighting factor of the first unvoicing parameter for the first subframe is 0.01 when the smoothed second unvoicing parameter for the second subframe is not greater than the first unvoicing parameter for the first subframe.
  - 16. The apparatus of claim 11, wherein the computer instructions, when executed by the processor, further cause the processor to:
    - control an energy of the first subframe in accordance with the classification of the first subframe.
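
Claims 7 and 16 recite controlling the energy of the subframe according to its classification during bandwidth extension. One way to picture this is a class-dependent gain applied to the extended high-band samples. The sketch below is only an illustration under stated assumptions: the function name `scale_high_band` and both gain values are hypothetical, and the claims do not say which parameter differs between the two classes or by how much.

```python
from typing import List


def scale_high_band(
    high_band: List[float],
    unvoiced: bool,
    unvoiced_gain: float = 1.0,  # hypothetical value, not from the source
    voiced_gain: float = 0.5,    # hypothetical value, not from the source
) -> List[float]:
    """Scale extended high-band samples with a gain chosen by classification."""
    # The only property taken from the claims is that the parameter used for an
    # unvoiced subframe differs from the one used for a not-unvoiced subframe.
    gain = unvoiced_gain if unvoiced else voiced_gain
    return [gain * sample for sample in high_band]


# Usage: the same high-band samples scaled under each classification.
samples = [1.0, -2.0, 0.5]
print(scale_high_band(samples, unvoiced=True))
print(scale_high_band(samples, unvoiced=False))
```

The gains are exposed as arguments so that any class-dependent value can be substituted without changing the selection logic.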

17. A non-transitory computer readable storage medium storing instructions which, when executed by a processor, cause the processor to perform the steps of:
- determining a first unvoicing parameter for a first subframe of a speech signal, wherein the first unvoicing parameter is determined using a product of (1 − P_voicing) and (1 − P_tilt), wherein P_voicing is a periodicity parameter and P_tilt is a spectral tilt parameter;
- determining a smoothed first unvoicing parameter for the first subframe according to a second smoothed unvoicing parameter for a second subframe prior to the first subframe of the speech signal;
- computing a difference between the first unvoicing parameter for the first subframe and the smoothed first unvoicing parameter for the first subframe;
- determining a classification of the first subframe using the computed difference as a decision parameter, the classification indicating whether the first subframe is an unvoiced speech signal or not an unvoiced speech signal; and
- performing bandwidth extension on the speech signal for the first subframe, wherein a parameter for performing the bandwidth extension when the classification indicates the first subframe is an unvoiced speech signal is different from a parameter for performing the bandwidth extension when the classification indicates the first subframe is not an unvoiced speech signal.
- Dependent claims (18):
  - 18. The computer readable storage medium of claim 17, wherein the smoothed first unvoicing parameter for the first subframe is a weighted sum of the first unvoicing parameter for the first subframe and the smoothed second unvoicing parameter for the second subframe.

Current Assignee: Huawei Technologies Co., Ltd. (Huawei Investment & Holding Co., Ltd.)
Original Assignee: Huawei Technologies Co., Ltd. (Huawei Investment & Holding Co., Ltd.)
Inventors: Gao, Yang
Primary Examiner(s): Desir, Pierre Louis
Assistant Examiner(s): Kim, Jonathan C
Application Number: US16/040,225
Publication Number: US 20180322895A1
Time in Patent Office: 355 Days
Field of Search: None
US Class Current
CPC Class Codes

G10L 19/22   Mode decision, i.e. based o...

G10L 25/78   Detection of presence or ab...

G10L 25/93   Discriminating between voic...
