Unvoiced/voiced decision for speech processing

US 10,043,539 B2
Filed: 12/27/2016
Issued: 08/07/2018
Est. Priority Date: 09/09/2013
Status: Active Grant

Abstract

A method for speech processing includes determining an unvoicing parameter for a first frame of a speech signal and determining a smoothed unvoicing parameter for the first frame by weighting the unvoicing parameter of the first frame and a smoothed unvoicing parameter of a second frame. The unvoicing parameter reflects a speech characteristic of the first frame. The smoothed unvoicing parameter of the second frame is weighted less heavily when the smoothed unvoicing parameter of the second frame is greater than the unvoicing parameter of the first frame. The method further includes computing a difference, by a processor, between the unvoicing parameter of the first frame and the smoothed unvoicing parameter of the first frame, and determining a classification of the first frame according to the computed difference. The classification includes unvoiced speech or voiced speech. The first frame is processed in accordance with the classification of the first frame.
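
The recursion described in the abstract can be written compactly. The following is a sketch only: the symbols $P(n)$ (unvoicing parameter of frame $n$), $\bar{P}(n)$ (smoothed unvoicing parameter), $\alpha(n)$ (weight on the previous smoothed value) and $d(n)$ (computed difference) are notation introduced here, not taken from the patent. It assumes the second frame is the frame immediately preceding the first frame, that the two weights sum to one, and that the difference is taken as parameter minus smoothed parameter.

```latex
\begin{align*}
\alpha(n) &=
  \begin{cases}
    \alpha_{\mathrm{fall}}, & \bar{P}(n-1) > P(n) \\
    \alpha_{\mathrm{rise}}, & \text{otherwise}
  \end{cases}
  \qquad \text{with } \alpha_{\mathrm{fall}} < \alpha_{\mathrm{rise}} \\
\bar{P}(n) &= \alpha(n)\,\bar{P}(n-1) + \bigl(1-\alpha(n)\bigr)\,P(n) \\
d(n) &= P(n) - \bar{P}(n)
\end{align*}
```

Under these assumptions the smoothed value follows a falling parameter faster than a rising one, so a sudden rise in $P(n)$ produces a large positive $d(n)$.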

21 Claims

1. A method for speech processing, the method comprising:
- determining, by a processor, an unvoicing parameter for a first frame of a speech signal, wherein the unvoicing parameter reflects a speech characteristic of the first frame;
  
  determining, by a processor, a smoothed unvoicing parameter for the first frame by weighting the unvoicing parameter for the first frame and a smoothed unvoicing parameter for a second frame, when the smoothed unvoicing parameter for the second frame is greater than the unvoicing parameter for the first frame, the smoothed unvoicing parameter for the second frame is weighted less heavily than the case when the smoothed unvoicing parameter for the second frame is not greater than the unvoicing parameter for the first frame;
  
  computing a difference, by the processor, between the unvoicing parameter for the first frame and the smoothed unvoicing parameter for the first frame;
  
  determining a classification of the first frame according to the computed difference, wherein the classification indicates whether the first frame is an unvoiced speech signal or not;
  
  processing the first frame by the processor in accordance with the classification of the first frame; and
  
  outputting a synthesized speech signal according to the processing of the first frame.
- - 2. The method of claim 1, wherein the unvoicing parameter for the first frame is a combined parameter reflecting at least two characteristics of unvoiced speech in the first frame.
  - 3. The method of claim 2, wherein the combined parameter is computed from a periodicity parameter and a spectral tilt parameter.
  - 4. The method of claim 2, wherein the at least two characteristics of unvoiced speech comprise a signal periodicity characteristic and a spectral tilt characteristic.
  - 5. The method of claim 1, wherein the second frame is previous to the first frame.
  - 6. The method of claim 5, wherein determining a classification of the first frame according to the computed difference comprises:
    - when the computed difference is greater than 0.1, the first frame is classified as an unvoiced speech signal;
      
      or when the computed difference is less than 0.05, the first frame is classified as not an unvoiced speech signal.
  - 7. The method of claim 6, wherein the classification of the first frame is the same as a previous frame of the first frame when the computed difference is not less than 0.05 and not greater than 0.1.
  - 8. The method of claim 1, wherein a weighting factor of the smoothed unvoicing parameter for the second frame is 0.9, and a weighting factor of the unvoicing parameter for the first frame is 0.1 when the smoothed unvoicing parameter for the second frame is greater than the unvoicing parameter for the first frame;
    - or wherein the weighting factor of the smoothed unvoicing parameter for the second frame is 0.99, and the weighting factor of the unvoicing parameter for the first frame is 0.01 when the smoothed unvoicing parameter for the second frame is not greater than the unvoicing parameter for the first frame.
  - 9. The method of claim 1, wherein the first frame and the second frame are frames or subframes of the speech signal.
  - 10. The method of claim 1, wherein processing the first frame in accordance with the classification of the first frame comprises:
    - processing the first frame with a first excitation when the classification of the first frame is the unvoiced speech;
      
      or processing the first frame with a second excitation when the classification of the first frame is not the unvoiced speech.
  - 11. The method of claim 10, wherein the first excitation is scaled by a first gain, and the second excitation is scaled by a second gain.
  - 12. A non-transitory computer readable storage medium storing instructions which, when executed by a processor, cause the processor to perform the steps of claim 1.
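
The numeric values recited in claims 6 to 8 can be combined into a short per-frame routine. The following is a minimal sketch, not the patented implementation: the names `UnvoicedState` and `classify_frame`, the initial state values, and the sign convention of the difference (parameter minus smoothed parameter) are assumptions made for illustration. It also assumes, per claim 5, that the second frame is the previous frame.

```python
from dataclasses import dataclass

# Values recited in claims 6-8; the constant names are illustrative only.
W_PREV_FALLING = 0.9    # weight on previous smoothed value when it exceeds the current parameter
W_PREV_RISING = 0.99    # weight on previous smoothed value otherwise
T_UNVOICED = 0.1        # difference above this: classified as unvoiced
T_NOT_UNVOICED = 0.05   # difference below this: classified as not unvoiced


@dataclass
class UnvoicedState:
    """State carried from frame to frame (initial values are assumptions)."""
    smoothed: float = 0.0
    unvoiced: bool = False


def classify_frame(p: float, state: UnvoicedState) -> bool:
    """Update the state with unvoicing parameter p and return the unvoiced flag."""
    # Claim 8: the previous smoothed value is weighted less heavily (0.9)
    # when it is greater than the current parameter, else 0.99.
    w_prev = W_PREV_FALLING if state.smoothed > p else W_PREV_RISING
    state.smoothed = w_prev * state.smoothed + (1.0 - w_prev) * p

    # Assumed sign convention: current parameter minus smoothed parameter.
    diff = p - state.smoothed

    # Claims 6-7: two thresholds with hysteresis between them.
    if diff > T_UNVOICED:
        state.unvoiced = True
    elif diff < T_NOT_UNVOICED:
        state.unvoiced = False
    # Otherwise (0.05 <= diff <= 0.1) keep the previous frame's classification.
    return state.unvoiced
```

Calling `classify_frame` once per frame with the same `UnvoicedState` object carries both the smoothed parameter and the previous decision forward, which is what the hysteresis band in claim 7 requires.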

13. A speech processing apparatus comprising:
- a processor; and
  
  a non-transitory computer-readable storage medium storing computer instructions that, when executed by the processor, cause the processor to:
  
  determine an unvoicing parameter for a first frame of a speech signal, wherein the unvoicing parameter reflects a speech characteristic of the first frame;
  
  determine a smoothed unvoicing parameter for the first frame, wherein the smoothed unvoicing parameter is a weighted sum of the unvoicing parameter for the first frame and a smoothed unvoicing parameter for a second frame, and when the smoothed unvoicing parameter for the second frame is greater than the unvoicing parameter for the first frame, the smoothed unvoicing parameter for the second frame is weighted less heavily than the case when the smoothed unvoicing parameter for the second frame is not greater than the unvoicing parameter for the first frame;
  
  compute a difference between the unvoicing parameter for the first frame and the smoothed unvoicing parameter for the first frame;
  
  determine a classification of the first frame according to the computed difference, wherein the classification indicates whether the first frame is an unvoiced speech signal or not;
  
  process the first frame in accordance with the classification of the first frame; and
  
  output a synthesized speech signal according to the processing of the first frame.
- - 14. The apparatus of claim 13, wherein the unvoicing parameter for the first frame is a combined parameter reflecting a product of a periodicity parameter and a spectral tilt parameter.
  - 15. The apparatus of claim 13, wherein the second frame is previous to the first frame.
  - 16. The apparatus of claim 15, wherein the first frame is classified as an unvoiced speech signal when the computed difference is greater than 0.1;
    - or the first frame is classified as not an unvoiced speech signal when the computed difference is less than 0.05.
  - 17. The apparatus of claim 16, wherein when the computed difference is not less than 0.05 and not greater than 0.1, the classification of the first frame is the same as a previous frame of the first frame.
  - 18. The apparatus of claim 13, wherein a weighting factor of the smoothed unvoicing parameter for the second frame is 0.9, and a weighting factor of the unvoicing parameter for the first frame is 0.1 when the smoothed unvoicing parameter for the second frame is greater than the unvoicing parameter for the first frame;
    - or wherein the weighting factor of the smoothed unvoicing parameter for the second frame is 0.99, and the weighting factor of the unvoicing parameter for the first frame is 0.01 when the smoothed unvoicing parameter for the second frame is not greater than the unvoicing parameter for the first frame.
  - 19. The apparatus of claim 13, wherein the first frame and the second frame are frames or subframes of the speech signal.
  - 20. The apparatus of claim 13, wherein the processor is configured to:
    - process the first frame with a first excitation when the classification of the first frame is the unvoiced speech;
      
      or process the first frame with a second excitation when the classification of the first frame is not the unvoiced speech.
  - 21. The apparatus of claim 20, wherein the first excitation is scaled by a first gain, and the second excitation is scaled by a second gain.
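
Claims 14, 20 and 21 describe two further pieces: a combined parameter formed as a product, and a choice between two gain-scaled excitations. The sketch below illustrates only that structure. The function names are hypothetical, and the claims listed here do not say how the periodicity and spectral tilt parameters are computed or what the two excitations are, so both are simply passed in as inputs.

```python
from typing import List, Sequence


def combined_unvoicing(periodicity_param: float, tilt_param: float) -> float:
    """Claim 14: combined parameter as a product of the two input parameters.

    How each input is derived from the frame is not specified in the claims
    shown here; callers supply precomputed values (an assumption).
    """
    return periodicity_param * tilt_param


def select_excitation(
    is_unvoiced: bool,
    first_excitation: Sequence[float],
    second_excitation: Sequence[float],
    first_gain: float,
    second_gain: float,
) -> List[float]:
    """Claims 20-21: pick the excitation by classification and scale it by its gain."""
    if is_unvoiced:
        return [first_gain * x for x in first_excitation]
    return [second_gain * x for x in second_excitation]
```

For example, `select_excitation(True, [1.0, -2.0], [3.0], 0.5, 2.0)` scales the first excitation by the first gain, while passing `False` scales the second excitation by the second gain.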

Current Assignee: Huawei Technologies Co., Ltd. (Huawei Investment & Holding Co., Ltd.)
Original Assignee: Huawei Technologies Co., Ltd. (Huawei Investment & Holding Co., Ltd.)
Inventors: Gao, Yang
Primary Examiner(s): Shah, Paras D
Assistant Examiner(s): KIM, JONATHAN C
Application Number: US15/391,247
Publication Number: US 20170110145A1
Time in Patent Office: 588 Days
Field of Search: None
US Class Current
CPC Class Codes

G10L 19/22   Mode decision, i.e. based o...

G10L 25/78   Detection of presence or ab...

G10L 25/93   Discriminating between voic...
