Electrolaryngeal speech enhancement for telephony

US 20010033652A1
Filed: 02/07/2001
Published: 10/25/2001
Est. Priority Date: 02/08/2000
Status: Active Grant

Abstract

A technique for separating an acoustic signal into a voiced (V) component corresponding to an electrolaryngeal source and an unvoiced (U) component corresponding to a turbulence source. The technique can be used to improve the quality of electrolaryngeal speech, and may be adapted for use in a special purpose telephone. A method according to the invention extracts a segment of consecutive values from the original stream of numerical values, and performs a discrete Fourier transform on this first group of values. Next, a second group of values is extracted from components of the discrete Fourier transform result which correspond to an electrolaryngeal fixed repetition rate, F₀, and harmonics thereof. An inverse-Fourier transform is applied to the second group of values, to produce a representation of a segment of the V component. Multiple V component segments are then concatenated to form a V component sample stream. Finally, the U component is determined by subtracting the V component sample stream from the original stream of numerical values.
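
The separation summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes, beyond what the record states, that the electrolarynx period is known and is a whole number of samples, that segments do not overlap, and that the DC bin is excluded from the V component. The names `dft`, `idft`, `separate_v_u`, `period`, and `periods_per_segment` are hypothetical, and a naive O(N²) transform is used only to keep the example free of dependencies.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each output sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def separate_v_u(samples, period, periods_per_segment=2):
    """Split `samples` into (v, u) lists of the same length.

    Assumption: `period` is the electrolarynx period in samples (an integer),
    so a segment of `period * periods_per_segment` samples places F0 and its
    harmonics exactly on bins that are multiples of `periods_per_segment`.
    """
    seg_len = period * periods_per_segment
    v = []
    for start in range(0, len(samples) - seg_len + 1, seg_len):
        spectrum = dft(samples[start:start + seg_len])
        # Keep only the F0 bin and its harmonics; DC (k == 0) is dropped here
        # as a design choice of this sketch.
        kept = [
            c if (k != 0 and k % periods_per_segment == 0) else 0j
            for k, c in enumerate(spectrum)
        ]
        v.extend(idft(kept))  # concatenate V segments into one stream
    # Trailing samples that do not fill a whole segment get V = 0 in this sketch.
    v.extend([0.0] * (len(samples) - len(v)))
    # U is the original stream minus the V stream.
    u = [s - vi for s, vi in zip(samples, v)]
    return v, u
```

By construction `v[i] + u[i]` reproduces `samples[i]`, so any energy that does not sit on an F₀ harmonic bin ends up in the U component.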

8 Claims

1. A method for processing an acoustic signal to separate the acoustic signal into a voiced (V) component corresponding to an electrolaryngeal source and an unvoiced (U) component corresponding to a turbulence source, the method comprising the steps of:
- digitizing the acoustic signal to produce an original stream of numerical values;
- extracting a segment of consecutive values from the original stream of numerical values to produce a first group of values covering two or more periods of the electrolaryngeal source;
- performing a discrete Fourier transform on the first group of values to produce a discrete Fourier transform result;
- extracting a second group of values from components of the discrete Fourier transform result which correspond to an electrolaryngeal fixed repetition rate, F₀, and harmonics thereof;
- inverse-Fourier transforming the second group of values, to produce a representation of a segment of the V component;
- concatenating multiple V component segments to form a V component sample stream; and
- determining the U component by subtracting the V component sample stream from the original stream of numerical values.
- Dependent claims (2, 3, 4, 5):
  - 2. A method as in claim 1 comprising the additional steps of:
    - determining segments of the input acoustic signal that correspond to inter-word segments.
  - 3. A method as in claim 2 wherein the step of determining inter-word segments includes a step of determining total power in the segments and characterizing such segments with relatively low power as inter-word segments.
  - 4. A method as in claim 2 additionally comprising the steps of:
    - filtering the V component sample stream;
    - for segments determined to be inter-word segments, setting the corresponding values of the V component sample stream to a zero value;
    - adding the U component values to the altered V component sample stream values; and
    - producing a process acoustic sample stream from the addition of the U values and altered V values.
  - 5. A method as in claim 1 wherein the steps are performed in a digital signal processor connected in line with a telephone apparatus.
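
The recombination recited in claim 4 can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: the claims do not specify the filter, so `v_filter` is a hypothetical caller-supplied function (no filtering when omitted), and the inter-word decision is taken as a precomputed per-sample list of booleans. The names `recombine`, `inter_word`, and `v_filter` are illustrative only.

```python
def recombine(v, u, inter_word, v_filter=None):
    """Return the processed sample stream built from V and U components.

    v, u        -- equal-length lists of V and U component samples
    inter_word  -- list of bools, True where a sample lies in an inter-word segment
    v_filter    -- hypothetical callable applied to the V stream (assumption:
                   the claims do not say which filter is used)
    """
    if not (len(v) == len(u) == len(inter_word)):
        raise ValueError("v, u and inter_word must have the same length")

    # Step 1: filter the V component sample stream (identity if no filter given).
    filtered_v = list(v_filter(v)) if v_filter is not None else list(v)

    # Step 2: zero the V samples that fall in inter-word segments.
    altered_v = [0.0 if gap else s for s, gap in zip(filtered_v, inter_word)]

    # Steps 3 and 4: add U to the altered V stream to produce the output stream.
    return [a + b for a, b in zip(altered_v, u)]
```

For example, `recombine(v, u, mask, v_filter=lambda xs: [0.5 * x for x in xs])` would attenuate the V component by half before muting it in the inter-word gaps, while leaving the U component untouched.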

6. A method for processing an acoustic signal to separate the acoustic signal into inter-word and non-inter-word segments, the method comprising the steps of:
- digitizing the acoustic signal to produce an original stream of numerical values;
- extracting a segment of consecutive values from the original stream of numerical values to produce a group of values;
- determining an average power level for the group of values; and
- if the average power level of the group of values is below a threshold value, determining that the group of values corresponds to an inter-word segment of the acoustic signal.
- Dependent claims (7, 8):
  - 7. A method as in claim 6 additionally comprising the step of:
    - if the average power level of the group of values is above a threshold value, determining that the group of values corresponds to a non-inter-word segment of the acoustic signal.
  - 8. A method as in claim 6 additionally comprising the step of:
    - setting the group of values to a zero value if they correspond to an inter-word segment.
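
The power-threshold test of claims 6 to 8 can be sketched in Python as follows. This is a minimal illustration, not the claimed implementation. It assumes that average power is the mean of squared sample values, that groups are consecutive non-overlapping blocks of `seg_len` samples, and that a group exactly at the threshold counts as non-inter-word (the claims leave that case open). The names `classify_segments`, `zero_inter_word`, `seg_len`, and `threshold` are hypothetical.

```python
def classify_segments(samples, seg_len, threshold):
    """Return one bool per group: True if the group is an inter-word segment."""
    flags = []
    for start in range(0, len(samples), seg_len):
        group = samples[start:start + seg_len]
        # Assumption: average power is the mean of the squared sample values.
        avg_power = sum(s * s for s in group) / len(group)
        flags.append(avg_power < threshold)
    return flags


def zero_inter_word(samples, seg_len, threshold):
    """Return a copy of `samples` with inter-word groups set to zero."""
    out = list(samples)
    for i, is_gap in enumerate(classify_segments(samples, seg_len, threshold)):
        if is_gap:
            start = i * seg_len
            end = min(start + seg_len, len(out))
            out[start:end] = [0.0] * (end - start)
    return out
```

The threshold would in practice depend on recording level and background noise; the record gives no value, so any number passed here is the caller's choice.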

Current Assignee: Speech Technology and Applied Research Corporation
Original Assignee: Speech Technology and Applied Research Corporation
Inventors: Espy-Wilson, Carol; MacAuslan, Joel M.; Goldhor, Richard; Chari, Venkatesh

Granted Patent: US 6,975,984 B2
US Class Current: 379/406.13
CPC Class Codes:
G10L 2021/0135   Voice conversion or morphing
G10L 2025/783   based on threshold decision
G10L 2025/937   Signal energy in various fr...
G10L 25/93   Discriminating between voic...
