Electrolaryngeal speech enhancement for telephony

US 6,975,984 B2
Filed: 02/07/2001
Issued: 12/13/2005
Est. Priority Date: 02/08/2000
Status: Expired due to Fees

Abstract

A technique for separating an acoustic signal into a voiced (V) component corresponding to an electrolaryngeal source and an unvoiced (U) component corresponding to a turbulence source. The technique can be used to improve the quality of electrolaryngeal speech, and may be adapted for use in a special purpose telephone. A method according to the invention extracts a segment of consecutive values from the original stream of numerical values, and performs a discrete Fourier transform on this first group of values. Next, a second group of values is extracted from components of the discrete Fourier transform result which correspond to an electrolaryngeal fixed repetition rate, F₀, and harmonics thereof. An inverse-Fourier transform is applied to the second group of values, to produce a representation of a segment of the V component. Multiple V component segments are then concatenated to form a V component sample stream. Finally, the U component is determined by subtracting the V component sample stream from the original stream of numerical values.
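The separation described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: it assumes the source period is known as a whole number of samples, so that a segment spanning `periods` periods places F₀ and its harmonics exactly on DFT bins that are multiples of `periods`. The function names (`dft`, `idft`, `voiced_segment`, `separate`), the naive O(N²) transform, the exclusion of the DC bin, and the zero-filled tail are all assumptions made for this sketch.

```python
import cmath


def dft(x):
    """Naive O(N^2) DFT; adequate for short illustrative segments."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spec):
    """Inverse of dft() above."""
    n = len(spec)
    return [
        sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def voiced_segment(segment, periods):
    """Return the V part of one segment that spans `periods` source periods.

    Assumption: the segment length is an exact multiple of the period, so
    F0 and its harmonics sit on bins k with k % periods == 0.
    """
    if periods < 2 or len(segment) % periods != 0:
        raise ValueError("segment must span a whole number (>= 2) of periods")
    spec = dft(segment)
    # Keep only harmonic bins; DC (bin 0) is left out here by choice.
    kept = [c if k != 0 and k % periods == 0 else 0j for k, c in enumerate(spec)]
    return [c.real for c in idft(kept)]


def separate(stream, period, periods=2):
    """Split a digitized stream into (V stream, U stream)."""
    seg_len = period * periods
    v_stream = []
    for start in range(0, len(stream) - seg_len + 1, seg_len):
        v_stream.extend(voiced_segment(stream[start:start + seg_len], periods))
    # Assumption: samples left over after the last full segment get V = 0.
    v_stream.extend([0.0] * (len(stream) - len(v_stream)))
    # U is the original stream minus the concatenated V stream.
    u_stream = [s - v for s, v in zip(stream, v_stream)]
    return v_stream, u_stream
```

A call such as `separate(samples, period=80, periods=2)` would correspond to a 100 Hz source sampled at 8 kHz; both figures are hypothetical and do not come from the patent. A practical version would likely use an FFT and would have to cope with a period that is not an integer number of samples.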

7 Claims

1. A method for processing an acoustic signal to separate the acoustic signal into a voiced (V) component corresponding to an electrolaryngeal source and an unvoiced (U) component corresponding to a turbulence source, the method comprising the steps of:
- digitizing the acoustic signal to produce an original stream of numerical values;
  
  extracting a segment of consecutive values from the original stream of numerical values to produce a first group of values covering two or more periods of the electrolaryngeal source;
  
  performing a discrete Fourier transform on the first group of values to produce a discrete Fourier transform result;
  
  extracting a second group of values from components of the discrete Fourier transform result which correspond to an electrolaryngeal fixed repetition rate, F₀, and harmonics thereof;
  
  inverse-Fourier transforming the second group of values, to produce a representation of a segment of the V component;
  
  concatenating multiple V component segments to form a V component sample stream;
  
  determining the U component by subtracting the V component sample stream from the original stream of numerical values;
  
  determining segments of the input acoustic signal that correspond to inter-word segments;
  
  filtering the V component sample stream;
  
  for segments determined to be inter-word segments, setting the corresponding values of the V component sample stream to a zero value;
  
  adding the U component values to the altered V component sample stream values; and
  
  producing a processed acoustic sample stream from the addition of the U values and altered V values.
- Dependent claims (2, 3, 4, 5)
  - 2. A method as in claim 1 wherein the step of determining inter-word segments includes a step of determining total power in the segments and characterizing such segments with relatively low power as inter-word segments.
  - 3. A method as in claim 1 wherein the steps are performed in a digital signal processor connected in line with a telephone apparatus.
  - 4. A method as in claim 1 wherein the step of determining inter-word segments further comprises:
    - determining an average power level for the group of values; and
      
      if the average power level of the group of values is below a threshold value, determining that the group of values corresponds to an inter-word segment of the acoustic signal.
  - 5. A method as in claim 4 additionally comprising the step of:
    - if the average power level of the group of values is above a threshold value, determining that the group of values corresponds to a non-inter-word segment of the acoustic signal.
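The power test in claims 2, 4 and 5 might look something like the sketch below. The claims do not give a threshold value, a segment length, or a rule for a power exactly equal to the threshold, so `threshold` and `seg_len` are hypothetical parameters, and treating equality as non-inter-word is a choice made only for this example. The function names are likewise illustrative.

```python
def average_power(values):
    """Mean squared amplitude of a group of samples."""
    if not values:
        raise ValueError("need at least one sample")
    return sum(v * v for v in values) / len(values)


def is_inter_word(values, threshold):
    """True when the group's average power falls below the threshold.

    Assumption: a power exactly equal to the threshold counts as
    non-inter-word; the claims leave that case open.
    """
    return average_power(values) < threshold


def label_segments(stream, seg_len, threshold):
    """Return one inter-word flag per consecutive segment of the stream.

    A final, shorter segment is labeled from the samples it does have.
    """
    return [
        is_inter_word(stream[start:start + seg_len], threshold)
        for start in range(0, len(stream), seg_len)
    ]
```

For instance, `label_segments([0.01] * 4 + [1.0] * 4 + [0.0] * 4, seg_len=4, threshold=0.1)` flags the first and last segments as inter-word and the middle one as speech.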

6. A method for processing an acoustic signal to separate the acoustic signal into a voiced (V) component corresponding to an electrolaryngeal source and an unvoiced (U) component corresponding to a turbulence source, the method comprising the steps of:
- digitizing the acoustic signal to produce an original stream of numerical values;
  
  extracting a segment of consecutive values from the original stream of numerical values to produce a first group of values covering two or more periods of the electrolaryngeal source;
  
  performing a discrete Fourier transform on the first group of values to produce a discrete Fourier transform result;
  
  extracting a second group of values from components of the discrete Fourier transform result which correspond to an electrolaryngeal fixed repetition rate, F₀, and harmonics thereof;
  
  inverse-Fourier transforming the second group of values, to produce a representation of a segment of the V component;
  
  concatenating multiple V component segments to form a V component sample stream;
  
  determining the U component by subtracting the V component sample stream from the original stream of numerical values;
  
  filtering the V component sample stream;
  
  setting corresponding selected values of the V component sample stream to a zero value;
  
  adding the U component values to the altered V component sample stream values; and
  
  producing a processed acoustic sample stream from the addition of the U values and altered V values.
- Dependent claims (7)
  - 7. A method as in claim 6 additionally comprising the step of:
    - setting the group of values to a zero value if they correspond to an inter-word segment.
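The final steps shared by claims 1 and 6 (filter V, zero selected V values, add U) can be sketched as below. The claims do not say which filter is applied to the V stream, so the three-tap moving average here is purely a placeholder, and the caller may pass any other function as `v_filter`. The per-segment `inter_word` flags are assumed to come from some earlier detection step, and all names are hypothetical.

```python
def moving_average(values, width=3):
    """Placeholder smoothing filter; the claims do not name a filter."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out


def recombine(v_stream, u_stream, inter_word, seg_len, v_filter=moving_average):
    """Filter V, silence V in inter-word segments, then add U back."""
    if len(v_stream) != len(u_stream):
        raise ValueError("V and U streams must have the same length")
    altered = list(v_filter(v_stream))
    for seg_index, silent in enumerate(inter_word):
        if silent:
            start = seg_index * seg_len
            stop = min(start + seg_len, len(altered))
            for i in range(start, stop):
                altered[i] = 0.0  # remove the source buzz between words
    # Processed output: altered V plus the untouched U component.
    return [v + u for v, u in zip(altered, u_stream)]
```

Zeroing V only after filtering follows the order in which claim 1 lists the steps; U passes through unchanged, so any turbulence noise in inter-word segments remains in the output.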

Current Assignee
Speech Technology and Applied Research Corporation
Original Assignee
Speech Technology and Applied Research Corporation
Inventors
Espy-Wilson, Carol; Chari, Venkatesh; MacAuslan, Joel M.; Goldhor, Richard
Primary Examiner(s)
Young, W. R.
Assistant Examiner(s)
Wozniak, James S.

Application Number
US09/778,675
Publication Number
US 20010033652A1
Time in Patent Office
1,770 Days
Field of Search
704/208, 704/210, 704/219, 704/233, 704/226, 704/271, 381/70
US Class Current
704/208
CPC Class Codes
G10L 2021/0135   Voice conversion or morphing
G10L 2025/783   based on threshold decision
G10L 2025/937   Signal energy in various fr...
G10L 25/93   Discriminating between voic...
