System and method for the analysis and synthesis of periodic and non-periodic components of speech signals

US 10,354,671 B1
Filed: 02/21/2018
Issued: 07/16/2019
Est. Priority Date: 02/21/2017
Status: Active Grant

Abstract

A voice coder configured to resolve periodic and aperiodic components of spectra is disclosed. The method of voice coding includes parsing the speech signal into a plurality of speech frames; for each of the plurality of speech frames: (a) generating the spectra for the speech frame, (b) parsing the spectra of the speech frame into a plurality of sub-bands, (c) transforming each of the plurality of sub-bands into a time-domain envelope signal, and (d) generating a plurality of sub-band voicing factors, wherein each sub-band voicing factor indicates the harmonicity of one of the plurality of sub-bands, and each sub-band voicing factor is based on the periodicity of one of said time-domain envelope signals associated with one of the plurality of sub-bands. The voice coder may regenerate the speech signal by generating a plurality of recomposed frames, each recomposed frame being based on: (a) the spectra for one of said plurality of speech frames, and (b) the sub-band voicing factors associated with the plurality of sub-bands for one of said plurality of speech frames; and then generating a recomposed speech signal from the plurality of recomposed frames.
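Steps (a) through (c) of the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `frame_signal` and `subband_envelopes`, the Hann analysis window, and the equal-width sub-band edges are all assumptions made for this example. The envelope is obtained by keeping only the positive-frequency bins of one sub-band (an analytic signal, which is what a Hilbert transform yields) and taking the magnitude of the inverse FFT.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Parse a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    starts = hop * np.arange(n_frames)[:, None]
    return x[starts + np.arange(frame_len)[None, :]]

def subband_envelopes(frame, n_bands):
    """Return one time-domain envelope per sub-band of the frame's spectrum."""
    n = len(frame)
    # Assumption: Hann-windowed FFT; the source does not name a window.
    spectrum = np.fft.fft(frame * np.hanning(n))
    # Assumption: equal-width bands between bin 1 (skipping DC) and Nyquist.
    edges = np.linspace(1, n // 2, n_bands + 1).astype(int)
    envelopes = np.empty((n_bands, n))
    for b in range(n_bands):
        lo, hi = edges[b], edges[b + 1]
        # Keep only this band's positive-frequency bins, doubled, so the
        # inverse FFT is the analytic signal of the band-limited frame.
        analytic = np.zeros(n, dtype=complex)
        analytic[lo:hi] = 2.0 * spectrum[lo:hi]
        envelopes[b] = np.abs(np.fft.ifft(analytic))
    return envelopes
```

Calling `subband_envelopes(frame_signal(x, 400, 200)[0], 4)` on a NumPy array `x` returns a 4-by-400 array with one non-negative envelope per sub-band of the first frame.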

12 Claims

1. A voice coder comprising:
- a microphone for recording a speech signal from a user;
- a frame generator configured to parse the speech signal into a plurality of speech frames;
- a pitch detector configured to determine a fundamental period of each of the plurality of speech frames;
- a Fourier Transform module configured to generate a spectra for each of the plurality of speech frames;
- a sub-band generator configured to parse the spectra of each speech frame into a plurality of sub-bands;
- a Hilbert Transform module configured to transform each of the plurality of sub-bands into a time-domain envelope signal;
- a similarity module configured to generate a plurality of sub-band voicing factors, wherein each sub-band voicing factor indicates a harmonicity of one of the plurality of sub-bands, and each sub-band voicing factor is based on a periodicity of said time-domain envelope signals associated with one of the plurality of sub-bands;
- a frame synthesizer configured to generate a plurality of recomposed frames, each recomposed frame being based on;
  a) the spectra for one of said plurality of speech frames, and
  b) the sub-band voicing factors associated with the plurality of sub-bands for one of said plurality of speech frames; and
- a waveform generator configured to generate a recomposed speech signal from the plurality of recomposed frames.
- Dependent claims (2, 3, 4, 5, 7, 8, 9, 10, 11):
  - 2. The voice coder of claim 1, wherein the Fourier Transform module is further configured to generate Mel Cepstral coefficients representing the spectra for each of the plurality of speech frames.
  - 3. The voice coder of claim 1, wherein each of said sub-band voicing factors is based on a measure of the periodicity in the time-domain envelope signal.
  - 4. The voice coder of claim 3, wherein said periodicity is based on a correlation between the time-domain envelope signal and a time-shifted representation of the said time-domain envelope signal.
  - 5. The voice coder of claim 4, wherein said time-domain envelope signal is time-shifted by a time corresponding to the fundamental period of an associated speech frame.
  - 7. The method of claim 5, further comprising:
    - transmitting the recomposed speech signal to a speaker for playback to a user.
  - 8. The method of claim 5, further comprising:
    - generate Mel Cepstral coefficients representing the spectra for each of the plurality of speech frames.
  - 9. The method of claim 5, wherein each of said sub-band voicing factors is based on a periodicity in the time-domain envelope signal.
  - 10. The method of claim 9, wherein said periodicity is based on a correlation between the time-domain envelope signal and a time-shifted representation of the said time-domain envelope signal.
  - 11. The method of claim 10, wherein said time-domain envelope signal is time-shifted by a time corresponding to the fundamental period of the associated speech frame.
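The correlation described in claims 4, 5, 10, and 11 can be illustrated with a short sketch. The claims only state that the periodicity is based on a correlation between the envelope and a copy shifted by the fundamental period; the mean removal, the normalization, and the clipping to the range 0 to 1 below are assumptions made for this example, as is the function name `voicing_factor`.

```python
import numpy as np

def voicing_factor(envelope, period):
    """Normalized correlation between an envelope and its copy shifted by
    `period` samples. Values near 1 suggest a harmonic (voiced) sub-band,
    values near 0 suggest an aperiodic one."""
    env = envelope - np.mean(envelope)        # assumption: remove the DC offset
    a, b = env[:-period], env[period:]        # original and time-shifted copy
    denom = np.sqrt(np.dot(a, a) * np.dot(b, b))
    if denom == 0.0:
        return 0.0                            # flat envelope: treat as unvoiced
    # Assumption: negative correlations are clipped to 0.
    return float(np.clip(np.dot(a, b) / denom, 0.0, 1.0))
```

For an envelope that repeats every `period` samples the two segments are identical and the factor is close to 1; for a noise-like envelope the factor falls toward 0. The `period` argument must be a positive number of samples smaller than the envelope length.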

6. A method of voice coding, the method comprising:
- recording a speech signal from a user with a microphone;
- parsing the speech signal into a plurality of speech frames;
- determining a fundamental period of each of the plurality of speech frames,
- generating a spectra for each of the plurality of speech frames;
- parsing the spectra of each speech frame into a plurality of sub-bands,
- transforming each of the plurality of sub-bands into a time-domain envelope signal;
- generating a plurality of sub-band voicing factors, wherein each sub-band voicing factor indicates a harmonicity of one of the plurality of sub-bands, and each sub-band voicing factor is based on a periodicity of said time-domain envelope signals associated with one of the plurality of sub-bands;
- generating a plurality of recomposed frames, each recomposed frame being based on;
  a) the spectra for one of said plurality of speech frames, and
  b) the sub-band voicing factors associated with the plurality of sub-bands for one of said plurality of speech frames, and
- generating a recomposed speech signal from the plurality of recomposed frames.
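The claims do not say how the fundamental period of a frame is determined. One common approach, shown here purely as an assumption, is to pick the lag at which the frame's autocorrelation peaks inside a plausible pitch range. The function name `fundamental_period` and the `min_lag` and `max_lag` parameters are hypothetical.

```python
import numpy as np

def fundamental_period(frame, min_lag, max_lag):
    """Estimate the fundamental period of a frame, in samples, as the lag of
    the autocorrelation maximum within [min_lag, max_lag]."""
    frame = frame - np.mean(frame)
    # Non-negative lags of the (biased) autocorrelation.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    return min_lag + int(np.argmax(acf[min_lag:max_lag + 1]))
```

For a 100 Hz tone sampled at 8000 Hz, searching lags 20 to 200 yields a period of 80 samples. Real speech would usually need extra safeguards, such as a voiced/unvoiced decision and protection against octave errors, which this sketch omits.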

12. A method of filtering a speech signal with a voice coder, the method comprising:
- recording a speech signal from a user with a microphone;
- parsing the speech signal into a plurality of speech frames;
- determining a fundamental period of each of the plurality of speech frames;
- for each of the plurality of speech frames;
  a) generating a spectra for the speech frame;
  b) parsing the spectra of the speech frame into a plurality of sub-bands,
  c) transforming each of the plurality of sub-bands into a time-domain envelope signal; and
  d) generating a plurality of sub-band voicing factors, wherein each sub-band voicing factor indicates a harmonicity of one of the plurality of sub-bands, and each sub-band voicing factor is based on a periodicity of said time-domain envelope signals associated with one of the plurality of sub-bands;
- generating a plurality of recomposed frames, each recomposed frame being based on;
  a) the spectra for one of said plurality of speech frames; and
  b) the plurality of sub-band voicing factors associated with the plurality of sub-bands for one of said plurality of speech frames; and
- generating a recomposed speech signal from the plurality of recomposed frames.
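The final step, turning recomposed frames into one signal, is not detailed in the claims. A windowed overlap-add is one conventional way to do it and is sketched below as an assumption; how each recomposed frame is built from the spectra and voicing factors is outside this example. The function name `overlap_add` and the periodic Hann window are choices made for illustration.

```python
import numpy as np

def overlap_add(frames, hop):
    """Sum windowed frames (one per row) at multiples of `hop` samples."""
    n_frames, frame_len = frames.shape
    # Periodic Hann window: shifted copies sum to 1 when hop == frame_len // 2.
    window = np.hanning(frame_len + 1)[:-1]
    out = np.zeros(hop * (n_frames - 1) + frame_len)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + frame_len] += window * frame
    return out
```

With a hop of half the frame length, frames cut directly from a signal are reassembled exactly everywhere except the first and last `hop` samples, where only one window contributes.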

Current Assignee: ObEN, Inc.
Original Assignee: ObEN, Inc.
Inventors: Kaewtip, Kantapon; Villavicencio, Fernando; Harvilla, Mark
Primary Examiner(s): Riley, Marcus T
Application Number: US15/901,864
Time in Patent Office: 510 Days
Field of Search: None
CPC Class Codes:
- G10L 15/1822: Parsing for meaning underst...
- G10L 15/22: Procedures used during a sp...
- G10L 19/00: Speech or audio signals ana...
- G10L 19/0204: using subband decomposition
- G10L 19/093: using sinusoidal excitation...
- G10L 21/038: using band spreading techni...
- G10L 25/24: the extracted parameters be...
- G10L 25/90: Pitch determination of spee...
- G10L 25/93: Discriminating between voic...
