Format based speech reconstruction from noisy signals

US 9,020,818 B2
Filed: 08/20/2012
Issued: 04/28/2015
Est. Priority Date: 03/05/2012
Status: Expired due to Fees

Abstract

Implementations of systems, methods and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine-readable formant-based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations, systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.
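
The codebook-building decision described in the abstract (add a candidate tuple, use it to update an existing tuple, or do neither) might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not define "a sufficient amount of new information", so the `FormantTuple` layout, the `consider_candidate` function, the `min_delta_db` test and the `blend` averaging are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class FormantTuple:
    # Assumed encoding: bit i set means a formant lies in sub-band i.
    pattern: int
    # Assumed layout: one amplitude in dB per set bit, lowest sub-band first.
    amplitudes_db: list


def consider_candidate(codebook, candidate, min_delta_db=1.0, blend=0.5):
    """Return 'added', 'updated' or 'discarded' (hypothetical policy)."""
    for entry in codebook:
        if entry.pattern != candidate.pattern:
            continue
        pairs = list(zip(entry.amplitudes_db, candidate.amplitudes_db))
        largest_change = max((abs(a - b) for a, b in pairs), default=0.0)
        if largest_change < min_delta_db:
            # Assumption: a near-identical tuple carries no new information.
            return "discarded"
        # Assumption: update the stored amplitudes with a weighted average.
        entry.amplitudes_db = [(1 - blend) * a + blend * b for a, b in pairs]
        return "updated"
    # No stored tuple shares this formant pattern, so treat it as new.
    codebook.append(candidate)
    return "added"
```

With an empty codebook, the first tuple for a given pattern is added; later tuples with the same pattern either refine the stored amplitudes or are dropped, depending on how far they differ.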

25 Claims

1. A method of reconstructing a speech signal from an audible signal using a formant-based codebook, the method comprising:
- detecting one or more formants in an audible signal;
  
  receiving a pitch estimate associated with the one or more detected formants;
  
  selecting one or more codebook tuples from the formant-based codebook based at least on the one or more detected formants, wherein each codebook tuple includes a respective formant spectrum value and a respective one or more formant amplitude values, wherein the respective formant spectrum value is indicative of the spectral location of one or more formants associated with the codebook tuple, and the respective one or more formant amplitude values are indicative of the corresponding amplitudes of the one or more formants associated with the codebook tuple; and
  
  interpolating the spectrum between the corresponding one or more formants associated with the one or more selected codebook tuples to generate a reconstructed speech signal using the received pitch estimate.
- Dependent claims (2 to 22):
  - 2. The method of claim 1, wherein the audible signal is noisy.
  - 3. The method of claim 1, further comprising receiving the audible signal from a single audio sensor device.
  - 4. The method of claim 1, further comprising receiving the audible signal from a plurality of audio sensors.
  - 5. The method of claim 1, wherein detecting one or more formants in the audible signal comprises:
    - converting the audible signal into a corresponding plurality of time-frequency units, wherein the time dimension of each time-frequency unit includes at least one of a plurality of sequential intervals spanning the duration of the audible signal, and wherein the frequency dimension of each time-frequency unit includes at least one of a plurality of sub-bands; and
      
      generating a respective detected tuple from the plurality of time-frequency units for each time interval, wherein the detected tuple includes a respective formant spectrum value and a respective one or more formant amplitude values, wherein the respective formant spectrum value is indicative of the spectral location of each of the one or more detected formants in the corresponding time interval, and the respective one or more formant amplitude values are indicative of the corresponding amplitudes of the one or more detected formants in the corresponding time interval.
  - 6. The method of claim 5, wherein the plurality of sub-bands is contiguously distributed throughout the frequency spectrum associated with human speech.
  - 7. The method of claim 6, wherein the spectral location of a particular formant is further characterized by at least one of a corresponding center frequency, a frequency offset and a bandwidth.
  - 8. The method of claim 6, wherein the spectrum associated with human speech includes a plurality of sub-bands, and wherein the formant spectrum value indicates which of the plurality of sub-bands includes the one or more detected formants.
  - 9. The method of claim 8, wherein the formant spectrum value comprises a binary pattern.
  - 10. The method of claim 8, wherein the formant spectrum value comprises an encoded value.
  - 11. The method of claim 5, wherein selecting one or more codebook tuples from the formant-based codebook comprises:
    - identifying a respective codebook tuple that matches the respective detected tuple for each time interval by comparing the formant spectrum value of the respective detected tuple to the respective formant spectrum value of one or more codebook tuples.
  - 12. The method of claim 11, wherein the comparison of the formant spectrum value of the respective detected tuple to the respective formant spectrum value of one or more codebook tuples is fault tolerant.
  - 13. The method of claim 12, wherein the matching codebook tuple has a greater number of formants than the detected tuple.
  - 14. The method of claim 12, wherein the matching codebook tuple includes a respective formant at each spectral location in which the detected tuple has a respective formant.
  - 15. The method of claim 11, wherein selecting one or more codebook tuples from the formant-based codebook further comprises:
    - comparing the one or more formant amplitude values of the detected tuple to the corresponding one or more formant amplitude values of the respective matching codebook tuple to determine whether the match should be accepted or rejected.
  - 16. The method of claim 5, wherein the match is rejected if one or more of the one or more formant amplitude values do not match the corresponding one or more formant amplitudes of the matched codebook tuple within a respective threshold.
  - 17. The method of claim 16, wherein the respective threshold is 10 dB.
  - 18. The method of claim 5, wherein in response to accepting the match, the method further comprises:
    - determining an indicator of whether any of the respective formants in the matched codebook tuple that are not present in the respective detected tuple for each time interval are likely to have been masked by noise in the audible signal;
      
      determining whether the indicator satisfies a threshold; and
      
      accepting the matched codebook tuple to reconstruct the speech signal for the corresponding time interval in response to determining that the indicator satisfies the threshold.
  - 19. The method of claim 18, wherein the threshold is 10 dB.
  - 20. The method of claim 1, further comprising:
    - tracking the amplitude of the audible signal; and
      
      normalizing the respective formant amplitude values of the corresponding one or more selected codebook tuples based at least on the tracked amplitude of the audible signal.
  - 21. The method of claim 1, wherein the interpolation of the spectrum between the corresponding one or more formants associated with the one or more selected codebook tuples comprises synthesizing one or more voice sections one glottal pulse at a time using an Inverse Fast Fourier Transform centered at each glottal pulse.
  - 22. The method of claim 1, wherein the interpolation of the spectrum between the corresponding one or more formants associated with the one or more selected codebook tuples comprises using a Lorentz function.
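
The tuple selection recited in claims 9 and 11 to 17 can be illustrated with a short sketch. It assumes the formant spectrum value is a binary pattern stored as an integer bitmask (bit i set means a formant in sub-band i) and that amplitudes are kept in dB per sub-band. The `select_tuple` function, the data layout, and the tie-break that prefers the tuple with the fewest extra formants are assumptions made for illustration, not details given in the claims.

```python
def select_tuple(detected_pattern, detected_amps, codebook, threshold_db=10.0):
    """Pick a codebook tuple for one time interval, or None if none is accepted.

    detected_pattern: int bitmask of sub-bands holding a detected formant.
    detected_amps:    {sub_band_index: amplitude_db} for the detected formants.
    codebook:         list of (pattern, {sub_band_index: amplitude_db}).
    """
    best = None
    for pattern, amps in codebook:
        # Fault-tolerant match (claims 12 to 14): the codebook tuple must have a
        # formant wherever the detected tuple has one, but may have more.
        if detected_pattern & ~pattern:
            continue
        # Amplitude check (claims 15 to 17): reject if any detected amplitude
        # differs from the codebook amplitude by more than the threshold.
        if any(abs(detected_amps[b] - amps[b]) > threshold_db for b in detected_amps):
            continue
        # Assumed tie-break: prefer the tuple with the fewest extra formants.
        extra = bin(pattern & ~detected_pattern).count("1")
        if best is None or extra < best[0]:
            best = (extra, pattern, amps)
    return None if best is None else (best[1], best[2])
```

A detected tuple with a single formant can therefore match a codebook tuple that has two or three formants, which is how formants masked by noise could be restored from the codebook.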

23. A voice reconstruction device operable to reconstruct a speech signal from an audible signal using a formant based codebook, the device comprising:
- a formant detection module configured to detect one or more formants in an audible signal;
  
  a tuple selection module configured to select one or more codebook tuples from the formant-based codebook based at least on the one or more detected formants, wherein each codebook tuple includes a respective formant spectrum value and a respective one or more formant amplitude values, wherein the respective formant spectrum value is indicative of the spectral location of one or more formants associated with the codebook tuple, and the respective one or more formant amplitude values are indicative of the corresponding amplitudes of the one or more formants associated with the codebook tuple; and
  
  a synthesis module configured to interpolate the spectrum between the corresponding one or more formants associated with the one or more selected codebook tuples to generate a reconstructed speech signal using a pitch estimate.

24. A voice reconstruction device operable to reconstruct a speech signal from an audible signal using a formant based codebook, the device comprising:
- means for detecting one or more formants in an audible signal;
  
  means for selecting one or more codebook tuples from the formant-based codebook based at least on the one or more detected formants, wherein each codebook tuple includes a respective formant spectrum value and a respective one or more formant amplitude values, wherein the respective formant spectrum value is indicative of the spectral location of one or more formants associated with the codebook tuple, and the respective one or more formant amplitude values are indicative of the corresponding amplitudes of the one or more formants associated with the codebook tuple; and
  
  means for interpolating the spectrum between the corresponding one or more formants associated with the one or more selected codebook tuples to generate a reconstructed speech signal using a pitch estimate.

25. A voice reconstruction device operable to reconstruct a speech signal from an audible signal using a formant based codebook, the device comprising:
- a processor; and
  
  a memory including instructions that, when executed by the processor, cause the device to:
  
  detect one or more formants in an audible signal;
  
  select one or more codebook tuples from the formant-based codebook based at least on the one or more detected formants, wherein each codebook tuple includes a respective formant spectrum value and a respective one or more formant amplitude values, wherein the respective formant spectrum value is indicative of the spectral location of one or more formants associated with the codebook tuple, and the respective one or more formant amplitude values are indicative of the corresponding amplitudes of the one or more formants associated with the codebook tuple; and
  
  interpolate the spectrum between the corresponding one or more formants associated with the one or more selected codebook tuples to generate a reconstructed speech signal using a pitch estimate.
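
The interpolation step recited in claims 1 and 23 to 25, with the Lorentz function of claim 22, might look roughly like the sketch below. It models each formant as a Lorentzian peak and samples the summed envelope at the harmonics of the pitch estimate. The claims do not say how the peaks are combined or how the pitch estimate is applied, so the summation, the harmonic sampling, the fixed `half_width_hz`, the `max_hz` limit and both function names are assumptions.

```python
def lorentzian(f_hz, center_hz, peak_amp, half_width_hz):
    """Lorentz function: equals peak_amp at center_hz, half of it one half-width away."""
    return peak_amp * half_width_hz ** 2 / ((f_hz - center_hz) ** 2 + half_width_hz ** 2)


def harmonic_envelope(formants, pitch_hz, max_hz=4000.0, half_width_hz=80.0):
    """Interpolate between formants and sample the result at pitch harmonics.

    formants: list of (center_hz, linear_amplitude) from a selected codebook tuple.
    Returns a list of (harmonic_hz, amplitude) pairs up to max_hz.
    """
    if pitch_hz <= 0:
        raise ValueError("pitch_hz must be positive")
    envelope = []
    k = 1
    while k * pitch_hz <= max_hz:
        f_hz = k * pitch_hz
        # Assumption: the envelope is the sum of one Lorentzian per formant.
        amp = sum(lorentzian(f_hz, c, a, half_width_hz) for c, a in formants)
        envelope.append((f_hz, amp))
        k += 1
    return envelope
```

The harmonic amplitudes produced this way could then feed a synthesis stage such as the pulse-by-pulse inverse FFT mentioned in claim 21, which is not shown here.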

Current Assignee: Malaspina Labs (Barbados), Inc.
Original Assignee: Malaspina Labs (Barbados), Inc.
Inventors: Zakarauskas, Pierre; Escott, Alexander; Chu, Clarence S. H.; Stevenson, Shawn E.
Primary Examiner(s): GUERRA-ERAZO, EDGAR X
Application Number: US13/589,977
Publication Number: US 20130231924A1
Time in Patent Office: 981 Days
Field of Search: 704/243, 704/244, 704/246, 704/256, 704/256.7, 704/235, 704/236, 704/240, 704/255, 704/277
US Class Current: 704/243
CPC Class Codes:
G10L 19/0017   Lossless audio signal codin...
G10L 19/012   Comfort noise or silence co...
G10L 2019/0007   Codebook element generation
G10L 21/02   Speech enhancement, e.g. no...
G10L 25/15   the extracted parameters be...
G10L 25/75   for modelling vocal tract p...
H04R 25/00   Deaf-aid sets , i.e. electr...