Method and apparatus for high resolution speech reconstruction
First Claim
1. A method of identifying a clean speech signal from a noisy speech signal, the method comprising:
a processor identifying a set of log-magnitude frequency values for each of a plurality of frames that represent the noisy speech signal;
the processor filtering the log-magnitude frequency values of the noisy speech signal to smooth the log-magnitude frequency values over time to form filtered noisy values by applying the log magnitude frequency values of the noisy speech signal to a Finite Impulse Responsive Filter having a set of filter parameters wherein at least one of the filter parameters of the set of filter parameters differs from another of the filter parameters of the set of filter parameters;
the processor determining parameters of at least one posterior probability distribution of at least one component of a clean signal value based on the set of filtered noisy values without applying a frequency-based transform to the set of filtered noisy values, the posterior probability distribution providing the probability of a log-magnitude frequency value for a clean speech signal given a filtered noisy value;
the processor using the parameters of the posterior probability distribution to estimate a set of log-magnitude frequency values for a clean speech signal; and
the processor using the log-magnitude values for the clean speech signal to produce an output clean speech signal.
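As an illustration of the first two steps of the claim (log-magnitude values per frame, then smoothing over time with an FIR filter whose parameters are not all equal), a minimal sketch follows. The function names, the frame and hop sizes, the tap values, the log floor, and the edge handling are all assumptions made for illustration; none of them is taken from the patent text.

```python
import cmath
import math


def log_magnitude_frames(signal, frame_len, hop):
    """Return one list of log-magnitude DFT values per frame of `signal`."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        row = []
        # Keep bins 0..frame_len//2; the rest mirror them for real input.
        for k in range(frame_len // 2 + 1):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n, x in enumerate(frame)
            )
            # Small floor (an assumption) avoids log(0) on silent bins.
            row.append(math.log(abs(coeff) + 1e-10))
        frames.append(row)
    return frames


def fir_smooth_over_time(frames, taps):
    """Apply a causal FIR filter along the frame axis, per frequency bin."""
    smoothed = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for j, h in enumerate(taps):
                # Before the first frame, reuse frame 0 (edge handling is assumed).
                acc += h * frames[max(t - j, 0)][k]
            row.append(acc)
        smoothed.append(row)
    return smoothed


# Hypothetical taps: unequal weights, so this is not a plain moving average.
TAPS = [0.5, 0.3, 0.2]
```

The filter runs across frames rather than across frequency bins, so each bin's trajectory is smoothed in time while the spectral resolution within a frame is left untouched.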
Abstract
A method and apparatus identify a clean speech signal from a noisy speech signal. The noisy speech signal is converted into frequency values in the frequency domain. The parameters of at least one posterior probability distribution of at least one component of a clean signal value are then determined based on the frequency values. This determination is made without applying a frequency-based filter to the frequency values. The parameters of the posterior probability distribution are then used to estimate a set of frequency values for the clean speech signal. A clean speech signal is then constructed from the estimated set of frequency values.
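To make the idea of "parameters of a posterior probability distribution" concrete, the sketch below works through a deliberately simplified case for a single frequency value. It assumes a Gaussian prior on the clean value and an additive Gaussian observation error; this linear-Gaussian model is a stand-in chosen for illustration, not the model described in the patent, and the function and parameter names are hypothetical.

```python
def gaussian_posterior(noisy_value, prior_mean, prior_var, noise_var):
    """Posterior mean and variance of a clean value given one noisy value.

    Assumed model (illustrative only):
        clean ~ Normal(prior_mean, prior_var)
        noisy = clean + error,  error ~ Normal(0, noise_var)
    """
    # Fraction of the observed deviation attributed to the clean signal.
    gain = prior_var / (prior_var + noise_var)
    post_mean = prior_mean + gain * (noisy_value - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


def estimate_clean_frame(noisy_frame, prior_means, prior_vars, noise_var):
    """Use the posterior mean in each bin as the clean-value estimate."""
    return [
        gaussian_posterior(y, m, v, noise_var)[0]
        for y, m, v in zip(noisy_frame, prior_means, prior_vars)
    ]
```

Under these assumptions the posterior mean is the minimum mean-square-error estimate, which is why it serves as the estimated clean value; a different prior or noise model would change the formulas but not the two-stage shape (fit the posterior, then read off an estimate).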
16 Claims
9. A computer storage medium storing computer-executable instructions for performing steps comprising:
identifying log-magnitude frequency values for each of a plurality of frames that represent a noisy speech signal;
applying the log-magnitude frequency values that represent frames of the noisy speech signal to a Finite Impulse Response filter having a set of filter parameters wherein one of the filter parameters of the set of filter parameters differs from another filter parameter of the set of filter parameters to provide time-based filtering and to produce filtered values representing noisy speech;
determining a posterior probability based on the filtered values, wherein a frequency-based transform is not applied before the filtered values are used to determine the posterior probability and wherein the posterior probability provides the probability of log-magnitude frequency values for a clean speech signal given the filtered values;
using the posterior probability to estimate a log-magnitude frequency value for a frame of a clean speech signal; and
using the log-magnitude frequency value for the frame of the clean speech signal to produce an output clean speech signal.
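The last step of the claim, producing an output signal from estimated log-magnitude values, could be sketched as below for a single frame. The claim does not say where phase information comes from; passing in a separate phase per bin (for example, the phase of the noisy frame) is an assumption here, as are the function name and the use of a plain inverse DFT.

```python
import cmath
import math


def frame_from_log_magnitude(log_mag, phase, frame_len):
    """Rebuild one real time-domain frame from half-spectrum values.

    `log_mag` and `phase` must each hold frame_len // 2 + 1 entries
    (bins 0 through frame_len // 2).
    """
    # Undo the log and attach the (assumed) phase to each bin.
    half = [math.exp(lm) * cmath.exp(1j * ph) for lm, ph in zip(log_mag, phase)]
    # Restore the upper bins by conjugate symmetry, as for any real signal.
    full = half + [
        half[frame_len - k].conjugate() for k in range(len(half), frame_len)
    ]
    # Inverse DFT; the imaginary part is numerical noise and is dropped.
    return [
        (
            sum(
                coeff * cmath.exp(2j * math.pi * k * n / frame_len)
                for k, coeff in enumerate(full)
            )
            / frame_len
        ).real
        for n in range(frame_len)
    ]
```

A full system would also overlap and add successive frames; that step is omitted here to keep the sketch to the single operation the claim names.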
Specification