Systems and methods for restoration of speech components
Abstract
A method for restoring speech components of an audio signal that have been distorted by noise reduction or noise cancellation includes determining distorted frequency regions and undistorted frequency regions in the audio signal. The distorted frequency regions include regions of the audio signal in which speech distortion is present. Iterations are performed using a model to refine predictions of the audio signal at the distorted frequency regions. The model is configured to modify the audio signal and may include a deep neural network trained using spectral envelopes of clean or undamaged audio signals. Before each iteration, the audio signal at the undistorted frequency regions is restored to its values prior to the first iteration, while the audio signal at the distorted frequency regions is refined starting from zero at the first iteration. Iterations end when discrepancies of the audio signal at the undistorted frequency regions meet pre-defined criteria.
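The iteration scheme described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The names `restore_speech` and `smoothing_model`, the boolean mask of distorted bins, the maximum-absolute-difference discrepancy measure, and the `tol` and `max_iters` parameters are all assumptions introduced for illustration. The neighbor-averaging function merely stands in for the trained deep neural network mentioned in the abstract.

```python
from typing import Callable, List, Sequence


def smoothing_model(x: List[float]) -> List[float]:
    """Hypothetical stand-in for the trained model: average each bin with its neighbors."""
    n = len(x)
    out = []
    for i in range(n):
        window = x[max(0, i - 1):min(n, i + 2)]
        out.append(sum(window) / len(window))
    return out


def restore_speech(
    envelope: Sequence[float],
    distorted: Sequence[bool],
    model: Callable[[List[float]], List[float]],
    tol: float = 1e-3,      # assumed stopping threshold
    max_iters: int = 50,    # assumed safety cap on iterations
) -> List[float]:
    """Refine distorted bins iteratively while pinning undistorted bins."""
    original = list(envelope)
    # Distorted bins start from zero at the first iteration.
    current = [0.0 if d else x for x, d in zip(original, distorted)]
    for _ in range(max_iters):
        predicted = model(current)
        # Discrepancy is measured only at undistorted bins, where the
        # values prior to the first iteration are trusted (assumed metric).
        errors = [abs(p - x)
                  for p, x, d in zip(predicted, original, distorted) if not d]
        discrepancy = max(errors, default=0.0)
        # Keep the model's refinement at distorted bins; restore undistorted
        # bins to their original values before the next iteration.
        current = [p if d else x
                   for p, x, d in zip(predicted, original, distorted)]
        if discrepancy <= tol:
            break
    return current


if __name__ == "__main__":
    env = [1.0, 1.0, 0.3, 1.0, 1.0]            # bin 2 was damaged
    mask = [False, False, True, False, False]  # True marks a distorted bin
    print(restore_speech(env, mask, smoothing_model))
```

In this sketch the undistorted bins act as anchors: because they are reset on every pass, the model's output at the distorted bins is pulled toward values consistent with its neighbors, and the loop stops once the model reproduces the anchored bins closely enough.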
20 Claims
1. A method for restoring speech components of an audio signal, the method comprising:
receiving an audio signal after it has been processed for noise suppression;
determining distorted frequency regions and undistorted frequency regions in the received audio signal that has been processed for noise suppression, the distorted frequency regions including regions of the audio signal in which speech distortion is present due to the noise suppression processing; and
performing one or more iterations using a model to generate predictions of a restored version of the audio signal, the model being configured to modify the audio signal so as to restore the speech components in the distorted frequency regions.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A system for restoring speech components of an audio signal, the system comprising:
at least one processor; and
a memory communicatively coupled with the at least one processor, the memory storing instructions which, when executed by the at least one processor, perform a method comprising:
receiving an audio signal after it has been processed for noise suppression;
determining distorted frequency regions and undistorted frequency regions in the received audio signal that has been processed for noise suppression, the distorted frequency regions including regions of the audio signal in which speech distortion is present due to the noise suppression processing; and
performing one or more iterations using a model to generate predictions of a restored version of the audio signal, the model being configured to modify the audio signal so as to restore the speech components in the distorted frequency regions.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19
20. A non-transitory computer-readable storage medium having embodied thereon instructions, which when executed by at least one processor, perform steps of a method, the method comprising:
receiving an audio signal after it has been processed for noise suppression;
determining distorted frequency regions and undistorted frequency regions in the received audio signal that has been processed for noise suppression, the distorted frequency regions including regions of the audio signal in which speech distortion is present due to the noise suppression processing; and
performing one or more iterations using a model to refine predictions of the audio signal at the distorted frequency regions, the model being configured to modify the audio signal so as to restore speech components in the distorted frequency regions.
Specification