Speech enhancement techniques on the power spectrum
Abstract
The method provides a spectral speech description to be used for synthesis of a speech utterance, where at least one spectral envelope input representation is received. In one solution, the improvement is made by manipulating an extremum, i.e., a peak or a valley, in the rapidly varying component of the spectral envelope representation. The rapidly varying component is manipulated to sharpen and/or accentuate extrema, after which it is merged back with the slowly varying component or the spectral envelope input representation to create an enhanced spectral envelope final representation. In other solutions, a complex spectrum envelope final representation is created with phase information derived from either the group delay representation of a real spectral envelope input representation corresponding to a short-time speech signal or a transformed phase component of the discrete complex frequency domain input representation corresponding to the speech utterance.
9 Claims
1. A method for providing spectral speech descriptions to be used for synthesis of a speech utterance comprising the steps of
receiving at least one spectral envelope input representation corresponding to the speech utterance, where the at least one spectral envelope input representation includes at least one of at least one formant and at least one spectral trough in the form of at least one of a local peak and a local valley in the spectral envelope input representation,
extracting from the at least one spectral envelope input representation a rapidly varying input component, where the rapidly varying input component is generated, at least in part, by removing from the at least one spectral envelope input representation a slowly varying input component in the form of a non-constant coarse shape of the at least one spectral envelope input representation and by keeping the fine details of the at least one spectral envelope input representation, where the details contain at least one of a peak or a valley,
creating a rapidly varying final component, where the rapidly varying final component is derived from the rapidly varying input component by manipulating at least one of at least one peak and at least one valley,
combining the rapidly varying final component with one of the slowly varying input component and the spectral envelope input representation to form a spectral envelope final representation, and
providing a spectral speech description output vector to be used for synthesis of a speech utterance, where at least a part of the spectral speech description output vector is derived from the spectral envelope final representation.
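The steps of claim 1 can be illustrated with a minimal Python sketch. It is only one possible reading, under several assumptions that the claim does not state: the envelope is taken to be a log-magnitude (dB) vector, the slowly varying component is estimated with a centered moving average, and the manipulation of peaks and valleys is a uniform gain applied to the rapidly varying component. The names `smooth`, `enhance_envelope`, `width`, and `gain` are hypothetical.

```python
import numpy as np


def smooth(envelope_db, width):
    """Estimate the coarse shape with a centered moving average.

    Assumption: a moving average stands in for the slowly varying
    component; the claim does not prescribe a particular smoother.
    `width` must be odd so the output keeps the input length.
    """
    if width % 2 == 0:
        raise ValueError("width must be odd")
    pad = width // 2
    # Repeat the edge values so the average is defined at the borders.
    padded = np.pad(envelope_db, pad, mode="edge")
    kernel = np.ones(width) / width
    return np.convolve(padded, kernel, mode="valid")


def enhance_envelope(envelope_db, width=9, gain=1.5):
    """Sharpen peaks and valleys of a log-magnitude spectral envelope."""
    envelope_db = np.asarray(envelope_db, dtype=float)
    # Slowly varying input component: the non-constant coarse shape.
    slow = smooth(envelope_db, width)
    # Rapidly varying input component: the fine details left after
    # removing the coarse shape (these hold the peaks and valleys).
    rapid_in = envelope_db - slow
    # Rapidly varying final component: a gain above 1 raises peaks and
    # deepens valleys (assumed manipulation, chosen for simplicity).
    rapid_final = gain * rapid_in
    # Spectral envelope final representation: recombine with the
    # slowly varying input component.
    return slow + rapid_final


# Synthetic envelope: a spectral tilt plus one formant-like bump at bin 20.
bins = np.arange(64)
envelope = -0.1 * bins + 10.0 * np.exp(-0.5 * ((bins - 20) / 2.0) ** 2)
enhanced = enhance_envelope(envelope)
```

In this sketch a `gain` of 1 returns the input unchanged, and larger values accentuate the extrema while leaving the coarse shape as it was. The output vector here is simply the enhanced envelope itself; a synthesizer may require a different parameterization.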
Specification