HYBRID WAVEFORM-CODED AND PARAMETRIC-CODED SPEECH ENHANCEMENT
Abstract
A method for hybrid speech enhancement which employs parametric-coded enhancement (or a blend of parametric-coded and waveform-coded enhancement) under some signal conditions and waveform-coded enhancement (or a different blend of parametric-coded and waveform-coded enhancement) under other signal conditions. Other aspects are methods for generating a bitstream indicative of an audio program including speech and other content, such that hybrid speech enhancement can be performed on the program; a decoder including a buffer which stores at least one segment of an encoded audio bitstream generated by any embodiment of the inventive method; and a system or device (e.g., an encoder or decoder) configured (e.g., programmed) to perform any embodiment of the inventive method. At least some of the speech enhancement operations are performed by a recipient audio decoder using Mid/Side speech enhancement metadata generated by an upstream audio encoder.
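As an illustration of the blend described in the abstract, the following is a minimal sketch, not the patented implementation. It assumes the decoder already holds two speech estimates for a segment, one from a waveform-coded speech copy and one from a parametric reconstruction, and that a blend weight and an enhancement gain have been chosen for the current signal conditions. The function name `hybrid_enhance` and the parameters `blend` and `gain` are hypothetical; how the blend is selected is not specified here.

```python
from typing import List, Sequence


def hybrid_enhance(
    mixed: Sequence[float],
    waveform_speech: Sequence[float],
    parametric_speech: Sequence[float],
    blend: float,
    gain: float,
) -> List[float]:
    """Add a blended speech estimate to the mixed content.

    blend = 1.0 uses only the waveform-coded estimate,
    blend = 0.0 uses only the parametric-coded estimate,
    values in between mix the two (hypothetical convention).
    """
    if not 0.0 <= blend <= 1.0:
        raise ValueError("blend must lie in [0, 1]")
    if not (len(mixed) == len(waveform_speech) == len(parametric_speech)):
        raise ValueError("all inputs must have the same length")
    return [
        m + gain * (blend * w + (1.0 - blend) * p)
        for m, w, p in zip(mixed, waveform_speech, parametric_speech)
    ]
```

A caller would pick a different `blend` per segment, for example a value near 1.0 where the waveform-coded copy is preferred and a value near 0.0 where the parametric estimate is preferred.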
34 Claims
1. A method, comprising:
receiving mixed audio content, in a reference audio channel representation, that are distributed over a plurality of audio channels of the reference audio channel representation, the mixed audio content having a mix of speech content and non-speech audio content;
transforming one or more portions of the mixed audio content that are distributed over two or more non-Mid/Side (non-M/S) channels in the plurality of audio channels of the reference audio channel representation into one or more portions of transformed mixed audio content in an M/S audio channel representation that are distributed over one or more channels of the M/S audio channel representation, wherein the M/S audio channel representation comprises at least a mid-channel and a side-channel, wherein the mid-channel represents a weighted or non-weighted sum of two channels of the reference audio channel representation, and wherein the side-channel represents a weighted or non-weighted difference of two channels of the reference audio channel representation;
determining metadata for speech enhancement of the one or more portions of transformed mixed audio content in the M/S audio channel representation; and
generating an audio signal that comprises the mixed audio content and the metadata for speech enhancement of the one or more portions of transformed mixed audio content in the M/S audio channel representation;
wherein the method is performed by one or more computing devices.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 33, 34
10-16. (canceled)
17. A method, comprising:
receiving an audio signal that comprises mixed audio content in a reference audio channel representation and metadata for speech enhancement, the mixed audio content having a mix of speech content and non-speech audio content;
transforming one or more portions of the mixed audio content that spread over two or more non-M/S channels in a plurality of audio channels of the reference audio channel representation into one or more portions of transformed mixed audio content in an M/S audio channel representation that spread over one or more M/S channels of the M/S audio channel representation, wherein the M/S audio channel representation comprises at least a mid-channel and a side-channel, wherein the mid-channel represents a weighted or non-weighted sum of two channels of the reference audio channel representation, and wherein the side-channel represents a weighted or non-weighted difference of two channels of the reference audio channel representation;
performing one or more speech enhancement operations, based on the metadata for speech enhancement, on the one or more portions of transformed mixed audio content in the M/S audio channel representation to generate one or more portions of enhanced speech content in the M/S representation;
combining the one or more portions of transformed mixed audio content in the M/S audio channel representation with the one or more portions of enhanced speech content in the M/S representation to generate one or more portions of speech enhanced mixed audio content in the M/S representation;
wherein the method is performed by one or more computing devices.
Dependent claims: 18
19-32. (canceled)
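The mid/side transform and the decoder-side steps recited in claims 1 and 17 can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: it treats a single left/right channel pair, uses one common weight for the sum and the difference, and models the speech enhancement metadata as a hypothetical `SpeechEnhancementMetadata` record holding per-channel prediction coefficients and a gain. None of these names or fields come from the claims.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


def to_mid_side(
    left: Sequence[float], right: Sequence[float], weight: float = 0.5
) -> Tuple[List[float], List[float]]:
    """Mid is a weighted sum, side a weighted difference, of two channels."""
    mid = [weight * (l + r) for l, r in zip(left, right)]
    side = [weight * (l - r) for l, r in zip(left, right)]
    return mid, side


def from_mid_side(
    mid: Sequence[float], side: Sequence[float], weight: float = 0.5
) -> Tuple[List[float], List[float]]:
    """Invert to_mid_side for the same non-zero weight."""
    scale = 1.0 / (2.0 * weight)
    left = [scale * (m + s) for m, s in zip(mid, side)]
    right = [scale * (m - s) for m, s in zip(mid, side)]
    return left, right


@dataclass
class SpeechEnhancementMetadata:
    """Hypothetical metadata: speech is estimated as a scaled copy of each channel."""
    mid_prediction: float   # assumed fraction of the mid channel that is speech
    side_prediction: float  # assumed fraction of the side channel that is speech
    gain: float             # assumed enhancement gain applied to the speech estimate


def enhance_in_mid_side(
    mid: Sequence[float], side: Sequence[float], meta: SpeechEnhancementMetadata
) -> Tuple[List[float], List[float]]:
    """Generate enhanced speech in M/S and combine it with the mixed M/S content."""
    speech_mid = [meta.gain * meta.mid_prediction * m for m in mid]
    speech_side = [meta.gain * meta.side_prediction * s for s in side]
    out_mid = [m + e for m, e in zip(mid, speech_mid)]
    out_side = [s + e for s, e in zip(side, speech_side)]
    return out_mid, out_side
```

In this sketch an encoder would call `to_mid_side` to analyse the content and derive the metadata, while a decoder would call `to_mid_side`, then `enhance_in_mid_side`, and finally `from_mid_side` if output in the reference channel representation is needed.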
Specification