Methods and apparatus for decoding based on speech enhancement metadata
First Claim
1. A method, comprising:
- receiving mixed audio content, wherein the mixed audio content includes at least a mid-channel mixed content signal and a side-channel mixed content signal, wherein the mid-channel signal represents a weighted or non-weighted sum of two channels of a reference audio channel representation, and wherein the side-channel signal represents a weighted or non-weighted difference of two channels of the reference audio channel representation;
decoding, by an audio decoder, the mid-channel signal and the side-channel signal into a left channel signal and a right channel signal, wherein the decoding includes decoding based on speech enhancement metadata, wherein the speech enhancement metadata includes a preference flag which indicates at least a type of speech enhancement operation to be performed on the mid-channel signal and the side-channel signal during decoding, and wherein the enhancement metadata further indicates a first type of speech enhancement for the mid-channel signal and a second type of speech enhancement of the mid-channel signal; and
generating an audio signal that comprises the left channel signal and the right channel signal for the one or more portions of the decoded mid channel signal and side-channel signal of the mixed audio content,wherein the method is performed by one or more computing devices.
1 Assignment
0 Petitions
Accused Products
Abstract
A method for hybrid speech enhancement which employs parametric-coded enhancement (or blend of parametric-coded and waveform-coded enhancement) under some signal conditions and waveform-coded enhancement (or a different blend of parametric-coded and waveform-coded enhancement) under other signal conditions. Other aspects are methods for generating a bitstream indicative of an audio program including speech and other content, such that hybrid speech enhancement can be performed on the program, a decoder including a buffer which stores at least one segment of an encoded audio bitstream generated by any embodiment of the inventive method, and a system or device (e.g., an encoder or decoder) configured (e.g., programmed) to perform any embodiment of the inventive method. At least some of speech enhancement operations are performed by a recipient audio decoder with Mid/Side speech enhancement metadata generated by an upstream audio encoder.
152 Citations
15 Claims
-
1. A method, comprising:
-
receiving mixed audio content, wherein the mixed audio content includes at least a mid-channel mixed content signal and a side-channel mixed content signal, wherein the mid-channel signal represents a weighted or non-weighted sum of two channels of a reference audio channel representation, and wherein the side-channel signal represents a weighted or non-weighted difference of two channels of the reference audio channel representation; decoding, by an audio decoder, the mid-channel signal and the side-channel signal into a left channel signal and a right channel signal, wherein the decoding includes decoding based on speech enhancement metadata, wherein the speech enhancement metadata includes a preference flag which indicates at least a type of speech enhancement operation to be performed on the mid-channel signal and the side-channel signal during decoding, and wherein the enhancement metadata further indicates a first type of speech enhancement for the mid-channel signal and a second type of speech enhancement of the mid-channel signal; and generating an audio signal that comprises the left channel signal and the right channel signal for the one or more portions of the decoded mid channel signal and side-channel signal of the mixed audio content, wherein the method is performed by one or more computing devices. - View Dependent Claims (2, 3, 4, 5, 6, 7)
-
-
8. A non-transitory computer readable storage medium, comprising software instructions, which when executed by one or more processors cause performance of any one of the methods recited in 1-7.
-
9. An apparatus, comprising:
-
a receiver configured to receive mixed audio content, wherein the mixed audio content includes at least a mid-channel mixed content signal and a side-channel mixed content signal, wherein the mid-channel signal represents a weighted or non-weighted sum of two channels of a reference audio channel representation, and wherein the side-channel signal represents a weighted or non-weighted difference of two channels of the reference audio channel representation; a decoder configured to decode the mid-channel signal and the side-channel signal into a left channel signal and a right channel signal, wherein the decoding includes decoding based on speech enhancement metadata, wherein the speech enhancement metadata includes a preference flag which indicates at least a type of speech enhancement operation to be performed on the mid-channel signal and the side-channel signal during decoding, and wherein the enhancement metadata further indicates a first type of speech enhancement for the mid-channel signal and a second type of speech enhancement of the mid-channel signal; and a processor configured to generate an audio signal that comprises the left channel signal and the right channel signal for the one or more portions of the decoded mid channel signal and side-channel signal of the mixed audio content. - View Dependent Claims (10, 11, 12, 13, 14, 15)
-
Specification