Object oriented audio coding
Abstract
Audio sources are coded by recognizing different classes of audio such as speech and music. The classes are used to select between coding algorithms and to provide object definitions. Objects have abstract and concrete classes which may further rely on parameters produced by linear prediction and subband filters to provide a frame-based bit stream of information. Each object in the bit stream has layers of information such as basic bit rate, coding parameters and enhancement parameters. The layers of information in each object allow altering selected parameters to manipulate audio signals.
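The object model summarized in the abstract can be pictured with a minimal sketch. It assumes that each object carries an abstract class, a concrete class, one core block and a list of enhancement blocks, and that the pair of classes selects the coding algorithms. The concrete class names ("voiced", "unvoiced", "tonal"), the algorithm labels and the `ALGORITHMS` table are hypothetical placeholders, not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical lookup: (abstract class, concrete class) -> (core algorithm, enhancement algorithm).
# The labels are placeholders; the patent text here does not name specific algorithms.
ALGORITHMS: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("speech", "voiced"): ("core_alg_1", "enh_alg_1"),
    ("speech", "unvoiced"): ("core_alg_2", "enh_alg_1"),
    ("music", "tonal"): ("core_alg_3", "enh_alg_2"),
}


@dataclass
class AudioObject:
    """One elementary audio object in a frame, with its layered payload."""
    abstract_class: str                                      # nature of the signal, e.g. "speech"
    concrete_class: str                                      # characteristics of the signal portion
    core: bytes = b""                                        # basic layer (minimum bit rate)
    enhancements: List[bytes] = field(default_factory=list)  # intermediate layers, in order

    def algorithms(self) -> Tuple[str, str]:
        """Return the (core, enhancement) algorithm labels selected by the two classes."""
        return ALGORITHMS[(self.abstract_class, self.concrete_class)]


@dataclass
class Frame:
    """All objects coded for one frame."""
    objects: List[AudioObject]

    def macro_objects(self) -> Dict[str, List[AudioObject]]:
        """Group objects by abstract class; each group stands for one macro-object."""
        groups: Dict[str, List[AudioObject]] = {}
        for obj in self.objects:
            groups.setdefault(obj.abstract_class, []).append(obj)
        return groups
```

Grouping by abstract class is only one possible reading of how macro-objects are formed; the claims also allow a macro-object bit stream to combine objects of different abstract classes.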
56 Claims
1. A method of processing digitized audio signals, comprising at least:
a coding phase in which a signal to be processed, organized into a sequence of frames comprising a predetermined number of samples, is split into a plurality of frequency bands which can be independently coded, and a coded signal is generated including information relevant to signals in at least selected ones of said frequency bands, the coding taking place according to an embedded coding technique such that the coded signal comprises a basic layer, containing the minimum amount of information ("core information") needed for decoding and corresponding to a minimum bit rate, a total layer, containing the whole of the coded information and corresponding to a maximum bit rate, and a plurality of intermediate layers which contribute to the coded signal by respective information blocks ("enhancement information") coding respective signal portions that cannot be represented by the only core information, and which cause an increase of the bit rate of the coded signal by successive steps from the basic layer to the total layer, the basic layer being generated by a first coding step and each block of enhancement information being generated by a respective second coding step; and
a decoding phase, in which the information relevant to the different frequency bands included in the coded signal is independently decoded, in such a manner that for a frequency band for which both enhancement information blocks and the core information are to be decoded, the coded signals are submitted to a set of first decoding steps, the number of which is the same as that of the second coding steps performed for that band and in each of which one enhancement information block is decoded, and to a second decoding step in which the core information is decoded, whereas for a frequency band for which only the core information is to be decoded, the coded signals are submitted to the second decoding step only; and
the decoded signals relevant to the different bands are recombined to build a reconstructed signal with bandwidth characteristics corresponding to those of the original signal;
characterized in that, during said coding phase, a two-stage classification is performed by which each audio signal to be coded in a given frame is allotted to one out of a plurality of abstract and to one out of a plurality of concrete classes of said one abstract class, the concrete classes being related with the characteristics of a signal portion and identifying elementary audio objects present in the frame and the abstract classes being related with the nature of an audio signal and identifying macro-objects resulting from a combination of elementary audio objects;
in that said first coding step for a given audio object is performed by means of a first coding algorithm selected out of a plurality of first coding algorithms and any second coding step for that given audio object is performed by means of a respective second coding algorithm selected out of a plurality of second coding algorithms, the choice amongst the plurality of said first and respectively second coding algorithms depending at least on the results of said two step classification;
the coding phase generating, for each object, an object bit stream, containing all information relevant to a same concrete class for that audio signal in that frame, and a macro-object bit stream combining bit streams of different objects of a same abstract class or different abstract classes and having bit-rate and bandwidth characteristics which depend on the choices made for said first and said second algorithms and on configuration information passed from a user equipment (US) to coding devices (AC) and/or on control information passed from a transmission system (SY) to the coding device;
in that the method further comprises, between the coding and decoding phases, a phase of manipulation of the bit stream generated by said coding phase, for the scaling of the coded bit stream in dependence of information about the abstract and concrete classes, included in the coded bit stream, and of said configuration and control information;
and in that in said decoding phase, said first decoding step is performed by means of a respective algorithm complementary to the second algorithm selected in the coding phase to generate the enhancement information block to be decoded in that step, and the second decoding step is performed according to an algorithm complementary to the first algorithm selected in the first coding step;
each of said first and second decoding algorithms being selected out of a plurality of first and second decoding algorithms, complementary each to one of said second and first coding algorithms, respectively, according to information provided with the abstract and concrete class and/or configuration information provided in a set up phase.
Dependent claims: 2-31
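The manipulation phase recited in claim 1 scales the coded bit stream between coder and decoder using the class information carried in the stream. A minimal sketch of one way this could work is given below. It assumes that scaling means dropping the highest enhancement blocks first, that the core block is never dropped, and that a priority list of abstract classes stands in for the configuration and control information. `ObjectStream`, `scale`, `budget` and `keep_first` are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ObjectStream:
    """Coded bit stream of one audio object in one frame (hypothetical layout)."""
    abstract_class: str
    concrete_class: str
    core: bytes                # basic layer, always kept
    enhancements: List[bytes]  # enhancement blocks, lowest layer first

    def size(self) -> int:
        """Total size in bytes of the core plus all remaining enhancement blocks."""
        return len(self.core) + sum(len(e) for e in self.enhancements)


def scale(streams: List[ObjectStream], budget: int, keep_first: List[str]) -> List[ObjectStream]:
    """Return copies of the streams, trimmed towards `budget` bytes without decoding.

    Abstract classes listed earlier in `keep_first` keep their enhancement blocks
    longer; unlisted classes are trimmed first. Core blocks are never removed, so
    the result can still exceed `budget` when the cores alone are larger than it.
    """
    out = [ObjectStream(s.abstract_class, s.concrete_class, s.core, list(s.enhancements))
           for s in streams]
    rank = {name: i for i, name in enumerate(keep_first)}
    # Least important objects come first in the trimming order.
    order = sorted(out, key=lambda s: rank.get(s.abstract_class, len(rank)), reverse=True)
    total = sum(s.size() for s in out)
    for s in order:
        while total > budget and s.enhancements:
            total -= len(s.enhancements.pop())  # drop the highest remaining layer
    return out
```

Because only whole enhancement blocks are removed, each trimmed object stream remains decodable by running fewer first decoding steps for that object, which is the property the embedded layering is meant to provide.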
32. Apparatus for processing digitized audio signals, comprising:
an encoder (AC) arranged to receive frames of samples of an audio signal to be coded, having given bandwidth characteristics, and comprising:
filtering means (FB1, FB2, FB3) for splitting the signal to be coded into a plurality of frequency bands,
coding units (LCC, HCC, LEC, HEC) associated to each frequency band for the embedded coding of the signals of that band and comprising, for each band, a first coding unit (LCC, HCC), enabled for at least selected ones of the frequency bands and generating at each frame a core information for the respective band, and a set of second coding units (LEC, HEC), intended to generate a succession of enhancement information blocks for that band, the core information being the minimum amount of information needed for signal decoding; and
means (BCU) for combining coded signals of the different frequency bands into a single embedded coded signal which comprises a basic layer, containing the core information of said selected frequency bands and corresponding to a minimum bit rate, a total layer, containing the whole of the coded information and corresponding to a maximum bit rate, and a plurality of intermediate layers which contribute to the coded signal by respective enhancement information blocks and cause an increase of the bit rate of the coded signal by successive steps from the basic layer to the total layer; and
a decoder (AD) comprising:
decoding units (LED, HED, LCD, HCD) for independently decoding the coded signal of the different frequency bands, and comprising, for each frequency band, a set of first decoding units (LED, HED), in one to one correspondence with the coding units of said second set (LEC, HEC) and intended each to decode an enhancement information block, and a second decoding unit (LCD, HCD) intended to decode the core information; and
synthesis filtering means (FB4, FB5, FB6) for recombining the decoded signals of the different frequency bands and reconstructing a decoded signal with bandwidth characteristics corresponding to that of the original audio signal;
characterized in that the first coding unit (LCC, HCC) and each second coding unit (LEC, HEC) are configurable so as to apply to the signal being coded a first or respectively a second coding algorithm selected out of a plurality of first and second coding algorithms and each first decoding unit (LED, HED) and the second decoding unit (LCD, HCD) are configurable so as to apply to the signal being decoded a first or respectively a second decoding algorithm complementary to the second and the first coding algorithm, respectively, applied by the second and first coding units (LEC, HEC, LCC, HCC); and
in that it further comprises:
a classification unit (CR) for submitting the audio signal to be coded to a two stage classification by which the signal in a given frame is allotted to one out of a plurality of abstract classes and to one out of a plurality of concrete classes of said one abstract class, the concrete classes being related with the characteristics of a signal portion and identifying elementary audio objects present in the frame and the abstract classes being related with the nature of an audio signal and identifying macro-objects resulting from a combination of elementary audio objects;
the classification unit (CR) providing the information on the classification to the filtering means (FB1 . . . FB3) and to said first and second coding units (LCC, HCC, LEC, HEC) as control parameters for said splitting into frequency bands, the enabling of selected first and second coding units (LCC, HCC, LEC, HEC) and the selection of a proper coding algorithm by the or each coding unit, and to said combining means (BCU) for insertion into the coded bit stream; and
at least one bit stream manipulation unit (BMU), located upstream of the decoder (AD), for bit rate or bandwidth scaling of the coded signal relevant to individual macro-objects and/or objects.
Dependent claims: 33-56
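The per-band chain of a first coding unit followed by second coding units, mirrored by decoding units in one to one correspondence, can be illustrated with a toy embedded coder. The sketch below is not the claimed apparatus: it substitutes a two-band Haar split for the filtering means and uniform quantizers for the configurable coding algorithms, and it leaves out the classification unit (CR) and the manipulation unit (BMU). All function names and the step sizes are assumptions made for illustration.

```python
from typing import List, Tuple


def analysis(x: List[float]) -> Tuple[List[float], List[float]]:
    """Two-band Haar split (stand-in filter bank); len(x) must be even."""
    low = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return low, high


def synthesis(low: List[float], high: List[float]) -> List[float]:
    """Inverse of `analysis`: recombine the two bands into one signal."""
    out: List[float] = []
    for l, h in zip(low, high):
        out += [l + h, l - h]
    return out


def encode_band(band: List[float], steps: List[float]) -> List[List[int]]:
    """Embedded coding of one band.

    steps[0] is the core quantizer step; each later step belongs to one
    enhancement stage, which codes the residual left by the stages before it.
    """
    layers: List[List[int]] = []
    residual = list(band)
    for step in steps:
        q = [round(r / step) for r in residual]
        layers.append(q)
        residual = [r - i * step for r, i in zip(residual, q)]
    return layers


def decode_band(layers: List[List[int]], steps: List[float], n_enh: int) -> List[float]:
    """Decode the core layer plus the first n_enh enhancement layers."""
    used = 1 + n_enh
    out = [0.0] * len(layers[0])
    for q, step in zip(layers[:used], steps[:used]):
        out = [o + i * step for o, i in zip(out, q)]
    return out


# Usage: one core stage and two enhancement stages per band.
frame = [0.9, 0.5, -0.3, 0.1]
steps = [0.5, 0.125, 0.03125]
low, high = analysis(frame)
coded = [encode_band(low, steps), encode_band(high, steps)]
core_only = synthesis(decode_band(coded[0], steps, 0), decode_band(coded[1], steps, 0))
full = synthesis(decode_band(coded[0], steps, 2), decode_band(coded[1], steps, 2))
```

Each enhancement layer decoded adds one refinement term per band, so the decoder runs exactly as many enhancement decoding stages as the number of enhancement layers it chooses to use, up to the number produced by the encoder.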
Specification