Frame-based audio coding with video/audio data synchronization by dynamic audio frame alignment
Abstract
Several audio signal processing techniques may be used in various combinations to improve the quality of audio represented by an information stream formed by splice editing two or more other information streams. The techniques are particularly useful in applications that bundle audio information with video information. In one technique, gain-control words conveyed with the audio information stream are used to interpolate playback sound levels across a splice. In another technique, special filterbanks or forms of TDAC transforms are used to suppress aliasing artifacts on either side of a splice. In yet another technique, special filterbanks or crossfade window functions are used to optimize the attenuation of spectral splatter created at a splice. In a further technique, audio sample rates are converted according to frame lengths and rates to allow audio information to be bundled with, for example, video information. In yet a further technique, audio blocks are dynamically aligned so that proper synchronization can be maintained across a splice. An example for 48 kHz audio with NTSC video is discussed.
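The 48 kHz / NTSC case mentioned above can be worked through numerically. The following is a minimal sketch, assuming the NTSC frame rate is 30000/1001 Hz (about 29.97 Hz); the function name `superframe_layout` is hypothetical and is not taken from the patent. It finds the smallest number of frames over which the audio sample count becomes an integer, which is the role the "first number" plays in the claims.

```python
from fractions import Fraction

def superframe_layout(sample_rate, frame_rate):
    """Return (frames_per_superframe, samples_per_superframe, whole_samples_per_frame).

    Hypothetical helper: uses exact rational arithmetic so that the
    non-integer samples-per-frame ratio is not lost to rounding.
    """
    per_frame = Fraction(sample_rate) / Fraction(frame_rate)
    # A Fraction is kept in lowest terms, so its denominator is the smallest
    # number of frames that spans a whole number of audio samples.
    frames = per_frame.denominator
    samples = per_frame.numerator
    # Integer portion of the quotient (sample rate divided by frame rate).
    whole = per_frame.numerator // per_frame.denominator
    return frames, samples, whole

NTSC_FRAME_RATE = Fraction(30000, 1001)  # assumption: about 29.97 Hz
print(superframe_layout(48000, NTSC_FRAME_RATE))  # (5, 8008, 1601)
```

Under these assumptions each video frame spans 1601.6 audio samples, so five frames span exactly 8008 samples and the integer portion of the quotient is 1601.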
16 Claims
1. A method for signal processing comprising:
receiving a first input signal comprising input samples representing audio information at an audio sample rate,
receiving a second input signal comprising input frames conveying information at an input frame rate that are grouped in superframes, each superframe comprising a number of said input frames equal to a first number such that said audio sample rate divided by said input frame rate is not an integer but a product of said audio sample rate and said first number divided by said input frame rate is substantially equal to an integer,
generating in response to said first input signal a sequence of audio frames, each audio frame corresponding to a respective input frame and comprising encoded audio information corresponding to a sequence of said input samples that includes an early start sample, a nominal start sample, and a number of subsequent samples equal to the integer portion of a quotient, said quotient equal to said audio sample rate divided by said input frame rate, wherein said early start sample is the first sample in said sequence of input samples and said nominal start sample is substantially aligned with said respective input frame, and
generating an output signal arranged in output frames grouped into output superframes, each output superframe comprising a number of said output frames equal to said first number, a respective output frame comprising a respective audio frame and a label for said respective audio frame, wherein said label is unique for each audio frame in a respective output superframe.
Dependent claims: 2, 3, 4
5. A method for signal processing comprising:
receiving an input signal arranged in input frames grouped into complete and partial input superframes, each complete input superframe having a number of said input frames equal to a first number that is greater than one and each partial input superframe having a lesser number of said input frames, each input frame comprising an audio frame representing encoded audio information at an input frame rate and a label associated with said audio frame, wherein said label is unique for each audio frame in a respective complete or partial input superframe,
deriving sequences of samples from said audio frames, wherein a respective sequence of samples is derived from a respective audio frame and comprises an early start sample, a nominal start sample, and a number of subsequent samples equal to a second number, wherein said sequence of samples represents audio information at an audio sample rate and said second number is equal to the integer portion of a quotient, said quotient equal to said audio sample rate divided by said input frame rate,
obtaining from each sequence of samples a respective subsequence of samples, wherein, in response to the label associated with the audio frame from which a respective sequence of samples is derived, the corresponding subsequence comprises a third number of samples and starts at either the early start sample, the nominal start sample, or the sample following the nominal start sample, wherein said third number is equal to either the second number or one plus the second number, and
generating an output signal from an arrangement of the subsequences in which the start of each subsequence and the start of the immediately preceding subsequence are separated by said third number of samples of said preceding subsequence.
Dependent claims: 6, 7, 8
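The label-driven subsequence selection recited in claim 5 can be illustrated with a small sketch. Everything here is an assumption made for illustration: the 48 kHz / NTSC figures (8008 samples per five-frame superframe), the placement of the early start sample immediately before the nominal start sample, the nominal start positions, and the label table are all hypothetical and are not taken from the patent. No actual audio coding is performed, and the sketch handles only complete superframes and only the "start at the nominal start sample" option.

```python
FRAMES_PER_SUPERFRAME = 5      # assumed "first number" for 48 kHz / NTSC
SAMPLES_PER_SUPERFRAME = 8008  # assumed: 48000 * 5 / (30000/1001)
SECOND_NUMBER = SAMPLES_PER_SUPERFRAME // FRAMES_PER_SUPERFRAME  # 1601

def nominal_start(k):
    """Assumed index of the nominal start sample of frame k."""
    return (k * SAMPLES_PER_SUPERFRAME) // FRAMES_PER_SUPERFRAME

def make_frames(samples, n_frames):
    """Hypothetical encoder stand-in: returns (label, sequence) pairs.

    Each sequence holds an early start sample, the nominal start sample,
    and SECOND_NUMBER subsequent samples (1603 samples in total).
    """
    frames = []
    for k in range(n_frames):
        n = nominal_start(k)
        # Indices outside the signal are padded with zero.
        seq = [samples[i] if 0 <= i < len(samples) else 0
               for i in range(n - 1, n + 1 + SECOND_NUMBER)]
        frames.append((k % FRAMES_PER_SUPERFRAME, seq))
    return frames

def label_table():
    """Hypothetical table: label -> (offset, third number).

    Offset 0 is the early start sample, 1 the nominal start sample,
    2 the sample after it. This sketch only ever uses offset 1.
    """
    return {label: (1, nominal_start(label + 1) - nominal_start(label))
            for label in range(FRAMES_PER_SUPERFRAME)}

def decode(frames):
    """Concatenate the label-selected subsequences into one output signal."""
    table = label_table()
    out = []
    for label, seq in frames:
        offset, third_number = table[label]
        out.extend(seq[offset:offset + third_number])
    return out

samples = list(range(2 * SAMPLES_PER_SUPERFRAME))
frames = make_frames(samples, 2 * FRAMES_PER_SUPERFRAME)
restored = decode(frames)
```

With these assumed positions the per-label lengths are 1601 or 1602 samples and sum to 8008 per superframe, so concatenating the subsequences reproduces the input without gaps or overlap. The other two start options in the claim would come into play when a splice leaves partial superframes, which this sketch does not attempt.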
9. A device for signal processing comprising:
means for receiving a first input signal comprising input samples representing audio information at an audio sample rate,
means for receiving a second input signal comprising input frames conveying information at an input frame rate that are grouped in superframes, each superframe comprising a number of said input frames equal to a first number such that said audio sample rate divided by said input frame rate is not an integer but a product of said audio sample rate and said first number divided by said input frame rate is substantially equal to an integer,
means for generating in response to said first input signal a sequence of audio frames, each audio frame corresponding to a respective input frame and comprising encoded audio information corresponding to a sequence of said input samples that includes an early start sample, a nominal start sample, and a number of subsequent samples equal to the integer portion of a quotient, said quotient equal to said audio sample rate divided by said input frame rate, wherein said early start sample is the first sample in said sequence of input samples and said nominal start sample is substantially aligned with said respective input frame, and
means for generating an output signal arranged in output frames grouped into output superframes, each output superframe comprising a number of said output frames equal to said first number, a respective output frame comprising a respective audio frame and a label for said respective audio frame, wherein said label is unique for each audio frame in a respective output superframe.
Dependent claims: 10, 11, 12
13. A device for signal processing comprising:
means for receiving an input signal arranged in input frames grouped into complete and partial input superframes, each complete input superframe having a number of said input frames equal to a first number that is greater than one and each partial input superframe having a lesser number of said input frames, each input frame comprising an audio frame representing encoded audio information at an input frame rate and a label associated with said audio frame, wherein said label is unique for each audio frame in a respective complete or partial input superframe,
means for deriving sequences of samples from said audio frames, wherein a respective sequence of samples is derived from a respective audio frame and comprises an early start sample, a nominal start sample, and a number of subsequent samples equal to a second number, wherein said sequence of samples represents audio information at an audio sample rate and said second number is equal to the integer portion of a quotient, said quotient equal to said audio sample rate divided by said input frame rate,
means for obtaining from each sequence of samples a respective subsequence of samples, wherein, in response to the label associated with the audio frame from which a respective sequence of samples is derived, the corresponding subsequence comprises a third number of samples and starts at either the early start sample, the nominal start sample, or the sample following the nominal start sample, wherein said third number is equal to either the second number or one plus the second number, and
means for generating an output signal from an arrangement of the subsequences in which the start of each subsequence and the start of the immediately preceding subsequence are separated by said third number of samples of said preceding subsequence.
Dependent claims: 14, 15, 16
Specification