System and method for cross-fading between audio streams
First Claim
1. A method comprising:
receiving, within a first stream via a network communication link, first audio data generated by sampling of a common audio signal of an audio signal at a first sampling rate for a first time period;
receiving thereafter, based at least in part upon a change in a bandwidth capability of the network communication link, second audio data within a second stream generated by sampling of said audio source at a second sampling rate different than said first sampling rate for a second time period, the first and second audio data corresponding to different, but overlapping, portions of the common audio signal;
generating a plurality of samples by normalizing a portion of said first audio data to said second sampling rate, said portion of said first audio data being normalized at least in part corresponding to the overlapping portion of said common audio signal sampled at said first sampling rate;
cross-fading and combining pairs of samples, each pair substantially corresponding to a playback time, one sample of each pair being selected from one of said plurality of samples, the other sample of each pair being selected from a portion of said second audio data, said portion of said second audio data being selected at least in part corresponding to said overlapping portion of said common audio signal sampled at said second sampling rate; and
rendering said cross-faded and combined samples.
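The "normalizing" step of claim 1 brings the first audio data to the second sampling rate so that samples from both streams can be paired by playback time. The claim does not say how the conversion is done. The following is a minimal sketch that assumes plain linear interpolation; the function name `resample_linear` and its parameters are hypothetical, and a production sample-rate converter would more likely use a band-limited filter.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample `samples` from src_rate to dst_rate (both in Hz).

    Assumption: linear interpolation between neighbouring source samples.
    This is an illustrative stand-in, not the converter the patent uses.
    """
    if not samples:
        return []
    duration = len(samples) / src_rate           # seconds of audio covered
    out_len = int(round(duration * dst_rate))    # samples needed at the new rate
    out = []
    for n in range(out_len):
        pos = n * src_rate / dst_rate            # position in source-sample units
        i = int(pos)
        frac = pos - i
        if i + 1 < len(samples):
            # Interpolate between the two source samples around `pos`.
            out.append(samples[i] + frac * (samples[i + 1] - samples[i]))
        else:
            # Past the last source sample: hold its value.
            out.append(samples[-1])
    return out
```

Once both streams are at the same rate, sample k of each buffer corresponds to the same playback time offset k / rate, which is what allows the pairing described in the cross-fading step.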
Abstract
A system and method of the present invention cross-fade a first transmitted audio stream to a second transmitted audio stream, wherein both first and second audio streams represent the same original audio signal, but at different quality levels. A client computer receives timestamped packets of compressed encoded audio data from the first audio stream, decodes that data and resamples it to a highest sampling rate supported by playback equipment such as a sound card. A server computer responds to a change in available bandwidth, by transmitting timestamped packets of the second audio stream which correspond to a playback time earlier than that of the final transmitted packet of the first audio stream. The client computer buffers in a first buffer the decoded resampled samples from the final packets of the first audio stream, which represent a playback time period t1. The client computer then buffers in a second buffer decoded resampled samples from the initial packets of the second audio stream representing a playback time period t2. A cross-fade overlap window is defined by a time period t3 over which t1 and t2 overlap. A cross-fader cross-fades sample pairs drawn from both buffers, each pair corresponding to a playback time in the cross-fade overlap window. A cross-fade table holds a predetermined number of values decreasing from 1 to 0, which values approximate a cross-fade curve. The cross-fader applies a weight value to each sample pair, the weight value calculated by applying linear interpolation across adjacent values in the cross-fade table, by multiplying a sample from the first audio stream by the weight value, and by multiplying a time-corresponding sample from the second audio stream by one minus the weight value. The resulting contributions from both samples are combined and sent to audio reproduction equipment.
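The weighting scheme in the abstract can be sketched as follows. The abstract specifies only that the table holds a predetermined number of values decreasing from 1 to 0, that the weight is obtained by linear interpolation across adjacent table values, and that the old-stream sample is multiplied by the weight and the new-stream sample by one minus the weight. The table size of 17 and the raised-cosine curve shape below are assumptions made for illustration, as are all function names.

```python
import math

def make_fade_table(size=17):
    """Hypothetical cross-fade table: `size` values falling from 1.0 to 0.0.

    Assumption: a raised-cosine shape; the source does not name the curve.
    """
    return [0.5 * (1.0 + math.cos(math.pi * i / (size - 1))) for i in range(size)]

def fade_weight(table, position):
    """Weight for a position in [0, 1], linearly interpolated between
    the two adjacent table entries."""
    scaled = position * (len(table) - 1)
    index = min(int(scaled), len(table) - 2)
    frac = scaled - index
    return table[index] + frac * (table[index + 1] - table[index])

def cross_fade(old_samples, new_samples, table):
    """Combine time-aligned sample pairs from the old and new streams."""
    if len(old_samples) != len(new_samples):
        raise ValueError("buffers must cover the same overlap window")
    n = len(old_samples)
    out = []
    for k, (old, new) in enumerate(zip(old_samples, new_samples)):
        position = k / (n - 1) if n > 1 else 1.0
        w = fade_weight(table, position)
        # Old stream weighted by w, new stream by (1 - w), then summed.
        out.append(w * old + (1.0 - w) * new)
    return out
```

Because the two weights always sum to one, cross-fading two identical buffers returns the input unchanged, which is a convenient sanity check for any table.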
79 Citations
27 Claims
1. A method comprising:
receiving, within a first stream via a network communication link, first audio data generated by sampling of a common audio signal of an audio signal at a first sampling rate for a first time period;
receiving thereafter, based at least in part upon a change in a bandwidth capability of the network communication link, second audio data within a second stream generated by sampling of said audio source at a second sampling rate different than said first sampling rate for a second time period, the first and second audio data corresponding to different, but overlapping, portions of the common audio signal;
generating a plurality of samples by normalizing a portion of said first audio data to said second sampling rate, said portion of said first audio data being normalized at least in part corresponding to the overlapping portion of said common audio signal sampled at said first sampling rate;
cross-fading and combining pairs of samples, each pair substantially corresponding to a playback time, one sample of each pair being selected from one of said plurality of samples, the other sample of each pair being selected from a portion of said second audio data, said portion of said second audio data being selected at least in part corresponding to said overlapping portion of said common audio signal sampled at said second sampling rate; and
rendering said cross-faded and combined samples.
Dependent claims: 2, 3
4. A method comprising:
receiving in a receive buffer via a network communication link first audio data of a first data stream, the first audio data representing a time period t1 and sampled at a first target sampling rate of an original audio signal;
decoding said first audio data and re-sampling the decoded first audio data to generate first audio samples as the first audio data are received, initially into an audio output buffer, and in response to an indication of a change in a data capacity of the network communication link, into an old stream buffer instead;
receiving thereafter in said receive buffer, second audio data from the second data stream representing a time period t2 of said original audio signal and sampled at a second target sampling rate different from said first target sampling rate, said time period t1 and t2 overlapping by a time period t3 in said original audio signal;
decoding said second audio data and re-sampling the decoded second audio data to generate second audio samples as the second audio data are received, initially into a new stream buffer, for at least a portion of the time period t3;
cross-fading each sample pair comprising corresponding sample pairs corresponding to a time within said at least a portion of the time period t3 from said old and new stream buffers, by applying a first cross-fade weight to a first sample of said sample pair to obtain a first contribution, a second cross-fade weight to a second sample of said sample pair to obtain a second contribution, and by combining said first and second contributions, to successively generate a plurality of cross-faded combined samples; and
outputting successively the generated cross-faded combined samples into the audio output buffer.
Dependent claims: 5, 6, 7
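Claim 4 relies on an overlap period t3 shared by t1 and t2, with sample pairs drawn from the old and new stream buffers at matching times. A minimal sketch of that alignment is given below. It assumes that each buffer's start time is known in seconds (for example from packet timestamps) and that both buffers have already been re-sampled to one common rate; the function names and the tuple representation of time periods are hypothetical.

```python
def overlap_window(t1, t2):
    """Return the overlap (start, end) of two (start, end) periods in seconds,
    or None when they do not overlap. This plays the role of t3."""
    start = max(t1[0], t2[0])
    end = min(t1[1], t2[1])
    if end <= start:
        return None
    return (start, end)

def paired_samples(old_buf, old_start, new_buf, new_start, rate, window):
    """Pair samples from both buffers that fall inside `window`.

    Assumption: both buffers hold samples at the same `rate` (Hz) and
    `old_start` / `new_start` give the playback time of their first sample.
    """
    w_start, w_end = window
    count = int(round((w_end - w_start) * rate))
    old_off = int(round((w_start - old_start) * rate))   # skip samples before t3
    new_off = int(round((w_start - new_start) * rate))
    return list(zip(old_buf[old_off:old_off + count],
                    new_buf[new_off:new_off + count]))
```

The resulting pairs are what a cross-fader would consume, one pair per playback instant within t3.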
8. A system comprising:
a receive buffer to successively receive and store a first and a second stream of an audio signal transmitted via a network, the second stream being received based at least in part upon a change in conditions of a network, the first and second streams respectively correspond to first and second portions of the audio signal, and the first and second portions audio signal overlap;
a first and a second decoder coupled with the receive buffer to respectively decode the first and second received streams;
a sample-rate converter coupled with the first and second decoders to resample the decoded first and second received streams adapted to generate a first and second plurality of digital samples respectively;
an old stream buffer coupled with the sample-rate converter to receive the first digital samples, after an initial time period, and at substantially a beginning of the receipt of a portion of the first stream corresponding to the overlap of the first and second portions of the audio signal;
a new stream buffer coupled with the sample-rate converter to receive the second digital samples for at least a portion of the second stream corresponding to the overlap of the first and second portions of the audio signal;
a cross-fader coupled with the old and new stream buffers to cross-fade and combine corresponding ones of the first and second digital samples; and
a renderer to render the cross-faded and combined digital samples.
Dependent claims: 9, 10, 11
12. A method comprising:
receiving, via a network communication link, first audio data within a first stream of an audio signal for a first time period;
receiving thereafter, via the network communication link, second audio data within a second stream of the audio signal for a second time period, said second audio data being received in response to a change in bandwidth capability of the network communication link, and having a common portion of the audio signal that is also a part of the first audio data;
decoding said first and second audio data as they are received;
generating successively pairs of samples of said first and second audio data for at least a portion of the common portion of the audio signal that is a part of the first audio data and a part of the second audio data, each pair substantially corresponding to a playback time, one sample of each pair being selected from said first decoded audio data, said other sample of each pair being selected from said second decoded audio data;
cross-fading and combining successively said successively generated pairs of samples; and
successively rendering the cross-faded and combined samples.
Dependent claims: 13, 14, 15, 16
17. A computer readable media having a set of instructions adapted to enable a processing system to practice a method including:
receiving via a communication link first audio data within a first audio stream of an audio signal for a first time period;
receiving thereafter, a second audio data within a second stream of the audio signal, via the communication link, for a second time period, in response to a change in bandwidth capability of the network communication link;
decoding said first and second audio data, the first and second data, both having a common portion of the audio signal; and
generating pairs of samples of said first and second audio data for at least a portion of the common portion of the audio signal, each pair substantially corresponding to a playback time, one sample of each pair being selected from a portion of said first decoded audio data, said other sample of each pair being selected from a portion of said second decoded audio data;
cross-fading to combine said pairs of samples; and
rendering the cross-faded and combined samples.
Dependent claims: 18, 19
20. A method comprising:
streaming first audio data to a remote rendering client device for a first period of time, the first audio data having been generated by sampling a first portion of an audio signal at a first sampling rate;
detecting a change in operating condition; and
streaming second audio data to the remote rendering client device for a second period of time, the second audio data having been generated by sampling a second portion of the audio signal at a second sampling rate, the first and second portions of the audio signal having a common portion of the audio signal.
Dependent claims: 21, 22, 23
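On the server side, claim 20 requires the second portion to share a common portion with the first, and the abstract states that the second stream begins at a playback time earlier than that of the final transmitted packet of the first stream. One way to pick that starting point is sketched below. It assumes millisecond timestamps, a fixed packet duration shared by both streams, and packet boundaries aligned to multiples of that duration; none of these details come from the source, and the function name is hypothetical.

```python
def second_stream_start(last_sent_ts_ms, packet_ms, overlap_ms):
    """Timestamp (ms) of the first packet to send from the new stream.

    Assumptions: the last packet of the old stream starts at
    `last_sent_ts_ms` and lasts `packet_ms`; the new stream uses the same
    packet grid. The result is rounded down to a packet boundary so that
    the two streams overlap by at least `overlap_ms`.
    """
    old_end = last_sent_ts_ms + packet_ms          # end of audio already sent
    target = max(0, old_end - overlap_ms)          # latest start that still overlaps enough
    return (target // packet_ms) * packet_ms       # snap down to a packet boundary
```

Rounding down rather than up errs toward a slightly longer overlap, which gives the client more material to cross-fade at the cost of a little redundant transmission.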
24. An apparatus comprising:
streaming means for streaming audio data generated from sampling an audio signal at a sampling rate to a remote rendering client device; and
control means for first controlling the streaming means to stream first audio data to the remote rendering client device for a first period of time, the first audio data having been generated by sampling a first portion of the audio signal at a first sampling rate, and then on detecting a change in operating condition, controlling the streaming means to stream thereafter, second audio data to the remote rendering client device for a second period of time, the second audio data having been generated by sampling a second portion of the audio signal at a second sampling rate, the first and second portions of the audio signal having a common portion of the audio signal.
Dependent claims: 25, 26, 27
Specification