Synchronization of audio and video data in a wireless communication system
Abstract
Techniques are described for encoding an audio video stream that is transmitted over a network, for example a wireless or IP network, such that an entire frame of audio and an entire frame of video are transmitted simultaneously within a period required to render the audio video stream frames by an application in a receiver. Aspects of the techniques include receiving audio and video RTP streams and assigning an entire frame of RTP video data to communication channel packets that occupy the same period, or less, as the video frame rate. Also an entire frame of RTP audio data is assigned to communication channel packets that occupy the same period, or less, as the audio frame rate. The video and audio communication channel packets are transmitted simultaneously. Receiving and assigning RTP streams can be performed in a remote station, or a base station.
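The timing constraint in the abstract (an entire frame carried by channel packets that occupy no more than one frame period) can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the patented implementation: the function names `packetize_frame`, `min_rate_bps`, and `fits_in_period` are hypothetical, packet header overhead is ignored, and the channel is modeled as a single bit rate.

```python
from typing import List


def packetize_frame(frame: bytes, payload_bytes: int) -> List[bytes]:
    """Split one encoded frame into channel-packet payloads (hypothetical helper)."""
    if payload_bytes <= 0:
        raise ValueError("payload_bytes must be positive")
    return [frame[i:i + payload_bytes] for i in range(0, len(frame), payload_bytes)]


def min_rate_bps(frame_bytes: int, frame_period_ms: int) -> int:
    """Smallest channel rate (bits/s) that carries the frame within one frame period.

    Assumption: no header overhead; integer ceiling division avoids float rounding.
    """
    bits_x1000 = frame_bytes * 8 * 1000
    return -(-bits_x1000 // frame_period_ms)


def fits_in_period(frame_bytes: int, rate_bps: int, frame_period_ms: int) -> bool:
    """True if the block of packets occupies the same period as, or less than, the frame period."""
    return frame_bytes * 8 * 1000 <= rate_bps * frame_period_ms


if __name__ == "__main__":
    frame = bytes(1000)                      # a 1000-byte encoded video frame
    packets = packetize_frame(frame, 300)    # payloads of up to 300 bytes each
    print(len(packets), min_rate_bps(len(frame), 100))
```

In this model, a larger frame simply raises the rate returned by `min_rate_bps`, which mirrors the idea of varying the channel capacity so that each frame still fits within its period.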
106 Citations
27 Claims
1. A data stream synchronizer, comprising:
a communication channel interface configured to receive a plurality of communication channel packets over a variable capacity communication channel via a wireless communication network, wherein the plurality of communication channel packets includes:
first blocks of communication channel packets, where each block in the first blocks of communication channel packets corresponds to a respective video frame that is encoded into an encoded video data stream, wherein the encoded video data stream is encoded from video frames of varying sizes, and wherein each block in the first blocks of communication channel packets occupies a period that is the same or less than a video frame period of the encoded video data stream irrespective of a size of the block based on a channel capacity of the variable capacity communication channel being dynamically varied to accommodate the size of the block; and
second blocks of communication channel packets, where each block in the second blocks of communication channel packets corresponds to a respective audio frame that is encoded into an encoded audio data stream;
a first decoder coupled to the communication channel interface to receive the first blocks of communication channel packets corresponding to the encoded video data stream and to output a decoded video data stream;
a second decoder coupled to the communication channel interface to receive the second blocks of communication channel packets corresponding to the encoded audio data stream and to output a decoded audio data stream;
a first buffer configured to accumulate the decoded video data stream and to output one frame of the decoded video data stream each video frame period;
a second buffer configured to accumulate the decoded audio data stream and to output one frame of the decoded audio data stream each audio frame period; and
a combiner configured to receive the one frame of the decoded video data stream and the one frame of the decoded audio data stream and to output a synchronized frame of audio/video data every video frame period, wherein the output synchronized frame of audio/video data includes only one frame of audio data per video frame period.
Dependent claims: 2, 3, 4, 5, 6, 7
8. A remote station apparatus, comprising:
a communication channel interface configured to receive a plurality of communication channel packets over a variable capacity communication channel via a wireless communication network, wherein the plurality of communication channel packets includes:
first blocks of communication channel packets, where each block in the first blocks of communication channel packets corresponds to a respective video frame that is encoded into an encoded video data, wherein the encoded video data is encoded from video frames of varying sizes, and wherein each block in the first blocks of communication channel packets occupies a period that is the same or less than a video frame period of the encoded video data irrespective of a size of the block based on a channel capacity of the variable capacity communication channel being dynamically varied to accommodate the size of the block; and
second blocks of communication channel packets, where each block in the second blocks of communication channel packets corresponds to a respective audio frame that is encoded into an encoded audio data;
a video decoder coupled to the communication channel interface to receive the first blocks of communication channel packets corresponding to the encoded video data and to output decoded video data;
an audio decoder coupled to the communication channel interface to receive the second blocks of communication channel packets corresponding to the encoded audio data and to output decoded audio data;
a video buffer configured to accumulate the decoded video data for at least one video frame period and to output one frame of the decoded video data each video frame period;
an audio buffer configured to accumulate the decoded audio data for multiple audio frame periods and to output one frame of the decoded audio data each audio frame period; and
a combiner configured to receive the one frame of the decoded video data and the one frame of the decoded audio data and configured to output a synchronized frame of decoded audio/video data every video frame period, wherein the output synchronized frame of decoded audio/video data includes only one frame of audio data per video frame period.
Dependent claims: 9, 10, 11
12. A base station apparatus, comprising:
a communication channel interface configured to receive a plurality of communication channel packets over a variable capacity communication channel via a wireless communication network, wherein the plurality of communication channel packets includes:
first blocks of communication channel packets, where each block in the first blocks of communication channel packets corresponds to a respective video frame that is encoded into an encoded video data, wherein the encoded video data is encoded from video frames of varying sizes, and wherein each block in the first blocks of communication channel packets occupies a period that is the same or less than a video frame period of the encoded video data irrespective of a size of the block based on a channel capacity of the variable capacity communication channel being dynamically varied to accommodate the size of the block; and
second blocks of communication channel packets, where each block in the second blocks of communication channel packets corresponds to a respective audio frame that is encoded into an encoded audio data;
a video decoder coupled to the communication channel interface to receive the first blocks of communication channel packets corresponding to the encoded video data and to output decoded video data;
an audio decoder coupled to the communication channel interface to receive the second blocks of communication channel packets corresponding to the encoded audio data via the wireless communication network and to output decoded audio data;
a video buffer configured to accumulate the decoded video data for a video frame period and to output one frame of the decoded video data during the video frame period;
an audio buffer configured to accumulate the decoded audio data for an audio frame period and to output one frame of the decoded audio data during the audio frame period; and
a combiner configured to receive the one frame of the decoded video data and the one frame of the decoded audio data and to output a synchronized frame of audio/video data every video frame period, wherein the output synchronized frame of audio/video data includes only one frame of audio data per video frame period.
Dependent claims: 13, 14, 15
16. A method for decoding and synchronizing data streams, comprising:
receiving a plurality of communication channel packets over a variable capacity communication channel via a wireless communication network, wherein the plurality of communication channel packets includes:
first blocks of communication channel packets, where each block in the first blocks of communication channel packets corresponds to a respective video frame that is encoded into an encoded video data stream, wherein the encoded video data stream is encoded from video frames of varying sizes, and wherein each block in the first blocks of communication channel packets occupies a period that is the same or less than a video frame period of the encoded video data stream irrespective of a size of the block based on a channel capacity of the variable capacity communication channel being dynamically varied to accommodate the size of the block; and
second blocks of communication channel packets, where each block in the second blocks of communication channel packets corresponds to a respective audio frame that is encoded into an encoded audio data stream;
decoding the encoded video data stream, and outputting a decoded video data stream;
decoding the encoded audio data stream, and outputting a decoded audio data stream;
accumulating the decoded video data stream and outputting one frame of the decoded video data stream each video frame period;
accumulating the decoded audio data stream and outputting one frame of the decoded audio data stream each audio frame period; and
combining the one frame of the decoded video data stream with the one frame of the decoded audio data stream and outputting a synchronized frame of audio/video data every video frame period, wherein the output synchronized frame of audio/video data includes only one frame of audio data per video frame period.
17. A method for decoding and synchronizing audio and video data, comprising:
receiving a plurality of communication channel packets over a variable capacity communication channel via a wireless communication network, wherein the plurality of communication channel packets includes:
first blocks of communication channel packets, where each block in the first blocks of communication channel packets corresponds to a respective video frame that is encoded into an encoded video data, wherein the encoded video data is encoded from video frames of varying sizes, and wherein each block in the first blocks of communication channel packets occupies a period that is the same or less than a video frame period of the encoded video data irrespective of a size of the block based on a channel capacity of the variable capacity communication channel being dynamically varied to accommodate the size of the block; and
second blocks of communication channel packets, where each block in the second blocks of communication channel packets corresponds to a respective audio frame that is encoded into an encoded audio data;
outputting decoded video data in response to the encoded video data;
outputting decoded audio data in response to the encoded audio data;
accumulating the decoded video data for a video frame period and outputting one frame of the decoded video data each video frame period;
accumulating the decoded audio data for an audio frame period and outputting one frame of the decoded audio data each audio frame period; and
combining the one frame of the decoded video data with the one frame of the decoded audio data and outputting a synchronized frame of decoded audio/video data every video frame period, wherein the output synchronized frame of decoded audio/video data includes only one frame of audio data per video frame period.
18. A non-transitory computer-readable media, comprising instructions stored thereon that, if executed by a processor, cause the processor to control execution of a method for decoding and synchronizing data streams, the method comprising:
receiving a plurality of communication channel packets over a variable capacity communication channel via a wireless communication network, wherein the plurality of communication channel packets includes:
first blocks of communication channel packets, where each block in the first blocks of communication channel packets corresponds to a respective video frame that is encoded into an encoded video data stream, wherein the encoded video data stream is encoded from video frames of varying sizes, and wherein each block in the first blocks of communication channel packets occupies a period that is the same or less than a video frame period of the encoded video data stream irrespective of a size of the block based on a channel capacity of the variable capacity communication channel being dynamically varied to accommodate the size of the block; and
second blocks of communication channel packets, where each block in the second blocks of communication channel packets corresponds to a respective audio frame that is encoded into an encoded audio data stream;
decoding the encoded video data stream, and outputting a decoded video data stream;
decoding the encoded audio data stream, and outputting a decoded audio data stream;
accumulating the decoded video data stream and outputting one frame of the decoded video data stream each video frame period;
accumulating the decoded audio data stream and outputting one frame of the decoded audio data stream each audio frame period; and
combining the one frame of the decoded video data stream with the one frame of the decoded audio data stream and outputting a synchronized frame of audio/video data every video frame period, wherein the output synchronized frame of audio/video data includes only one frame of audio data per video frame period.
Dependent claims: 19
20. A non-transitory computer-readable media, comprising instructions stored thereon that, if executed by a processor, cause the processor to control execution of a method for decoding and synchronizing audio and video data, the method comprising:
receiving a plurality of communication channel packets over a variable capacity communication channel via a wireless communication network, wherein the plurality of communication channel packets includes:
first blocks of communication channel packets, where each block in the first blocks of communication channel packets corresponds to a respective video frame that is encoded into an encoded video data stream, wherein the encoded video data stream is encoded from video frames of varying sizes, and wherein each block in the first blocks of communication channel packets occupies a period that is the same or less than a video frame period of the encoded video data stream irrespective of a size of the block based on a channel capacity of the variable capacity communication channel being dynamically varied to accommodate the size of the block; and
second blocks of communication channel packets, where each block in the second blocks of communication channel packets corresponds to a respective audio frame that is encoded into an encoded audio data stream; and
outputting decoded video data in response to the encoded video data stream;
receiving encoded audio data via the wireless communication network and outputting decoded audio data;
accumulating the decoded video data for a video frame period and outputting one frame of the decoded video data each video frame period;
accumulating the decoded audio data for an audio frame period and outputting one frame of the decoded audio data each audio frame period; and
combining the one frame of the decoded video data with the one frame of the decoded audio data and outputting a synchronized frame of decoded audio/video data every video frame period, wherein the output synchronized frame of decoded audio/video data includes only one frame of audio data per video frame period.
Dependent claims: 21
22. A data stream synchronizer, comprising:
means for receiving a plurality of communication channel packets over a variable capacity communication channel via a wireless communication network, wherein the plurality of communication channel packets includes:
first blocks of communication channel packets, where each block in the first blocks of communication channel packets corresponds to a respective video frame that is encoded into an encoded video data stream, wherein the encoded video data stream is encoded from video frames of varying sizes, and wherein each block in the first blocks of communication channel packets occupies a period that is the same or less than a video frame period of the encoded video data stream irrespective of a size of the block based on a channel capacity of the variable capacity communication channel being dynamically varied to accommodate the size of the block; and
second blocks of communication channel packets, where each block in the second blocks of communication channel packets corresponds to a respective audio frame that is encoded into an encoded audio data stream;
means for decoding the encoded video data stream and to output a decoded video data stream;
means for decoding the encoded audio data stream and to output a decoded audio data stream;
means for accumulating the decoded video data stream and to output one frame of the decoded video data stream each video frame period;
means for accumulating the decoded audio data stream and to output one frame of the decoded audio data stream each audio frame period;
means for buffering the frames of the decoded audio and video data streams; and
means for combining the one frame of the decoded video data stream with the one frame of the decoded audio data stream and to output a synchronized frame of audio/video data every video frame period, wherein the output synchronized frame of audio/video data includes only one frame of audio data per video frame period.
23. A remote station apparatus, comprising:
means for receiving a plurality of communication channel packets over a variable capacity communication channel via a wireless communication network, wherein the plurality of communication channel packets includes:
first blocks of communication channel packets, where each block in the first blocks of communication channel packets corresponds to a respective video frame that is encoded into an encoded video data, wherein the encoded video data is encoded from video frames of varying sizes, and wherein each block in the first blocks of communication channel packets occupies a period that is the same or less than a video frame period of the encoded video data irrespective of a size of the block based on a channel capacity of the variable capacity communication channel being dynamically varied to accommodate the size of the block; and
second blocks of communication channel packets, where each block in the second blocks of communication channel packets corresponds to a respective audio frame that is encoded into an encoded audio data;
means for outputting decoded video data in response to the encoded video data;
means for outputting decoded audio data in response to the encoded audio data;
means for accumulating the decoded video data for a video frame period and outputting one frame of the decoded video data each video frame period;
means for accumulating the decoded audio data for an audio frame period and outputting one frame of the decoded audio data each audio frame period;
means for buffering the frames of the decoded audio and video data; and
means for combining the one frame of the decoded video data with the one frame of the decoded audio data and outputting a synchronized frame of audio/video data every video frame period, wherein the output synchronized frame of audio/video data includes only one frame of audio data per video frame period.
24. A base station apparatus, comprising:
means for receiving a plurality of communication channel packets over a variable capacity communication channel via a wireless communication network, wherein the plurality of communication channel packets includes:
first blocks of communication channel packets, where each block in the first blocks of communication channel packets corresponds to a respective video frame that is encoded into an encoded video data, wherein the encoded video data is encoded from video frames of varying sizes, and wherein each block in the first blocks of communication channel packets occupies a period that is the same or less than a video frame period of the encoded video data irrespective of a size of the block based on a channel capacity of the variable capacity communication channel being dynamically varied to accommodate the size of the block; and
second blocks of communication channel packets, where each block in the second blocks of communication channel packets corresponds to a respective audio frame that is encoded into an encoded audio data;
means for outputting decoded video data in response to the encoded video data;
means for outputting decoded audio data in response to the encoded audio data;
means for accumulating the decoded video data for a video frame period and outputting one frame of the decoded video data each video frame period;
means for accumulating the decoded audio data for an audio frame period and outputting one frame of the decoded audio data each audio frame period;
means for buffering the decoded audio and video data; and
means for combining the one frame of the decoded video data with the one frame of the decoded audio data and outputting a synchronized frame of audio/video data every video frame period, wherein the output synchronized frame of audio/video data includes only one frame of audio data per video frame period.
25. A method for decoding and synchronizing data streams, comprising:
receiving a plurality of communication channel packets over a variable capacity communication channel via a wireless communication network, wherein the plurality of communication channel packets includes:
first blocks of communication channel packets, where each block in the first blocks of communication channel packets corresponds to a respective video frame that is encoded into an encoded video data stream, wherein the encoded video data stream is encoded from video frames of varying sizes, and wherein each block in the first blocks of communication channel packets occupies a period that is the same or less than a video frame period of the encoded video data stream irrespective of a size of the block based on a channel capacity of the variable capacity communication channel being dynamically varied to accommodate the size of the block; and
second blocks of communication channel packets, where each block in the second blocks of communication channel packets corresponds to a respective audio frame that is encoded into an encoded audio data stream;
decoding the encoded video data stream into a decoded video data stream;
decoding an encoded audio data stream received via the wireless communication network into a decoded audio data stream;
accumulating the decoded video data stream and outputting one frame of the decoded video data stream each video frame period;
accumulating the decoded audio data stream and outputting one frame of the decoded audio data stream each audio frame period; and
combining the one frame of the decoded video data stream with the one frame of the decoded audio data stream and outputting a synchronized frame of audio/video data every video frame period, wherein the output synchronized frame of audio/video data includes only one frame of audio data per video frame period.
26. A non-transitory computer-readable media, comprising instructions stored thereon that, if executed by a processor, cause the processor to control execution of a method for decoding and synchronizing data streams, the method comprising:
receiving a plurality of communication channel packets over a variable capacity communication channel via a wireless communication network, wherein the plurality of communication channel packets includes:
first blocks of communication channel packets, where each block in the first blocks of communication channel packets corresponds to a respective video frame that is encoded into an encoded video data stream, wherein the encoded video data stream is encoded from video frames of varying sizes, and wherein each block in the first blocks of communication channel packets occupies a period that is the same or less than a video frame period of the encoded video data stream irrespective of a size of the block based on a channel capacity of the variable capacity communication channel being dynamically varied to accommodate the size of the block; and
second blocks of communication channel packets, where each block in the second blocks of communication channel packets corresponds to a respective audio frame that is encoded into an encoded audio data stream;
decoding the encoded video data stream into a decoded video data stream;
decoding the encoded audio data stream into a decoded audio data stream;
accumulating the decoded video data stream and outputting one frame of the decoded video data stream each video frame period;
accumulating the decoded audio data stream and outputting one frame of the decoded audio data stream each audio frame period; and
combining the one frame of the decoded video data stream with the one frame of the decoded audio data stream and outputting a synchronized frame of audio/video data every video frame period, wherein the output synchronized frame of audio/video data includes only one frame of audio data per video frame period.
27. A data stream synchronizer, comprising:
means for receiving a plurality of communication channel packets over a variable capacity communication channel via a wireless communication network, wherein the plurality of communication channel packets includes:
first blocks of communication channel packets, where each block in the first blocks of communication channel packets corresponds to a respective video frame that is encoded into an encoded video data stream, wherein the encoded video data stream is encoded from video frames of varying sizes, and wherein each block in the first blocks of communication channel packets occupies a period that is the same or less than a video frame period of the encoded video data stream irrespective of a size of the block based on a channel capacity of the variable capacity communication channel being dynamically varied to accommodate the size of the block; and
second blocks of communication channel packets, where each block in the second blocks of communication channel packets corresponds to a respective audio frame that is encoded into an encoded audio data stream;
means for decoding the encoded video data stream into a decoded video data stream;
means for decoding the encoded audio data stream into a decoded audio data stream;
means for accumulating the decoded video data stream and outputting one frame of the decoded video data stream each video frame period;
means for accumulating the decoded audio data stream and outputting one frame of the decoded audio data stream each audio frame period;
means for buffering the one frame of the decoded video data stream and the one frame of the decoded audio data stream, wherein the buffering means are sized at least partially based on a maximum delay experienced during transmission of the encoded video and audio data streams; and
means for combining the one frame of the decoded video data stream with the one frame of the decoded audio data stream and for outputting a synchronized frame of audio/video data every video frame period, wherein the output synchronized frame of audio/video data includes only one frame of audio data per video frame period.
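The buffer and combiner elements recited across the claims above can be pictured with a minimal Python sketch. It is an illustration under stated assumptions rather than the claimed implementation: the names `FrameBuffer`, `AVFrame`, `combine_tick`, and `buffer_depth_frames` are hypothetical, one call to `combine_tick` stands for one video frame period, the audio frame period is assumed equal to the video frame period, and the sizing rule (maximum delay divided by frame period, rounded up) is just one plausible reading of a buffer "sized at least partially based on a maximum delay".

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional


def buffer_depth_frames(max_delay_ms: int, frame_period_ms: int) -> int:
    """Frames of buffering needed to cover max_delay_ms (assumed sizing rule)."""
    # Integer ceiling division; always keep room for at least one frame.
    return max(1, -(-max_delay_ms // frame_period_ms))


class FrameBuffer:
    """Accumulates decoded frames and releases them one at a time, in order."""

    def __init__(self) -> None:
        self._frames: Deque[bytes] = deque()

    def push(self, frame: bytes) -> None:
        self._frames.append(frame)

    def pop_frame(self) -> bytes:
        return self._frames.popleft()

    def __len__(self) -> int:
        return len(self._frames)


@dataclass(frozen=True)
class AVFrame:
    """One synchronized output: exactly one video frame and one audio frame."""
    video: bytes
    audio: bytes


def combine_tick(video: FrameBuffer, audio: FrameBuffer) -> Optional[AVFrame]:
    """Run once per video frame period; emit one synchronized frame if both are ready."""
    if len(video) == 0 or len(audio) == 0:
        return None  # nothing is consumed until both buffers hold a frame
    return AVFrame(video=video.pop_frame(), audio=audio.pop_frame())


if __name__ == "__main__":
    v, a = FrameBuffer(), FrameBuffer()
    v.push(b"v0")
    a.push(b"a0")
    print(combine_tick(v, a), buffer_depth_frames(250, 100))
```

Returning `None` when either buffer is empty, without consuming from the other, keeps the two streams paired frame for frame; a real receiver would also need a policy for late or lost frames, which this sketch does not attempt.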
Specification