Server-assisted video conversation
Abstract
A method, computer program product, and system provide real-time, two-way audio-video conversation between mobile computing devices. Low latency is achieved by splitting the audio and video data streams from a given audio-video conversation using two different transport protocols to send the separate streams over a network, and re-syncing them at the other end. The transmission for each stream is tuned based on feedback data indicating available bandwidth of the network or other mobile computing device. A server offloads processing requirements that would otherwise be handled by the mobile computing device. The two-way conversation can be externally observed by web-based users. The system functions over a disparate set of mobile computing device endpoints and web-based endpoints, and over different wireless carrier network infrastructures.
36 Claims
1. A computer implemented method performed by a first mobile computing device for creating a two-way audio-video conversation between the first mobile computing device and a second mobile computing device, the method comprising:
recording an audio data stream;
encoding the audio data stream, the audio data stream comprising a stream of audio packets, each audio packet comprising an audio timestamp;
receiving a video data stream;
encoding the video data stream, the video data stream comprising a stream of video packets, each video packet comprising a video timestamp matching a corresponding audio timestamp and audio packet that was recorded concurrently with the video packet;
offloading to a centralized server processing of the audio data stream and video data stream for the two-way audio-video conversation between the first mobile computing device and the second mobile computing device by:
splitting the audio data stream and video data streams into separate data streams for communication to the centralized server;
transmitting the audio data stream over a first transport protocol to the centralized server; and
transmitting the video data stream separately from the audio data stream over a second transport protocol to the centralized server, the centralized server configured to re-encode the audio data stream and the video data stream based on characteristics of the second mobile computing device and deliver both the re-encoded audio data stream and the re-encoded video data stream to the second mobile computing device for synchronizing based on the respective audio and video timestamps.
Dependent claims: 2, 3, 4, 5, 6, 7

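As an illustration of the sender-side steps in claim 1, the following is a minimal sketch of stamping concurrently recorded audio and video packets with matching timestamps and then splitting them into two streams. The `Packet` type, the function names, and the 20 ms packet interval are assumptions made for this example and do not come from the patent; the actual transport of each stream is left out.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Packet:
    kind: str          # "audio" or "video"
    timestamp_ms: int  # capture time shared by concurrently recorded packets
    payload: bytes     # encoded media bytes (encoding itself is out of scope)


def stamp_concurrent(audio_chunks: Sequence[bytes],
                     video_frames: Sequence[bytes],
                     interval_ms: int = 20,
                     start_ms: int = 0) -> List[Packet]:
    """Give each audio chunk and the video frame recorded with it the same timestamp."""
    packets: List[Packet] = []
    for index, (audio, video) in enumerate(zip(audio_chunks, video_frames)):
        timestamp = start_ms + index * interval_ms  # hypothetical fixed capture interval
        packets.append(Packet("audio", timestamp, audio))
        packets.append(Packet("video", timestamp, video))
    return packets


def split_streams(packets: Sequence[Packet]) -> Tuple[List[Packet], List[Packet]]:
    """Separate the packets into an audio stream and a video stream."""
    audio = [p for p in packets if p.kind == "audio"]
    video = [p for p in packets if p.kind == "video"]
    return audio, video


if __name__ == "__main__":
    stamped = stamp_concurrent([b"a0", b"a1"], [b"v0", b"v1"])
    audio_stream, video_stream = split_streams(stamped)
    print([p.timestamp_ms for p in audio_stream])
    print([p.timestamp_ms for p in video_stream])
```

Each resulting list would then be handed to its own sender, one per transport protocol, so that the receiver can later re-pair packets by timestamp.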
8. A computer implemented method performed by a centralized server for creating a two-way audio-video conversation between a first mobile computing device and a second mobile computing device, the method comprising:
performing operations to handle processing tasks for real-time streaming of data for the two-way audio-video conversation at the centralized server on behalf of the first mobile computing device and the second mobile computing device including:
receiving an audio data stream over a first transport protocol from the first mobile computing device, the audio data stream encoded with a first audio codec;
receiving a video data stream communicated separately over a transmission control protocol from the first mobile computing device, the video data stream encoded with a first video codec, the transmission control protocol being different from the first transport protocol;
receiving codec data from the second mobile computing device, the codec data comprising a list of codecs installed on the second mobile computing device, the list of codecs comprising a second audio codec and a second video codec;
determining whether the list of codecs includes the first audio codec;
responsive to determining that the list of codecs does not include the first audio codec, transcoding the audio stream using the second audio codec;
determining whether the list of codecs includes the first video codec;
responsive to determining that the list of codecs does not include the first video codec, transcoding the video stream using the second video codec;
determining, by a bitrate adaptation module of the centralized server, whether to drop one or more frames from the received video data stream based, at least in part, on a bit rate limitation of the second mobile computing device and thereby cause lowering of processing requirements of the second mobile computing device relative to not dropping the one or more frames;
transmitting the audio data stream to the second mobile computing device over the first network protocol; and
transmitting the video data stream separately from the audio data stream to the second mobile computing device over the second network protocol responsive to determining whether to drop the one or more frames from the received video data stream.
Dependent claims: 9, 10

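The codec check and frame-drop decision recited in claim 8 can be sketched as follows. This is only one possible reading: the function names are hypothetical, and the budget-accumulating drop policy is an assumption chosen for the example, since the claim states only that the decision is based at least in part on a bit rate limitation of the receiving device.

```python
from typing import List, Optional, Sequence


def choose_transcode_target(source_codec: str,
                            installed_codecs: Sequence[str],
                            second_codec: str) -> Optional[str]:
    """Return None when the receiver already has the source codec,
    otherwise the installed codec to transcode into."""
    if source_codec in installed_codecs:
        return None  # no transcoding needed
    if second_codec not in installed_codecs:
        raise ValueError("second codec must be one the receiver reports as installed")
    return second_codec


def frames_to_keep(frame_sizes_bits: Sequence[int],
                   frame_rate: float,
                   bit_rate_limit: float) -> List[int]:
    """Assumed policy: each frame interval adds limit/frame_rate bits of budget;
    a frame is kept only if the accumulated budget covers its size."""
    per_frame_budget = bit_rate_limit / frame_rate
    available = 0.0
    kept: List[int] = []
    for index, size in enumerate(frame_sizes_bits):
        available += per_frame_budget
        if size <= available:
            kept.append(index)
            available -= size
    return kept


if __name__ == "__main__":
    print(choose_transcode_target("H.264", ["VP8", "AMR"], "VP8"))
    print(frames_to_keep([1000, 1000, 1000, 1000], frame_rate=10, bit_rate_limit=5000))
```

A real server would likely also account for inter-frame dependencies (for example, never dropping a key frame that later frames reference); that is omitted here for brevity.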
11. A computer implemented method performed by a centralized server for creating a two-way audio-video conversation between a first mobile computing device and a second mobile computing device, the method comprising:
performing operations for the two-way audio-video conversation at the centralized server on behalf of the first mobile computing device and the second mobile computing device including:
receiving an encoded audio data stream over a first transport protocol from the first mobile computing device, the encoded audio data stream comprising a stream of audio packets and an audio bit rate;
receiving an encoded video data stream communicated separately over a second transport protocol from the first mobile computing device, the encoded video data stream comprising a stream of video packets and a video bit rate;
receiving feedback data from the second mobile computing device, the feedback data comprising a network bandwidth and a processing bandwidth;
determining whether the sum of the audio bit rate and the video bit rate exceeds either the network bandwidth or the processing bandwidth;
responsive to determining that the sum of the audio bit rate and the video bit rate exceeds either the network bandwidth or the processing bandwidth, reducing, at the centralized server, the video bit rate of the encoded video data stream received from the first mobile computing device below the network bandwidth and the processing bandwidth;
transmitting the encoded audio data stream to the second mobile computing device over the first network protocol; and
transmitting the encoded video data stream to the second mobile computing device over the second network protocol.
Dependent claims: 12, 13, 14, 15

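The bandwidth comparison in claim 11 amounts to a small piece of arithmetic, sketched below. The function name and the 90 percent headroom factor are assumptions for illustration; the claim itself only requires that the video bit rate be reduced when the combined bit rate exceeds either the network bandwidth or the processing bandwidth.

```python
def adapt_video_bit_rate(audio_bps: int,
                         video_bps: int,
                         network_bps: int,
                         processing_bps: int,
                         headroom_percent: int = 90) -> int:
    """Return the video bit rate to use after applying receiver feedback.

    Exceeding "either" bandwidth is the same as exceeding the smaller one,
    so the tighter of the two limits drives the decision.
    """
    limit = min(network_bps, processing_bps)
    if audio_bps + video_bps <= limit:
        return video_bps  # combined rate fits; leave the video stream alone
    # Assumed policy: keep audio untouched and fit video under a safety margin.
    target = limit * headroom_percent // 100 - audio_bps
    return max(0, target)


if __name__ == "__main__":
    print(adapt_video_bit_rate(64_000, 500_000, 400_000, 1_000_000))
    print(adapt_video_bit_rate(64_000, 200_000, 400_000, 1_000_000))
```

Leaving the audio rate fixed and trimming only video mirrors the claim, which reduces the video bit rate specifically.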
16. A computer implemented method performed by a second mobile computing device for creating a two-way audio-video conversation between a first mobile computing device and the second mobile computing device, the method comprising:
receiving an audio data stream over a first transport protocol from a centralized server, the audio data stream comprising a stream of audio packets, each packet comprising an audio timestamp;
receiving a video data stream over a second transport protocol communicated separately from the centralized server, the video data stream comprising a stream of video packets, each packet comprising a video timestamp, the audio data stream and video data stream being maintained as split streams throughout communication from the first mobile computing device to the second mobile computing device and processed by the centralized server to offload processing operations for the two-way audio-video conversation to reduce latency;
buffering the audio and video data streams in a buffer;
synching the audio data stream with the video data stream, the synching comprising matching each audio timestamp and audio packet with a video timestamp and video packet; and
if a matching video timestamp is present in the buffer:
outputting the synched audio data stream through an audio subsystem, the audio subsystem of the second mobile computing device being configured to decode the synched audio data stream; and
outputting the synched video data stream through a video subsystem concurrently with outputting the synched audio data stream, the video subsystem of the second mobile computing device being configured to decode the synched video data stream.
Dependent claims: 17, 18

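The receiver-side buffering and timestamp matching of claim 16 can be illustrated with a small buffer that releases an audio packet and a video packet together only when both carry the same timestamp. The `SyncBuffer` class and its method names are hypothetical, and the handling of late or lost packets is deliberately left out of this sketch.

```python
from typing import Dict, List, Tuple


class SyncBuffer:
    """Holds packets from the two separately received streams, keyed by timestamp."""

    def __init__(self) -> None:
        self._audio: Dict[int, bytes] = {}
        self._video: Dict[int, bytes] = {}

    def push_audio(self, timestamp_ms: int, payload: bytes) -> None:
        self._audio[timestamp_ms] = payload

    def push_video(self, timestamp_ms: int, payload: bytes) -> None:
        self._video[timestamp_ms] = payload

    def pop_matched(self) -> List[Tuple[int, bytes, bytes]]:
        """Remove and return (timestamp, audio, video) for every timestamp
        present in both streams, in timestamp order."""
        matched = sorted(self._audio.keys() & self._video.keys())
        return [(ts, self._audio.pop(ts), self._video.pop(ts)) for ts in matched]


if __name__ == "__main__":
    buf = SyncBuffer()
    buf.push_audio(0, b"a0")
    buf.push_audio(20, b"a1")
    buf.push_video(20, b"v1")       # video for timestamp 0 has not arrived yet
    print(buf.pop_matched())        # only the pair at timestamp 20 is released
    buf.push_video(0, b"v0")
    print(buf.pop_matched())        # the pair at timestamp 0 is released once complete
```

Each returned pair would be passed to the audio and video subsystems for decoding and concurrent output.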
19. Apparatus at a first mobile computing device for creating a two-way audio-video conversation between the first mobile computing device and a second mobile computing device, the apparatus comprising:
an audio subsystem for recording an audio data stream and for encoding the audio data stream, the audio data stream comprising a stream of audio packets;
a video subsystem for receiving a video data stream and for encoding the video data stream, the video data stream comprising a stream of video packets;
components to maintain the audio data stream and video data streams for the two-way audio-video conversation as separate data streams and offload processing of the separate data streams to a centralized server for reduced latency, including:
a restore module for adding timestamps to the audio and video packets of the separate data streams, each audio packet comprising an audio timestamp and each video packet comprising a video timestamp matching a corresponding audio timestamp and audio packet that was recorded concurrently with the video packet;
an audio output for transmitting the audio data stream over a first transport protocol to the centralized server as one of said separate data streams; and
a video output for transmitting the video data stream separately from the audio data stream over a second transport protocol to the centralized server as another one of said separate data streams, the centralized server configured to re-encode the audio data stream or the video data stream based on characteristics of the second mobile computing device, process both the audio data stream and the video data stream as separate data streams, and deliver the separate data streams to the second mobile computing device for synchronizing based on the respective audio and video timestamps.
Dependent claims: 20, 21, 22, 23, 24

25. Apparatus at a centralized server for creating a two-way audio-video conversation between a first mobile computing device and a second mobile computing device, the apparatus comprising:
means for performing operations for the two-way audio-video conversation on behalf of the first mobile computing device and the second mobile computing device including:
means for receiving an audio data stream over a first transport protocol from the first mobile computing device, the audio data stream encoded with a first audio codec;
means for receiving a video data stream separately from the audio data stream over a transmission control protocol from the first mobile computing device, the video data stream encoded with a first video codec;
means for receiving codec data from the second mobile computing device, the codec data comprising a list of codecs installed on the second mobile computing device, the list of codecs comprising a second audio codec and a second video codec;
means for determining whether the list of codecs includes the first audio codec;
means for transcoding the audio stream using the second audio codec, responsive to determining that the list of codecs does not include the first audio codec;
means for determining whether the list of codecs includes the first video codec;
means for transcoding the video stream using the second video codec, responsive to determining that the list of codecs does not include the first video codec;
means for determining, by a bitrate adaptation module of the centralized server, whether to drop one or more frames from the received video data stream based, at least in part, on a bit rate limitation of the second mobile computing device and thereby cause lowering of processing requirements of the second mobile computing device relative to not dropping the one or more frames;
means for transmitting the audio data stream to the second mobile computing device over the first network protocol; and
means for transmitting the video data stream separately from the audio data stream to the second mobile computing device over the second network protocol responsive to determining whether to drop the one or more frames from the received video data stream.
Dependent claims: 26

27. Apparatus at a centralized server for creating a two-way audio-video conversation between a first mobile computing device and a second mobile computing device, the apparatus comprising:
means for performing operations for the two-way audio-video conversation at the centralized server on behalf of the first mobile computing device and the second mobile computing device including:
means for receiving an encoded audio data stream over a first transport protocol from the first mobile computing device, the encoded audio data stream comprising a stream of audio packets and an audio bit rate;
means for receiving an encoded video data stream communicated separately from the audio data stream over a second transport protocol from the first mobile computing device, the encoded video data stream comprising a stream of video packets and a video bit rate;
means for receiving feedback data from the second mobile computing device, the feedback data comprising a network bandwidth and a processing bandwidth;
means for determining whether the sum of the audio bit rate and the video bit rate exceeds either the network bandwidth or the processing bandwidth;
means for reducing, at the centralized server, the video bit rate of the encoded video data stream received from the first mobile computing device below the network bandwidth and the processing bandwidth, responsive to determining that the sum of the audio bit rate and the video bit rate exceeds either the network bandwidth or the processing bandwidth;
means for transmitting the encoded audio data stream to the second mobile computing device over the first network protocol; and
means for transmitting the encoded video data stream to the second mobile computing device over the second network protocol separately from the audio data stream.
Dependent claims: 28, 29, 30

31. Apparatus at a second mobile computing device for creating a two-way audio-video conversation between a first mobile computing device and the second mobile computing device, the apparatus comprising:
an audio input for receiving an audio data stream from a centralized server over a first transport protocol, the audio data stream comprising a stream of audio packets, each packet comprising an audio timestamp;
a video input for receiving a video data stream communicated separately from the centralized server over a second transport protocol, the video data stream comprising a stream of video packets, each packet comprising a video timestamp, the audio data stream and video data stream being received as split streams that are maintained as split streams throughout communication from the first mobile computing device to the second mobile computing device and processed by the centralized server to offload processing operations for the two-way audio-video conversation for reduced latency;
a restore module for buffering the audio and video data streams in a buffer and synching the audio data stream with the video data stream, the synching comprising matching each audio timestamp and audio packet with a video timestamp and video packet;
an audio subsystem for outputting the synched audio data stream if a matching video timestamp is present in the buffer, the audio subsystem of the second mobile computing device being configured to decode the synched audio data stream; and
a video subsystem for outputting the synched video data stream concurrently with outputting the synched audio data stream if a matching video timestamp is present in the buffer, the video subsystem of the second mobile computing device being configured to decode the synched video data stream.
Dependent claims: 32

33. A computer-readable storage device with an executable program stored thereon, wherein the program instructs a microprocessor to perform the following steps at a first mobile computing device for creating a two-way audio-video conversation between the first mobile computing device and a second mobile computing device, the steps comprising:
recording an audio data stream;
encoding the audio data stream, the audio data stream comprising a stream of audio packets, each audio packet comprising an audio timestamp;
receiving a video data stream;
encoding the video data stream, the video data stream comprising a stream of video packets, each video packet comprising a video timestamp matching a corresponding audio timestamp and audio packet that was recorded concurrently with the video packet;
offloading to a centralized server processing of the audio data stream and video data stream for the two-way audio-video conversation between the first mobile computing device and the second mobile computing device by:
splitting the audio data stream and video data streams into separate data streams for communication to the centralized server;
transmitting the audio data stream over a first transport protocol to the centralized server; and
transmitting the video data stream over a second transport protocol to the centralized server, the video data stream and audio data stream maintained as separate streams during transmission to and from the centralized server, the centralized server configured to re-encode the audio data stream or the video data stream based on characteristics of the second mobile computing device and deliver both the audio data stream and the video data stream to the second mobile computing device for synchronizing the audio data stream and the video data stream based on the respective audio and video timestamps.

34. A computer-readable storage device with an executable program stored thereon, wherein the program instructs a microprocessor to perform the following steps for creating a two-way audio-video conversation between a first mobile computing device and a second mobile computing device, the steps comprising:
performing operations for the two-way audio-video conversation at a centralized server on behalf of the first mobile computing device and the second mobile computing device including:
receiving an audio data stream over a first transport protocol from the first mobile computing device, the audio data stream encoded with a first audio codec;
receiving a video data stream over a transmission control protocol from the first mobile computing device, the video data stream encoded with a first video codec;
receiving codec data from the second mobile computing device, the codec data comprising a list of codecs installed on the second mobile computing device, the list of codecs comprising a second audio codec and a second video codec;
determining whether the list of codecs includes the first audio codec;
responsive to determining that the list of codecs does not include the first audio codec, transcoding the audio stream using the second audio codec;
determining whether the list of codecs includes the first video codec;
responsive to determining that the list of codecs does not include the first video codec, transcoding the video stream using the second video codec;
determining, by a bitrate adaptation module of the centralized server, whether to drop one or more frames from the received video data stream based, at least in part, on a bit rate limitation of the second mobile computing device and thereby cause lowering of processing requirements of the second mobile computing device relative to not dropping the one or more frames;
transmitting the audio data stream to the second mobile computing device over the first network protocol for processing by the centralized server; and
transmitting the video data stream to the second mobile computing device over the second network protocol for processing by the centralized server separately from the audio data stream responsive to determining whether to drop the one or more frames from the received video data stream.

35. A computer-readable storage device with an executable program stored thereon, wherein the program instructs a microprocessor of a centralized server to perform the following steps for creating a two-way audio-video conversation between a first mobile computing device and a second mobile computing device, the steps comprising:
performing operations for the two-way audio-video conversation at the centralized server on behalf of the first mobile computing device and the second mobile computing device including:
receiving an encoded audio data stream over a first transport protocol from the first mobile computing device, the encoded audio data stream comprising a stream of audio packets and an audio bit rate;
receiving an encoded video data stream over a second transport protocol from the first mobile computing device, the encoded video data stream comprising a stream of video packets and a video bit rate;
receiving feedback data from the second mobile computing device, the feedback data comprising a network bandwidth and a processing bandwidth;
determining whether the sum of the audio bit rate and the video bit rate exceeds either the network bandwidth or the processing bandwidth;
responsive to determining that the sum of the audio bit rate and the video bit rate exceeds either the network bandwidth or the processing bandwidth, reducing, at the centralized server, the video bit rate of the encoded video data stream received from the first mobile computing device below the network bandwidth and the processing bandwidth;
adapting at the centralized server one or more of the audio data stream or video data stream into byte packages that meet a maximum transfer unit (MTU) requirement associated with a wireless network carrier used for the two-way audio-video conversation;
transmitting the adapted audio data stream to the second mobile computing device over the first network protocol; and
transmitting the adapted video data stream to the second mobile computing device over the second network protocol.

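The MTU adaptation step in claim 35 can be pictured as cutting a stream into byte packages that fit within the carrier's limit. The sketch below is an assumption-laden illustration: the `packetize` name, the example MTU of 1400 bytes, and the 28-byte header allowance are hypothetical values, not figures from the patent.

```python
from typing import List


def packetize(stream: bytes, mtu: int, header_size: int = 0) -> List[bytes]:
    """Split a byte stream into packages whose payload plus header fits within the MTU."""
    payload_size = mtu - header_size
    if payload_size <= 0:
        raise ValueError("MTU must be larger than the per-package header size")
    return [stream[offset:offset + payload_size]
            for offset in range(0, len(stream), payload_size)]


if __name__ == "__main__":
    packages = packetize(b"x" * 3000, mtu=1400, header_size=28)
    print([len(p) for p in packages])
```

Sizing packages on the server rather than on the handset is consistent with the claim's placement of this step at the centralized server.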
36. A computer-readable storage device with an executable program stored thereon, wherein the program instructs a microprocessor to perform the following steps at a second mobile computing device for creating a two-way audio-video conversation between a first mobile computing device and the second mobile computing device, the steps comprising:
receiving an audio data stream over a first transport protocol, the audio data stream comprising a stream of audio packets, each packet comprising an audio timestamp;
receiving a video data stream over a second transport protocol, the video data stream comprising a stream of video packets, each packet comprising a video timestamp, the audio data stream and video data stream received from a centralized server as separate streams and maintained as the separate streams throughout communication from the first mobile computing device to the second mobile computing device, the centralized server configured to perform operations on the separate streams to offload processing for the two-way audio-video conversation, at least one of the streams being adjusted at the centralized server to have byte packages that meet a maximum transfer unit (MTU) requirement associated with a wireless network carrier used for the two-way audio-video conversation;
buffering the audio and video data streams in a buffer;
synching the audio data stream with the video data stream, the synching comprising matching each audio timestamp and audio packet with a video timestamp and video packet;
if a matching video timestamp is present in the buffer:
outputting the synched audio data stream through an audio subsystem, the audio subsystem of the second mobile computing device being configured to decode the synched audio data stream; and
outputting the synched video data stream concurrently with outputting the synched audio data stream, the video subsystem of the second mobile computing device being configured to decode the synched video data stream.

Specification