Apparatus and method for panoramic video hosting with directional audio

US 9,723,223 B1
Filed: 09/27/2013
Issued: 08/01/2017
Est. Priority Date: 12/02/2011
Status: Expired due to Fees


Abstract

A server includes an input node to receive video streams forming a panoramic video. The server also receives audio tracks corresponding to the video streams. A module forms an audio track based upon a combination of at least two of the audio tracks and directional viewing data. The audio track may be a stereo, mixed or surround sound audio track with volume modulation based upon the directional viewing data. An output node sends the audio track to a client device.

17 Claims

1. A server, programmed to:
- receive first video data comprising a first frame captured by a first camera;
  
  receive second video data comprising a second frame captured by a second camera;
  
  receive a first audio track including audio data captured by a first microphone sensitive to sound received from a first microphone direction;
  
  receive a second audio track including audio data captured by a second microphone sensitive to sound received from a second microphone direction different than the first microphone direction;
  
  generate a first panoramic frame based at least in part on the first frame and the second frame;
  
  receive first view direction data describing a first view direction offset from the first microphone direction and offset from the second microphone direction;
  
  form a first stereo audio track based at least in part upon the first view direction, wherein the first stereo audio track comprises:
  
  a first left channel track comprising a first weighted combination of the first audio track and the second audio track, wherein the first weighted combination is generated by applying a first weight to the first audio track based at least in part on the offset of the first view direction from the first microphone direction and applying a second weight to the second audio track based at least in part on the offset of the first view direction from the second microphone direction; and
  
  a first right channel track comprising a second weighted combination of the first audio track and the second audio track, wherein the second weighted combination is generated by applying a third weight to the first audio track based at least in part on the offset of the first view direction from the first microphone direction and applying a fourth weight to the second audio track based at least in part on the offset of the first view direction from the second microphone direction;
  
  send the first stereo audio track to a client device;
  
  receive second view direction data describing a second view direction different than the first view direction;
  
  form a second stereo audio track based at least in part upon the second view direction, wherein the second stereo audio track comprises:
  
  a second left channel track comprising the second weighted combination of the first audio track and the second audio track; and
  
  a second right channel track comprising the first weighted combination of the first audio track and the second audio track; and
  
  send the second stereo audio track to the client device.
- Dependent Claims (2, 3, 4, 5, 6)
  - 2. The server of claim 1 wherein the first stereo audio track is a surround sound audio track.
  - 3. The server of claim 2 wherein the surround sound audio track is modulated based upon the first view direction.
  - 4. The server of claim 1 wherein a volume of the first stereo audio track is modulated based upon an item of interest in a video file.
  - 5. The server of claim 1 wherein the server is further programmed to send to the client device a portion of the first panoramic frame corresponding to a field of view of a user, and wherein the first stereo audio track is modulated to include an aural clue corresponding to an event of potential interest outside of the field of view of the user.
  - 6. The server of claim 1 wherein the second view direction is offset from the first view direction by about one hundred and eighty degrees.
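
Claim 1 leaves the weighting function open. As a minimal sketch, assuming a sine panning law and clockwise compass angles in degrees (both assumptions, as are the names `channel_weights` and `mix_stereo`), the left and right channels could be formed as follows. Under this law, turning the view by 180 degrees swaps the two weighted combinations, which matches what claim 1 recites for the second stereo audio track and the offset named in claim 6.

```python
import math


def channel_weights(view_deg, mic_deg):
    """Return (left_weight, right_weight) for one microphone.

    Angles are clockwise degrees, so a positive offset places the
    microphone to the viewer's right. The sine law is an assumption;
    the claim only requires weights that depend on the offset.
    """
    offset = math.radians(mic_deg - view_deg)
    right = (1.0 + math.sin(offset)) / 2.0
    return 1.0 - right, right


def mix_stereo(view_deg, tracks):
    """Mix (mic_deg, samples) pairs into left and right sample lists.

    All sample lists are assumed to have the same length.
    """
    n = len(tracks[0][1])
    left = [0.0] * n
    right = [0.0] * n
    for mic_deg, samples in tracks:
        w_left, w_right = channel_weights(view_deg, mic_deg)
        for i, sample in enumerate(samples):
            left[i] += w_left * sample
            right[i] += w_right * sample
    return left, right


# Hypothetical data: two microphones facing 0 and 90 degrees.
tracks = [(0.0, [1.0, 0.0]), (90.0, [0.0, 1.0])]
left_1, right_1 = mix_stereo(45.0, tracks)    # first view direction
left_2, right_2 = mix_stereo(225.0, tracks)   # rotated by 180 degrees
```

Here `left_2` equals `right_1` and `right_2` equals `left_1` up to floating-point rounding, because the sine of the offset changes sign when the view direction is reversed.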

7. A computer-implemented method of generating a panoramic video, the method comprising:
- receiving, by a server, first video data comprising a first frame captured by a first camera;
  
  receiving, by the server, second video data comprising a second frame captured by a second camera;
  
  receiving, by the server, a first audio track including audio data captured by a first microphone sensitive to sound received from a first microphone direction;
  
  receiving, by the server, a second audio track including audio data captured by a second microphone sensitive to sound received from a second microphone direction different than the first microphone direction;
  
  generating, by the server, a first panoramic frame based at least in part on the first frame and the second frame;
  
  receiving, by the server, first view direction data describing a first view direction offset from the first microphone direction and offset from the second microphone direction;
  
  forming, by the server, a first stereo audio track based at least in part upon the first view direction, wherein the first stereo audio track comprises:
  
  a first left channel track comprising a first weighted combination of the first audio track and the second audio track, wherein the first weighted combination is generated by applying a first weight to the first audio track based at least in part on the offset of the first view direction from the first microphone direction and applying a second weight to the second audio track based at least in part on the offset of the first view direction from the second microphone direction; and
  
  a first right channel track comprising a second weighted combination of the first audio track and the second audio track, wherein the second weighted combination is generated by applying a third weight to the first audio track based at least in part on the offset of the first view direction from the first microphone direction and applying a fourth weight to the second audio track based at least in part on the offset of the first view direction from the second microphone direction;
  
  sending, by the server, the first stereo audio track to a client device;
  
  receiving, by the server, second view direction data describing a second view direction different than the first view direction;
  
  forming, by the server, a second stereo audio track based at least in part on the second view direction, wherein the second stereo audio track comprises:
  
  a second left channel track comprising the second weighted combination of the first audio track and the second audio track; and
  
  a second right channel track comprising the first weighted combination of the first audio track and the second audio track; and
  
  sending, by the server, the second stereo audio track to the client device.
- Dependent Claims (8, 9, 10, 11, 12)
  - 8. The method of claim 7 wherein the second view direction is offset from the first view direction by about one hundred and eighty degrees.
  - 9. The method of claim 7 wherein the first stereo audio track is a surround sound audio track.
  - 10. The method of claim 9 wherein the surround sound audio track is modulated based upon the first view direction.
  - 11. The method of claim 7 wherein a volume of the first stereo audio track is modulated based upon an item of interest in a video file.
  - 12. The method of claim 7 wherein the server is further programmed to send to the client device a portion of the first panoramic frame corresponding to a field of view of a user, and wherein the first stereo audio track is modulated to include an aural clue corresponding to an event of potential interest outside of a field of view of the user.
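
Claims 5, 12 and 14 recite an aural clue for an event outside the user's field of view without saying how it is derived. One possible reading, sketched below, checks whether the event direction falls outside the field of view and, if so, picks a side and a gain for the cue. The function names, the clockwise-degree convention, and the linear gain ramp are all assumptions for illustration.

```python
def signed_offset(view_deg, target_deg):
    """Clockwise offset of target from view, in the range [-180, 180)."""
    return (target_deg - view_deg + 180.0) % 360.0 - 180.0


def aural_cue(view_deg, fov_deg, event_deg):
    """Return None when the event is inside the field of view,
    otherwise a (side, gain) pair describing where to place the cue."""
    offset = signed_offset(view_deg, event_deg)
    half_fov = fov_deg / 2.0
    if abs(offset) <= half_fov:
        return None
    side = "right" if offset > 0 else "left"
    # Assumed ramp: gain is 0 at the edge of the view and 1 directly behind.
    gain = (abs(offset) - half_fov) / (180.0 - half_fov)
    return side, gain


# Hypothetical case: viewer faces 0 degrees with a 90 degree field of view.
print(aural_cue(0.0, 90.0, 30.0))     # inside the view, so no cue
print(aural_cue(0.0, 90.0, 270.0))    # event on the viewer's left
```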

13. A server comprising a non-transitory computer readable storage medium including computer code for performing a method comprising:
- receiving first video data comprising a first frame captured by a first camera;
  
  receiving second video data comprising a second frame captured by a second camera;
  
  receiving a first audio track including audio data captured by a first microphone sensitive to sound received from a first microphone direction;
  
  receiving a second audio track including audio data captured by a second microphone sensitive to sound received from a second microphone direction different than the first microphone direction;
  
  generating a first panoramic frame based at least in part on the first frame and the second frame;
  
  receiving first view direction data describing a first view direction offset from the first microphone direction by a first amount and offset from the second microphone direction by a second amount;
  
  forming a first stereo audio track based at least in part upon the first view direction, wherein the first stereo audio track comprises:
  
  a first left channel track comprising a first weighted combination of the first audio track and the second audio track, wherein the first weighted combination is generated by applying a first weight to the first audio track based at least in part on the first amount of offset of the first view direction from the first microphone direction and applying a second weight to the second audio track based at least in part on the second amount of offset of the first view direction from the second microphone direction; and
  
  a first right channel track comprising a second weighted combination of the first audio track and the second audio track, wherein the second weighted combination is generated by applying a third weight to the first audio track based at least in part on the first amount of offset of the first view direction from the first microphone direction and applying a fourth weight to the second audio track based at least in part on the second amount of offset of the first view direction from the second microphone direction;
  
  sending the first stereo audio track to a client device;
  
  receiving second view direction data describing a second view direction different than the first view direction, the second view direction being offset from the first microphone direction by a third amount and offset from the second microphone direction by a fourth amount;
  
  forming a second stereo audio track based at least in part upon the second view direction, wherein the second stereo audio track comprises:
  
  a second left channel track comprising the second weighted combination of the first audio track and the second audio track; and
  
  a second right channel track comprising the first weighted combination of the first audio track and the second audio track; and
  
  sending the second stereo audio track to the client device.
- Dependent Claims (14, 15, 16, 17)
  - 14. The server of claim 13, wherein the method further comprises sending to the client device a portion of the first panoramic frame corresponding to a field of view of a user, and wherein the first stereo audio track is modulated to include an aural clue corresponding to an event of potential interest outside of the field of view of the user.
  - 15. The server of claim 13, wherein the second view direction is offset from the first view direction by about one hundred and eighty degrees.
  - 16. The server of claim 13 wherein the first stereo audio track is a surround sound audio track.
  - 17. The server of claim 16 wherein the surround sound audio track is modulated based upon the first view direction.
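
Claim 13 names four offset amounts: the first and second for the first view direction, the third and fourth for the second view direction. The worked example below computes them for hypothetical directions, assuming the amounts are unsigned angles in degrees. The helper name `offset_amount` and all numeric values are illustrative only.

```python
def offset_amount(view_deg, mic_deg):
    """Unsigned angle between two directions, in [0, 180] degrees."""
    diff = abs(view_deg - mic_deg) % 360.0
    return min(diff, 360.0 - diff)


mic1, mic2 = 0.0, 90.0              # hypothetical microphone directions
view1 = 45.0                        # hypothetical first view direction
view2 = (view1 + 180.0) % 360.0     # claim 15: about 180 degrees away

amounts = [
    offset_amount(view1, mic1),     # first amount
    offset_amount(view1, mic2),     # second amount
    offset_amount(view2, mic1),     # third amount
    offset_amount(view2, mic2),     # fourth amount
]
print(amounts)  # [45.0, 45.0, 135.0, 135.0]
```

When the two view directions are exactly opposite, as in claim 15, the third amount is 180 degrees minus the first amount and the fourth is 180 degrees minus the second.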

Current Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Original Assignee: Amazon Technologies, Inc. (Amazon.com, Inc.)
Inventors: Banta, Bill; Alioshin, Paul
Primary Examiner(s): Vo, Tung
Assistant Examiner(s): Jiang, Zaihan

Application Number: US14/040,435
Time in Patent Office: 1,404 Days
Field of Search: 348/38
CPC Class Codes:
G11B 27/031   Electronic editing of digit...
H04N 21/21805   enabling multiple viewpoint...
H04N 21/2743   Video hosting of uploaded d...
H04N 21/8106   involving special audio dat...
H04N 21/816   involving special video dat...
H04N 23/698   for achieving an enlarged f...
H04N 5/265   Mixing
