CONTENT-BASED ZOOMING AND PANNING FOR VIDEO CURATION

US 20160381306A1
Filed: 06/29/2015
Published: 12/29/2016
Est. Priority Date: 06/29/2015
Status: Active Grant

1 Assignment

0 Petitions

Abstract

Devices, systems and methods are disclosed for identifying content in video data and creating content-based zooming and panning effects to emphasize the content. Contents may be detected and analyzed in the video data using computer vision or machine learning algorithms, or may be specified through a user interface. Panning and zooming controls may be associated with the contents, panning or zooming based on a location and size of content within the video data. The device may determine a number of pixels associated with content and may frame the content to be a certain percentage of the edited video data, such as a close-up shot where a subject is displayed as 50% of the viewing frame. The device may identify an event of interest, may determine multiple frames associated with the event of interest and may pan and zoom between the multiple frames based on a size/location of the content within the multiple frames.
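
The framing step described in the abstract (a subject displayed as a set percentage of the viewing frame) can be illustrated with a small sketch. This is not the patented implementation; the `Window` type, the `frame_subject` function, the 16:9 output aspect ratio and the use of a bounding-box area as the "number of pixels associated with content" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Window:
    x: float       # left edge, in source-frame pixels
    y: float       # top edge, in source-frame pixels
    width: float
    height: float


def frame_subject(subject: Window, frame_w: int, frame_h: int,
                  aspect: float = 16 / 9, fill: float = 0.5) -> Window:
    """Return a crop of the given aspect ratio in which the subject's
    bounding box covers roughly `fill` of the crop area (hypothetical helper)."""
    # Area the crop needs so that subject_area / crop_area == fill.
    crop_area = (subject.width * subject.height) / fill
    height = (crop_area / aspect) ** 0.5
    width = height * aspect
    # Grow the crop if it would otherwise cut off part of the subject box.
    scale = max(1.0, subject.width / width, subject.height / height)
    width, height = width * scale, height * scale
    # Shrink the crop if it would be larger than the source frame.
    shrink = min(1.0, frame_w / width, frame_h / height)
    width, height = width * shrink, height * shrink
    # Center on the subject, then clamp so the crop stays inside the frame.
    cx = subject.x + subject.width / 2
    cy = subject.y + subject.height / 2
    x = min(max(cx - width / 2, 0), frame_w - width)
    y = min(max(cy - height / 2, 0), frame_h - height)
    return Window(x, y, width, height)


# A 400x450 subject in a 3840x1080 panoramic frame, framed as a close-up.
crop = frame_subject(Window(1000, 200, 400, 450), 3840, 1080)
```

With `fill=0.5` the subject box takes up about half of the crop area; a smaller `fill` would give a wider shot of the same subject.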

35 Citations

20 Claims

1. A computer-implemented method of simulating panning and zooming in video data, the method comprising:
- receiving panoramic video data comprising video frames having a first aspect ratio, the panoramic video data showing a plurality of directional views of a scene;
  
  identifying an object of interest represented in the panoramic video data;
  
  determining a beginning of an event of interest involving the object, the beginning corresponding to a first video frame of the panoramic video data, the first video frame showing a first directional view of the plurality of directional views;
  
  determining first pixel coordinates associated with the object in the first video frame;
  
  determining a first cropped window from the first video frame, the first cropped window comprising a portion of the first video frame including the first pixel coordinates, the first cropped window having a second aspect ratio less than the first aspect ratio and the first cropped window having a first size and a first position within the first video frame;
  
  determining an end of the event in a second video frame of the panoramic video data, the second video frame subsequent to the first video frame and the second video frame showing a second directional view of the plurality of directional views;
  
  determining second pixel coordinates associated with the object in the second video frame;
  
  determining a second cropped window from the second video frame, the second cropped window comprising a portion of the second video frame including the second pixel coordinates and the second cropped window having a second size and a second position within the second video frame; and
  
  determining output video data including the first cropped window and the second cropped window.
- - 2. The computer-implemented method of claim 1, further comprising:
    - identifying a static object within a first number of pixels of the object of interest in the first video frame;
      
      determining that the static object does not move during the event of interest;
      
      determining the static object is low priority;
      
      identifying a person of interest within the first number of pixels of the object of interest in the first video frame;
      
      determining the person of interest is high priority to indicate that the first cropped window include the person of interest;
      
      determining the first pixel coordinates to include the object of interest and the person of interest; and
      
      determining the first cropped window to include the first pixel coordinates.
  - 3. The computer-implemented method of claim 1, wherein:
    - determining the second cropped window comprises determining the second position relative to the second video frame is different from the first position relative to the first video frame; and
      
      determining output video data comprises determining output video data simulating panning from the first cropped window at the first position to the second cropped window at the second position.
  - 4. The computer-implemented method of claim 1, wherein:
    - determining the second cropped window comprises determining the second size is different from the first size; and
      
      determining output video data comprises determining output video data simulating zooming from the first cropped window having the first size to the second cropped window having the second size.
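
Claims 3 and 4 describe output video that simulates panning (the cropped window changes position) and zooming (the cropped window changes size) between the first and second cropped windows. One simple way to picture this is a linear blend of the two windows across the intermediate frames. The sketch below assumes linear interpolation and a hypothetical `interpolate_crops` helper; the claims do not specify how intermediate windows are computed.

```python
from typing import List, Tuple

# (x, y, width, height) of a cropped window, in source-frame pixels.
Crop = Tuple[float, float, float, float]


def interpolate_crops(first: Crop, second: Crop, num_frames: int) -> List[Crop]:
    """Linearly blend from `first` to `second` over `num_frames` frames."""
    if num_frames < 2:
        raise ValueError("need at least the first and the second frame")
    crops = []
    for i in range(num_frames):
        t = i / (num_frames - 1)   # 0.0 at the first frame, 1.0 at the last
        crops.append(tuple(a + (b - a) * t for a, b in zip(first, second)))
    return crops


# The position changes (pan) and the size halves (zoom in) over five frames.
path = interpolate_crops((0, 0, 1600, 900), (800, 100, 800, 450), 5)
```

Because both endpoint windows here are 16:9, every interpolated window is 16:9 as well, so each one can be scaled to the same output resolution without distortion.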

5. A computer-implemented method comprising:
- receiving input video data comprising video frames having a first aspect ratio greater than 2:1;
  
  determining an event of interest represented in the video data;
  
  determining a beginning of the event in a first video frame of the video data;
  
  determining first pixel coordinates in the first video frame associated with the beginning of the event;
  
  determining a first cropped window from the first video frame, the first cropped window comprising a portion of the first video frame including the first pixel coordinates, the first cropped window having a second aspect ratio less than 2:1 and the first cropped window having a first size and a first position within the first video frame;
  
  determining an end of the event in a second video frame of the video data;
  
  determining second pixel coordinates in the second video frame associated with the end of the event, the second pixel coordinates different than the first pixel coordinates;
  
  determining a second cropped window from the second video frame, the second cropped window comprising a portion of the second video frame including the second pixel coordinates, the second cropped window having the second aspect ratio and the second cropped window having a second size and a second position within the video frame; and
  
  determining output data corresponding to the first cropped window and the second cropped window.
- - 6. The computer-implemented method of claim 5, further comprising:
    - identifying an object of interest in the video data; and
      
      tracking the object of interest across multiple video frames, wherein, prior to determining the beginning of the event and determining the end of the event, the determining the event further comprises:
      
      determining a third video frame corresponding to the event of interest based on the object of interest.
  - 7. The computer-implemented method of claim 5, wherein determining the output data further comprises:
    - determining output video data simulating at least one of panning and zooming from the first cropped window to the second cropped window.
  - 8. The computer-implemented method of claim 5, wherein determining the output data further comprises:
    - generating a first video tag corresponding to the first cropped window, the first video tag including the first pixel coordinates, the first size, the first position and a first timestamp associated with the first video frame; and
      
      generating a second video tag corresponding to the second cropped window, the second video tag including the second pixel coordinates, the second size, the second position and a second timestamp associated with the second video frame.
  - 9. The computer-implemented method of claim 5, wherein determining the event of interest further comprises:
    - identifying a first person represented in the video data;
      
      identifying a second person represented in the video data;
      
      determining, at a first time, that a first number of pixels between the first person and the second person in the video data exceeds a threshold; and
      
      determining, at a second time following the first time, that a second number of pixels between the first person and the second person in the video data is less than the threshold, wherein the second time is associated with the event of interest.
  - 10. The computer-implemented method of claim 5, further comprising:
    - determining a first direction between the first pixel coordinates and the second pixel coordinates, wherein the determining the first cropped window further comprises:
      
      determining the first cropped window, the first cropped window comprising a portion of the first image including the first pixel coordinates and an area of pixels in the first direction from the first pixel coordinates.
  - 11. The computer-implemented method of claim 5, wherein:
    - determining the second cropped window comprises determining the second position relative to the second video frame is different from the first position relative to the first video frame; and
      
      determining output video data comprises determining output video data simulating panning from the first cropped window at the first position to the second cropped window at the second position.
  - 12. The computer-implemented method of claim 5, wherein:
    - determining the second cropped window comprises determining the second size is different from the first size; and
      
      determining output video data comprises determining output video data simulating zooming from the first cropped window having the first size to the second cropped window having the second size.
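
Claim 9 ties an event of interest to two people moving from farther apart than a threshold number of pixels to closer than that threshold. A minimal sketch of that check follows. It assumes each person is reduced to one (x, y) center point per frame and that "number of pixels between" is measured as straight-line distance; both are illustrative assumptions, and `find_approach_event` is a hypothetical name.

```python
from math import hypot
from typing import List, Optional, Tuple

Point = Tuple[float, float]   # (x, y) center of a person, in pixels


def find_approach_event(first_person: List[Point], second_person: List[Point],
                        threshold: float) -> Optional[int]:
    """Return the index of the first frame in which the two people are closer
    than `threshold` pixels after having been farther apart, else None."""
    was_apart = False
    for i, (a, b) in enumerate(zip(first_person, second_person)):
        distance = hypot(a[0] - b[0], a[1] - b[1])
        if distance > threshold:
            was_apart = True           # the "first time" condition is met
        elif distance < threshold and was_apart:
            return i                   # the "second time": event of interest
    return None


# Two people walking toward each other along the same row of pixels.
first = [(100, 500), (300, 500), (500, 500), (650, 500)]
second = [(1000, 500), (900, 500), (800, 500), (700, 500)]
event_frame = find_approach_event(first, second, threshold=200)
```

Requiring the pair to have been apart first means two people who start out next to each other do not trigger an event on the opening frame.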

13. A system, comprising:
- at least one processor;
  
  a memory including instructions operable to be executed by the at least one processor to cause the system to perform a set of actions comprising:
  
  receiving input video data comprising video frames having a first aspect ratio greater than 2:1;
  
  determining an event of interest represented in the video data;
  
  determining a beginning of the event in a first video frame of the video data;
  
  determining first pixel coordinates in the first video frame associated with the beginning of the event;
  
  determining a first cropped window from the first video frame, the first cropped window comprising a portion of the first video frame including the first pixel coordinates, the first cropped window having a second aspect ratio less than 2:1 and the first cropped window having a first size and a first position within the first video frame;
  
  determining an end of the event in a second video frame of the video data;
  
  determining second pixel coordinates in the second video frame associated with the end of the event, the second pixel coordinates different than the first pixel coordinates;
  
  determining a second cropped window from the second video frame, the second cropped window comprising a portion of the second video frame including the second pixel coordinates, the second cropped window having the second aspect ratio and the second cropped window having a second size and a second position within the video frame; and
  
  determining output data corresponding to the first cropped window and the second cropped window.
- - 14. The system of claim 13, the set of actions further comprising:
    - identifying an object of interest in the video data;
      
      tracking the object of interest across multiple video frames; and
      
      determining, prior to determining the beginning of the event and determining the end of the event, a third video frame corresponding to the event of interest based on the object of interest.
  - 15. The system of claim 14, the set of actions further comprising:
    - determining a first color histogram corresponding to the object;
      
      determining a second color histogram corresponding to the third video frame; and
      
      comparing the first color histogram with the second color histogram.
  - 16. The system of claim 13, the set of actions further comprising:
    - generating a first video tag corresponding to the first cropped window, the first video tag including the first pixel coordinates, the first size, the first position and a first timestamp associated with the first video frame; and
      
      generating a second video tag corresponding to the second cropped window, the second video tag including the second pixel coordinates, the second size, the second position and a second timestamp associated with the second video frame.
  - 17. The system of claim 13, the set of actions further comprising:
    - identifying a first person represented in the video data;
      
      identifying a second person represented in the video data;
      
      determining, at a first time, that a first number of pixels between the first person and the second person in the video data exceeds a threshold; and
      
      determining, at a second time following the first time, that a second number of pixels between the first person and the second person in the video data is less than the threshold, wherein the second time is associated with the event of interest.
  - 18. The system of claim 13, the set of actions further comprising:
    - determining a first direction between the first pixel coordinates and the second pixel coordinates; and
      
      determining the first cropped window, the first cropped window comprising a portion of the first image including the first pixel coordinates and an area of pixels in the first direction from the first pixel coordinates.
  - 19. The system of claim 13, the set of actions further comprising:
    - determining the second cropped window comprises determining the second position relative to the second video frame is different from the first position relative to the first video frame; and
      
      determining output video data comprises determining output video data simulating panning from the first cropped window at the first position to the second cropped window at the second position.
  - 20. The system of claim 13, the set of actions further comprising:
    - determining the second cropped window comprises determining the second size is different from the first size; and
      
      determining output video data comprises determining output video data simulating zooming from the first cropped window having the first size to the second cropped window having the second size.
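
Claim 15 compares a color histogram of the tracked object with a color histogram taken from a third video frame. The sketch below shows one common way to make such a comparison, histogram intersection over coarse RGB bins. The claim names neither the bin layout nor the similarity measure, so both are assumptions here, as are the function names.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]   # (red, green, blue), each 0-255


def color_histogram(pixels: Sequence[Pixel], bins: int = 4) -> List[float]:
    """Normalized RGB histogram with `bins` buckets per channel."""
    counts = [0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each 0-255 channel value to a bucket index in 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        counts[ri * bins * bins + gi * bins + bi] += 1
    total = len(pixels) or 1       # avoid dividing by zero for empty input
    return [c / total for c in counts]


def histogram_intersection(first: List[float], second: List[float]) -> float:
    """Similarity in [0, 1]; 1.0 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(first, second))


# A mostly red object compared with a region that is half red, half blue.
object_hist = color_histogram([(250, 10, 10)] * 10)
region_hist = color_histogram([(250, 10, 10)] * 5 + [(10, 10, 250)] * 5)
similarity = histogram_intersection(object_hist, region_hist)
```

A tracker could treat a similarity above some chosen cutoff as evidence that the object appears in the frame; the cutoff itself would need tuning and is not given in the claims.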

Current Assignee
Amazon Technologies, Inc. (Amazon.com, Inc.)
Original Assignee
Amazon Technologies, Inc. (Amazon.com, Inc.)
Inventors
Yang, Yinfei; Welbourne, William Evan; Yu, Tsz Ho; Roessler, Ross David; Savastinuk, Paul Aksenti; Kuo, Cheng-Hao; Thomas, Jim Oommen

Granted Patent

US 9,973,711 B2
US Class Current

1/1
CPC Class Codes

G06T 3/40   Scaling of whole images or ...

G06V 10/25   Determination of region of ...

G06V 20/40   in video content extracting...

G06V 20/47   Detecting features for summ...

G11B 27/031   Electronic editing of digit...

G11B 27/06   Cutting and rejoining; Notc...

H04N 23/698   for achieving an enlarged f...

H04N 5/2628   Alteration of picture size,...
