Systems and methods for the autonomous production of videos from multi-sensored data
First Claim
1. A computer based camerawork method for autonomous production of an edited video from multiple video streams captured by a plurality of fixed and/or motorized cameras distributed around a scene of interest, that selects, based on a known location of a set of objects-of-interest and as a function of time, sequences of optimal viewpoints to fit a display resolution and user preferences, and for smoothing these sequences of optimal viewpoints for a continuous and graceful story-telling, the camerawork method comprising:
- selecting, for each envisioned camera location and/or position, a field of view obtained by:
either cropping an image captured by a fixed camera, thereby defining image cropping parameters, or selecting pan-tilt-zoom parameters for a virtual or motorized camera,
wherein, as part of said field of view selection, objects-of-interest are included and the field of view is selected based on joint processing of the positions of the multiple objects-of-interest that have been detected, and
wherein the selection of the field of view is done in a way that balances completeness and closeness metrics as a function of individual user preferences, wherein completeness counts a number of objects-of-interest that are included and visible within the displayed viewpoint, and closeness measures a number of pixels that are available to describe the objects-of-interest, and wherein said user preferences define a set of parameters that are used to tune the trade-off between completeness and closeness, and
- autonomously building the edited video by selecting and concatenating video segments provided by one or more individual cameras, wherein the building is done in a way that balances completeness and closeness metrics over time, while smoothing out the sequence of said cropping and/or pan-tilt-zoom parameters associated with the concatenated segments, wherein the smoothing process is implemented based on a linear or non-linear low-pass temporal filter mechanism, and the relative importance of each camera location is tuned according to user preference.
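The completeness/closeness trade-off recited in the claim can be sketched as a scored search over candidate crop windows. This is only an illustration under simplifying assumptions: a one-dimensional crop, a linear weighting `alpha` as the user-preference parameter, and made-up function names; it is not the patented method.

```python
def select_crop(xs, frame_width, display_width, alpha=0.5):
    """Pick a horizontal crop (left, right) over detected object positions xs.

    completeness -- fraction of objects-of-interest inside the crop.
    closeness    -- display pixels per crop pixel once the crop is rescaled
                    to the display width (a narrower crop gives a closer view).
    alpha        -- hypothetical user-preference knob: 1.0 favours
                    completeness, 0.0 favours closeness.
    """
    xs = sorted(xs)
    best_score, best_crop = None, None
    # Candidate windows span each contiguous run of sorted object positions.
    for i in range(len(xs)):
        for j in range(i, len(xs)):
            # Never crop narrower than the display, never wider than the frame.
            width = min(max(xs[j] - xs[i], display_width), frame_width)
            left = min(xs[i], frame_width - width)  # keep window inside frame
            inside = sum(1 for x in xs if left <= x <= left + width)
            completeness = inside / len(xs)
            closeness = display_width / width  # <= 1.0, larger means closer
            score = alpha * completeness + (1 - alpha) * closeness
            if best_score is None or score > best_score:
                best_score, best_crop = score, (left, left + width)
    return best_crop
```

With three players clustered near one end of the field and one far away, a balanced preference keeps the tight view of the cluster, while a completeness-heavy preference widens the crop to include the outlier.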
2 Assignments
0 Petitions
Abstract
An autonomous computer based method and system is described for personalized production of videos, such as team sport videos (for example, basketball videos), from multi-sensored data under limited display resolution. Embodiments of the present invention relate to the selection of a view to display from among the multiple video streams captured by the camera network. Technical solutions are provided to ensure perceptual comfort as well as an efficient integration of contextual information, implemented, for example, by smoothing generated viewpoint/camera sequences to alleviate flickering visual artifacts and discontinuous story-telling artifacts. A design and implementation of the viewpoint selection process is disclosed and has been verified by experiments, which show that the method and system of the present invention efficiently distribute the processing load across cameras, and effectively select viewpoints that cover the team action at hand while avoiding major perceptual artifacts.
103 Citations
21 Claims
1. A computer based camerawork method for autonomous production of an edited video from multiple video streams captured by a plurality of fixed and/or motorized cameras distributed around a scene of interest, that selects, based on a known location of a set of objects-of-interest and as a function of time, sequences of optimal viewpoints to fit a display resolution and user preferences, and for smoothing these sequences of optimal viewpoints for a continuous and graceful story-telling, the camerawork method comprising:
- selecting, for each envisioned camera location and/or position, a field of view obtained by:
either cropping an image captured by a fixed camera, thereby defining image cropping parameters, or selecting pan-tilt-zoom parameters for a virtual or motorized camera,
wherein, as part of said field of view selection, objects-of-interest are included and the field of view is selected based on joint processing of the positions of the multiple objects-of-interest that have been detected, and
wherein the selection of the field of view is done in a way that balances completeness and closeness metrics as a function of individual user preferences, wherein completeness counts a number of objects-of-interest that are included and visible within the displayed viewpoint, and closeness measures a number of pixels that are available to describe the objects-of-interest, and wherein said user preferences define a set of parameters that are used to tune the trade-off between completeness and closeness, and
- autonomously building the edited video by selecting and concatenating video segments provided by one or more individual cameras, wherein the building is done in a way that balances completeness and closeness metrics over time, while smoothing out the sequence of said cropping and/or pan-tilt-zoom parameters associated with the concatenated segments, wherein the smoothing process is implemented based on a linear or non-linear low-pass temporal filter mechanism, and the relative importance of each camera location is tuned according to user preference.
- View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 21)
10. A computer based camerawork system comprising a processing engine and memory for autonomous production of an edited video from multiple video streams captured by a plurality of fixed and/or motorized cameras distributed around a scene of interest, that selects, based on a known location of a set of objects-of-interest and as a function of time, sequences of optimal viewpoints to fit a display resolution and user preferences, and for smoothing these sequences of optimal viewpoints for a continuous and graceful story-telling, the camerawork system comprising:
- first means for selecting, for each envisioned camera location and/or position, a field of view obtained by:
either cropping an image captured by a fixed camera, thereby defining image cropping parameters, or selecting pan-tilt-zoom parameters of a virtual or motorized camera,
wherein, as part of said field of view selection, objects-of-interest are included and the field of view is selected based on joint processing of the positions of the multiple objects-of-interest that have been detected,
wherein the selection of the field of view is done in a way that balances completeness and closeness metrics as a function of individual user preferences, wherein completeness counts the number of objects-of-interest that are included and visible within the displayed viewpoint, and closeness measures the number of pixels that are available to describe the objects-of-interest, and wherein said user preferences define a set of parameters that are used to tune the trade-off between completeness and closeness, and
- second means for autonomously selecting rendering parameters that maximize and smooth out closeness and completeness metrics by concatenating segments in the video streams provided by one or more individual cameras, wherein the selection is done in a way that balances completeness and closeness metrics over time, while smoothing out the sequence of said cropping and/or pan-tilt-zoom parameters associated with the concatenated segments, wherein the smoothing process is implemented based on a linear or non-linear low-pass temporal filtering mechanism, and the relative importance of each camera location is tuned according to user preferences.
- View Dependent Claims (11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
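The "linear or non-linear low-pass temporal filter mechanism" recited in both independent claims can be illustrated with a first-order recursive (exponential moving average) filter over a per-frame camera parameter track. The function name, the smoothing constant `alpha`, and the single-parameter track are illustrative assumptions, not the patented implementation.

```python
def smooth_track(raw, alpha=0.5):
    """Linear low-pass temporal filter (first-order IIR / exponential
    moving average) over a sequence of per-frame parameter values,
    e.g. pan angle, zoom factor, or crop centre.

    Smaller alpha means heavier smoothing and a steadier virtual camera;
    alpha=1.0 passes the raw, potentially flickering track through.
    """
    state = raw[0]  # initialize filter state at the first sample
    out = []
    for v in raw:
        state = alpha * v + (1 - alpha) * state  # recursive update
        out.append(state)
    return out
```

Applied to a crop centre that jumps abruptly between frames, the filter spreads the jump over several frames, which is one way to alleviate the flickering and discontinuous story-telling artifacts the abstract mentions.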
Specification