N-view synthesis from monocular video of certain broadcast and stored mass media content

US 6,965,379 B2
Filed: 05/08/2001
Issued: 11/15/2005
Est. Priority Date: 05/08/2001
Status: Expired due to Fees

First Claim

1. An image processing method for use on a data processing device, the method comprising the acts of:

receiving at least one monocular video input image I_k;

segmenting at least one foreground object from the input image I_k;

wherein the act of segmenting at least one foreground object from the input image further comprises;

applying a homography transformation H_k to the at least one monocular video input image I_k to create at least one transformed image J_k;

combining the at least one transformed images J_k to create a mosaic M;

applying a median filter to the multiple values at each pixel of said mosaic M to derive a median value at each of said pixels in said mosaic M;

applying an inverse homography transformation H_k^{-1} to said mosaic M to derive at least one background image B_k;

comparing the at least one background image B_k with the at least one input image I_k to create at least one mask image M_k;

extracting those pixels from the monocular input image I_k that are set to one in the mask image M; and

setting the remaining pixels in the monocular input image I_k not set to one at said extracting act in the mask image M to black resulting in the identification of said at least one foreground object from the input image I_k;

applying a respective left (TL_m) and right (TR_m) transformation to each segmented foreground object and a respective left (HL) and right (HR) background transformation to the background, for each of a plurality of output images;

combining the respective left transformation (TL_m) corresponding to each segmented foreground object with the respective left background transformation (HL) corresponding to the background to generate a left view L_k for each of said plurality of output images;

combining the respective right transformation (TR_m) corresponding to each segmented foreground object with the respective right background transformation (HR) corresponding to the background to generate a right view R_k for each of said plurality of output images; and

deriving the plurality of output images from the results of the respective transformations.
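
The segmenting acts recited above (warp each I_k by H_k, take a per-pixel median over the mosaic M, warp back to get B_k, then compare against I_k to form the mask M_k) can be illustrated with a minimal pure-Python sketch. This is not the patentee's implementation. The function names, the nearest-neighbour sampling, the grayscale list-of-rows image layout, and the difference `threshold` are all assumptions made for illustration; the claim does not specify how the comparison is thresholded.

```python
from statistics import median

def apply_h(H, x, y):
    """Map pixel (x, y) through a 3x3 homography H using homogeneous coordinates."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

def segment_foreground(frames, homographies, mosaic_size, threshold=30):
    """Return one (mask, foreground) pair per input frame.

    frames:       grayscale images I_k, each a list of rows (hypothetical layout)
    homographies: 3x3 matrices H_k mapping frame coordinates to mosaic coordinates
    mosaic_size:  (width, height) of the mosaic M
    threshold:    assumed absolute-difference cutoff, not taken from the claim
    """
    mw, mh = mosaic_size
    # Warp every frame into mosaic coordinates (J_k) and collect the
    # multiple values that land on each mosaic pixel.
    samples = {}
    for image, H in zip(frames, homographies):
        for y, row in enumerate(image):
            for x, value in enumerate(row):
                mx, my = apply_h(H, x, y)
                mx, my = round(mx), round(my)
                if 0 <= mx < mw and 0 <= my < mh:
                    samples.setdefault((mx, my), []).append(value)
    # Median filter over the values at each mosaic pixel.
    mosaic = {pixel: median(values) for pixel, values in samples.items()}

    results = []
    for image, H in zip(frames, homographies):
        mask, foreground = [], []
        for y, row in enumerate(image):
            mask_row, fg_row = [], []
            for x, value in enumerate(row):
                # Sampling M at H_k(x, y) is the backward-mapping form of
                # applying H_k^{-1} to the mosaic, giving B_k(x, y).
                mx, my = apply_h(H, x, y)
                b = mosaic.get((round(mx), round(my)))
                is_fg = b is not None and abs(value - b) > threshold
                mask_row.append(1 if is_fg else 0)
                fg_row.append(value if is_fg else 0)  # non-mask pixels set to black
            mask.append(mask_row)
            foreground.append(fg_row)
        results.append((mask, foreground))
    return results
```

With identity homographies and an object that moves between frames, the median suppresses the object in the mosaic, so the object alone survives the comparison in each frame. The median is only reliable where a mosaic pixel is covered by enough frames in which the object is absent.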

Abstract

A monocular input image is transformed to give it an enhanced three dimensional appearance by creating at least two output images. Foreground and background objects are segmented in the input image and transformed differently from each other, so that the foreground objects appear to stand out from the background. Given a sequence of input images, the foreground objects will appear to move differently from the background objects in the output images.
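
The idea of transforming foreground and background differently to produce two output images can be sketched as follows. This is a simplified illustration only: the general transformations are replaced here by pure horizontal pixel shifts, and the function names, the `disparity` parameter, and the sign convention (foreground shifted right in the left view and left in the right view, so that it appears in front of the background) are assumptions of this sketch rather than details taken from the patent.

```python
def shift_row(row, dx, fill=0):
    """Shift a row horizontally by dx pixels (positive = right), padding with fill."""
    width = len(row)
    return [row[x - dx] if 0 <= x - dx < width else fill for x in range(width)]

def synthesize_view(background, foreground, mask, bg_dx, fg_dx):
    """Composite one output view.

    The background is shifted by bg_dx and the foreground (with its mask)
    by fg_dx; shifted foreground pixels overwrite the shifted background.
    """
    view = []
    for b_row, f_row, m_row in zip(background, foreground, mask):
        b = shift_row(b_row, bg_dx)
        f = shift_row(f_row, fg_dx)
        m = shift_row(m_row, fg_dx)
        view.append([fv if mv else bv for bv, fv, mv in zip(b, f, m)])
    return view

def stereo_pair(background, foreground, mask, disparity=1):
    """Return (left, right) views with the background fixed and the
    foreground displaced in opposite directions (assumed convention)."""
    left = synthesize_view(background, foreground, mask, 0, +disparity)
    right = synthesize_view(background, foreground, mask, 0, -disparity)
    return left, right
```

Because the background shift is zero in this sketch, only the foreground carries disparity between the two views. A fuller version would warp each layer with its own homography instead of a shift.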

30 Claims

1. An image processing method for use on a data processing device, the method comprising the acts of:
- receiving at least one monocular video input image I_k;
  
  segmenting at least one foreground object from the input image I_k;
  
  wherein the act of segmenting at least one foreground object from the input image further comprises;
  
  applying a homography transformation H_k to the at least one monocular video input image I_k to create at least one transformed image J_k;
  
  combining the at least one transformed images J_k to create a mosaic M;
  
  applying a median filter to the multiple values at each pixel of said mosaic M to derive a median value at each of said pixels in said mosaic M;
  
  applying an inverse homography transformation H_k^{-1} to said mosaic M to derive at least one background image B_k;
  
  comparing the at least one background image B_k with the at least one input image I_k to create at least one mask image M_k;
  
  extracting those pixels from the monocular input image I_k that are set to one in the mask image M; and
  
  setting the remaining pixels in the monocular input image I_k not set to one at said extracting act in the mask image M to black resulting in the identification of said at least one foreground object from the input image I_k;
  
  applying a respective left (TL_m) and right (TR_m) transformation to each segmented foreground object and a respective left (HL) and right (HR) background transformation to the background, for each of a plurality of output images;
  
  combining the respective left transformation (TL_m) corresponding to each segmented foreground object with the respective left background transformation (HL) corresponding to the background to generate a left view L_k for each of said plurality of output images;
  
  combining the respective right transformation (TR_m) corresponding to each segmented foreground object with the respective right background transformation (HR) corresponding to the background to generate a right view R_k for each of said plurality of output images; and
  
  deriving the plurality of output images from the results of the respective transformations.
- Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
  - 2. The method of claim 1, further comprising second segmenting at least one background object from the input image and applying a respective transformation to each segmented background object for each of the plurality of output images.
  - 3. The method of claim 1, wherein there are two output images and two respective transformations are applied to each segmented object and two transformations are applied to the background to create the two output images.
  - 4. The method of claim 1, further comprising displaying the plurality of output images, so that the plurality of output images are perceivable by a user as a single image having enhanced three dimensional appearance.
  - 5. The method of claim 1, wherein the respective transformations applied to the foreground object make the foreground object stand out from the background.
  - 6. The method of claim 1, wherein the receiving comprises receiving a multiplicity of monocular input images;
    - the deriving comprises deriving a respective plurality of output images for each of the monocular input images;
      
      the method further comprises displaying the respective pluralities of output images in a combining device, so that the respective pluralities of output images are perceivable by a user as a sequence of single images giving an illusion of motion and having an enhanced three dimensional appearance in which the at least one foreground object moves separately from the at least one background object.
  - 7. The method of claim 6, wherein the at least one foreground object appears to move in the output images, while at least a portion of the rest of the image appears not to move.
  - 8. The method of claim 1, wherein the segmenting and applying involve using domain knowledge to recognize positions of expected objects in the monocular input image and derive positions of objects in the output images.
  - 9. The method of claim 1, wherein the respective transformations for background pixels are derived by comparing at least two monocular input images of a single scene.
  - 10. The method of claim 1, further comprising, prior to applying the transformation, approximating a position of each segmented object as appearing on a fronto-parallel plane.
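
A short worked equation may help show why the fronto-parallel approximation in claim 10 is convenient. Assume, purely for illustration, a pinhole camera with focal length f (in pixels), an object lying on a plane at constant depth Z_0 parallel to the image plane, and a virtual viewpoint displaced horizontally by b with no rotation. None of the symbols f, b, or Z_0 come from the claims. Under these assumptions the transformation applied to the object reduces to a uniform horizontal shift:

```latex
% Scene point (X, Y, Z_0) seen from the original and the displaced camera:
x  = f\,\frac{X}{Z_0}, \qquad
x' = f\,\frac{X - b}{Z_0} = x - \frac{f\,b}{Z_0}, \qquad
y' = y .

% Equivalent homography acting on homogeneous pixel coordinates:
T =
\begin{pmatrix}
1 & 0 & -\dfrac{f\,b}{Z_0} \\
0 & 1 & 0 \\
0 & 0 & 1
\end{pmatrix}
```

The shift magnitude f b / Z_0 is larger for nearer objects, which is what makes a foreground object separate visually from a more distant background.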

11. An image processing device comprising an input for receiving at least one monocular video input image;
- at least one processor adapted to perform the following operations segmenting at least one foreground object from the input image;
  
  wherein the operation of segmenting at least one foreground object from the input image, further comprises;
  
  applying a homography transformation H_k to the at least one monocular video input image I_k to create at least one transformed image J_k;
  
  combining the at least one transformed images J_k to create a mosaic M;
  
  applying a median filter to the multiple values at each pixel of said mosaic M to derive a median value at each of said pixels in said mosaic M;
  
  applying an inverse homography transformation H_k^{-1} to said mosaic M to derive at least one background image B_k;
  
  comparing the at least one background image B_k with the at least one input image I_k to create at least one mask image M_k;
  
  extracting those pixels from the monocular input image I_k that are set to one in the mask image M; and
  
  setting the remaining pixels in the monocular input image I_k not set to one at said extracting act in the mask image M to black resulting in the identification of said at least one foreground object from the input image I_k;
  
  applying a respective left (TL_m) and right (TR_m) transformation to each segmented foreground object and a respective left (HL) and right (HR) background transformation to the background, for each of the plurality of output images;
  
  combining the respective left transformation (TL_m) corresponding to each segmented foreground object with the respective left background transformation (HL) corresponding to the background to generate a left view L_k for each of said plurality of output images;
  
  combining the respective right transformation (TR_m) corresponding to each segmented foreground object with the respective right background transformation (HR) corresponding to the background to generate a right view R_k for each of said plurality of output images; and
  
  deriving the plurality of output images from the results of the respective transformations.
- Dependent Claims (12, 13, 14, 15, 16, 17, 18, 19, 20, 27)
  - 12. The device of claim 11, wherein the operations further comprise second segmenting at least one background object from the input image and applying a respective transformation to each segmented background object for each of the plurality of output images.
  - 13. The device of claim 11, wherein there are two output images;
    - and the operations further comprise, in order to create the two output images;
      
      applying two respective transformations to each segmented object; and
      
      further applying two transformations to the background.
  - 14. The device of claim 11, further comprising a combining display unit adapted to receive and display the plurality of output images, so that the plurality of output images are perceivable by a user as a single image having enhanced three dimensional appearance.
  - 15. The device of claim 11, wherein the respective transformations applied to the foreground object make the foreground object stand out from the background.
  - 16. The device of claim 15, wherein the receiving comprises receiving a multiplicity of monocular input images;
    - the deriving comprises deriving a respective plurality of output images for each of the monocular input images;
      
      the device further comprises a combining display unit for receiving and displaying the respective pluralities of output images, so that the respective pluralities of output images are perceivable by a user as a sequence of single images giving an illusion of motion and having an enhanced three dimensional appearance in which the at least one foreground object moves separately from the at least one background object.
  - 17. The device of claim 16, wherein the at least one foreground object appears to move in the output images, while at least a portion of the rest of the image appears not to move.
  - 18. The device of claim 11, wherein the segmenting and applying operations involve using domain knowledge to recognize positions of expected objects in the monocular input image and derive positions of objects in the output images.
  - 19. The device of claim 11, wherein the respective transformations for background pixels are derived by comparing at least two monocular input images of a single scene.
  - 20. The device of claim 11, wherein the operations further comprise, prior to applying the transformation, approximating a position of each segmented object as appearing on a fronto-parallel plane.
  - 27. The device of claim 16, wherein the at least one foreground object appears to move in the output images, while at least a portion of the rest of the image appears not to move.

21. At least one medium readable by a data processing device and embodying code for causing execution of the following operations:
- receiving at least one monocular video input image;
  
  wherein the operation of segmenting at least one foreground object from the input image, further comprises;
  
  applying a homography transformation H_k to the at least one monocular video input image I_k to create at least one transformed image J_k;
  
  combining the at least one transformed images J_k to create a mosaic M;
  
  applying a median filter to the multiple values at each pixel of said mosaic M to derive a median value at each of said pixels in said mosaic M;
  
  applying an inverse homography transformation H_k^{-1} to said mosaic M to derive at least one background image B_k;
  
  comparing the at least one background image B_k with the at least one input image I_k to create at least one mask image M_k;
  
  extracting those pixels from the monocular input image I_k that are set to one in the mask image M; and
  
  setting the remaining pixels in the monocular input image I_k not set to one at said extracting act in the mask image M to black resulting in the identification of said at least one foreground object from the input image I_k;
  
  segmenting at least one foreground object from the input image; applying a respective left (TL_m) and right (TR_m) transformation to each segmented foreground object and a respective left (HL) and right (HR) background transformation to the background, for each of the plurality of output images;
  
  combining the respective left transformation (TL_m) corresponding to each segmented foreground object with the respective left background transformation (HL) corresponding to the background to generate a left view L_k for each of said plurality of output images;
  
  combining the respective right transformation (TR_m) corresponding to each segmented foreground object with the respective right background transformation (HR) corresponding to the background to generate a right view R_k for each of said plurality of output images.
- Dependent Claims (22, 23, 24, 25, 26, 28, 29, 30)
  - 22. The medium of claim 21, wherein the operations further comprise second segmenting at least one background object from the input image and applying a respective transformation to each segmented background object for each of the plurality of output images.
  - 23. The medium of claim 21, wherein there are two output images and two respective transformations are applied to each segmented object and two transformations are applied to the background to create the two output images.
  - 24. The medium of claim 21, wherein the operations further comprise displaying the plurality of output images in a combining device, so that the plurality of output images are perceivable by a user as a single image having enhanced three dimensional appearance.
  - 25. The medium of claim 21, wherein the respective transformations applied to the foreground object make the foreground object stand out from the background.
  - 26. The medium of claim 25, wherein the receiving comprises receiving a multiplicity of monocular input images;
    - the deriving comprises deriving a respective plurality of output images for each of the monocular input images;
      
      the operations further comprise displaying the respective pluralities of output images in a combining device, so that the respective pluralities of output images are perceivable by a user as a sequence of single images giving an illusion of motion and having an enhanced three dimensional appearance in which the at least one foreground object moves separately from the at least one background object.
  - 28. The medium of claim 21, wherein the segmenting and applying operations involve using domain knowledge to recognize positions of expected objects in the monocular input image and derive positions of objects in the output images.
  - 29. The medium of claim 21, wherein the respective transformations for background pixels are derived by comparing at least two monocular input images of a single scene.
  - 30. The medium of claim 21, wherein the operations further comprise, prior to applying the transformation, approximating a position of each segmented object as appearing on a fronto-parallel plane.

Current Assignee
Funai Electric Co., Ltd.
Original Assignee
Koninklijke Philips Electronics N.V. (Koninklijke Philips N.V.)
Inventors
Lee, Mi-Suen; Brodsky, Tomas; Weinshall, Daphna; Trajkovic, Miroslav
Primary Examiner(s)
Chauhan, Ulka J.
Assistant Examiner(s)
Pappas, Peter

Application Number

US09/851,445
Publication Number

US 20020167512A1
Time in Patent Office

1,652 Days
Field of Search

345/427, 382/154
US Class Current

345/427
CPC Class Codes

G06T 2207/10012   Stereo images

G06T 7/85   Stereo camera calibration

H04N 13/111   Transformation of image sig...

H04N 13/117   the virtual viewpoint locat...

H04N 13/194   Transmission of image signals

H04N 13/221   using the relative movement...

H04N 13/246   Calibration of cameras

H04N 13/261   with monoscopic-to-stereosc...

H04N 13/289   Switching between monoscopi...

H04N 19/20   using video object coding

H04N 19/23   with coding of regions that...

H04N 2013/0092   Image segmentation from ste...
