Machine synthesis of a virtual video camera/image of a scene from multiple video cameras/images of the scene in accordance with a particular perspective on the scene, an object in the scene, or an event in the scene
First Claim
1. A method of presenting a particular two-dimensional video image of a real-world three-dimensional scene to a viewer comprising:
imaging in multiple video cameras each at a different spatial location multiple two-dimensional images of a real-world scene each at a different spatial perspective;
combining in a computer the multiple two-dimensional images of the scene into a three-dimensional model of the scene;
receiving in the computer from a prospective viewer of the scene a viewer-specified criterion relative to which criterion the viewer wishes to view the scene;
synthesizing in the computer from the three-dimensional model a particular two-dimensional image of the scene in accordance with the received viewer criterion; and
displaying in a video display the particular synthesized two-dimensional image of the real-world scene to the viewer.
Abstract
Each and any viewer of a video or a television scene is his or her own proactive editor of the scene, having the ability to interactively dictate and select--in advance of the unfolding of the scene and by high-level command--a particular perspective by which the scene will be depicted, as and when the scene unfolds. Video images of the scene are selected, or even synthesized, in response to a viewer-selected (i) spatial perspective on the scene, (ii) static or dynamic object appearing in the scene, or (iii) event depicted in the scene. Multiple video cameras, each at a different spatial location, produce multiple two-dimensional video images of the real-world scene, each at a different spatial perspective. Objects of interest in the scene are identified and classified by computer in these two-dimensional images. The two-dimensional images of the scene, and accompanying information, are then combined in the computer into a three-dimensional video database, or model, of the scene. The computer also receives a user/viewer-specified criterion relative to which criterion the user/viewer wishes to view the scene. From the (i) model and (ii) the criterion, the computer produces a particular two-dimensional image of the scene that is in "best" accordance with the user/viewer-specified criterion. This particular two-dimensional image of the scene is then displayed on a video display.
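The processing chain the abstract describes, image the scene from multiple cameras, build a model, receive a viewer criterion, produce the image in "best" accordance with it, can be sketched as a minimal skeleton. All names below are illustrative assumptions, not from the patent, and the "best accordance" step is reduced to picking the stored view whose camera lies nearest a viewer-requested position:

```python
from dataclasses import dataclass, field

@dataclass
class CameraFrame:
    camera_id: int
    position: tuple      # spatial location of the camera that shot the frame
    pixels: object       # the raw two-dimensional image (opaque here)

@dataclass
class SceneModel:
    """Stand-in for the three-dimensional video database built in the computer."""
    frames: list = field(default_factory=list)

    def add(self, frame: CameraFrame):
        self.frames.append(frame)

def best_view(model: SceneModel, criterion):
    """Return the stored 2-D image in 'best' accordance with the viewer
    criterion; here the criterion is a requested viewing position and
    'best' means the spatially nearest camera (an assumed simplification)."""
    def sq_dist(frame):
        return sum((a - b) ** 2 for a, b in zip(frame.position, criterion))
    return min(model.frames, key=sq_dist)

model = SceneModel()
model.add(CameraFrame(0, (0.0, 0.0, 0.0), "img0"))
model.add(CameraFrame(1, (10.0, 0.0, 0.0), "img1"))
best = best_view(model, criterion=(9.0, 1.0, 0.0))   # nearest: camera 1
```

A real system would synthesize a new image from the model rather than select a stored one; the skeleton only shows how the viewer criterion drives the choice.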
28 Claims
1. A method of presenting a particular two-dimensional video image of a real-world three-dimensional scene to a viewer comprising:
imaging in multiple video cameras each at a different spatial location multiple two-dimensional images of a real-world scene each at a different spatial perspective;
combining in a computer the multiple two-dimensional images of the scene into a three-dimensional model of the scene;
receiving in the computer from a prospective viewer of the scene a viewer-specified criterion relative to which criterion the viewer wishes to view the scene;
synthesizing in the computer from the three-dimensional model a particular two-dimensional image of the scene in accordance with the received viewer criterion; and
displaying in a video display the particular synthesized two-dimensional image of the real-world scene to the viewer. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10)
11. A method of presenting a particular two-dimensional video image of a real-world three-dimensional scene to a viewer comprising:
imaging in multiple video cameras each at a different spatial location multiple two-dimensional images of a real-world scene each at a different spatial perspective;
combining in a computer the multiple two-dimensional images of the scene into a three-dimensional model of the scene;
receiving in the computer from a prospective viewer of the scene a viewer-specified criterion relative to which criterion the viewer wishes to view the scene;
synthesizing in the computer from the three-dimensional model, in accordance with the received viewer criterion, a particular two-dimensional image of the scene that is without exact correspondence to any of the images of the real-world scene that are imaged by any of the multiple video cameras; and
displaying in a video display the particular synthesized two-dimensional image of the real-world scene to the viewer.
12. A method of synthesizing a virtual video image from real video images obtained by multiple real video cameras, the method comprising:
storing in a video image database the real two-dimensional video images of a scene from each of a multiplicity of real video cameras;
creating in a computer from the multiplicity of stored two-dimensional video images a three-dimensional video database containing a three-dimensional video image of the scene; and
generating a two-dimensional virtual video image of the scene from the three-dimensional video database. - View Dependent Claims (13, 14, 15, 16, 17, 18, 19, 20)
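One way the generating step of claim 12 could produce a virtual image that matches no single real camera view (as claim 11 requires) is to blend the two nearest stored real images, weighted by the requested viewpoint's distance to each camera. This is an assumed illustration of view interpolation, not the patent's disclosed method, and the variable names are mine:

```python
import numpy as np

def virtual_view(images, cam_positions, viewpoint):
    """Blend the two real images whose cameras are nearest the requested
    viewpoint; the closer camera contributes the larger weight."""
    d = np.linalg.norm(np.asarray(cam_positions, float) - viewpoint, axis=1)
    i, j = np.argsort(d)[:2]              # indices of the two nearest cameras
    w_i = d[j] / (d[i] + d[j])            # weight for the closer camera i
    return w_i * images[i] + (1.0 - w_i) * images[j]

# Two 2x2 toy "images" from cameras at x=0 and x=4; viewpoint at x=1,
# so the blend is 3/4 of the first image plus 1/4 of the second.
imgs = [np.zeros((2, 2)), np.full((2, 2), 4.0)]
pos = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
v = virtual_view(imgs, pos, np.array([1.0, 0.0, 0.0]))
```

The blended result equals neither stored image, i.e. it is "without exact correspondence" to any real camera's image.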
21. A method of synthesizing a virtual video image from real video images obtained by multiple real video cameras, the method comprising:
storing in a video image database the real two-dimensional video images of a scene from each of a multiplicity of real video cameras;
creating in a computer from the multiplicity of stored two-dimensional video images a three-dimensional video database containing a three-dimensional video image of the scene; and
generating a two-dimensional virtual video image of the scene from the three-dimensional video database by selecting from the three-dimensional video database a two-dimensional virtual video image of the scene that corresponds to a real two-dimensional video image of the scene.
22. A system for presenting video images of a real-world scene in accordance with a predetermined criterion, the system comprising:
multiple video imagers, each at a different spatial location, for producing multiple two-dimensional video images of a real-world scene each at a different spatial perspective;
a viewer interface at which a prospective viewer of the scene may specify a criterion relative to which criterion the viewer wishes to view the scene;
a computer, receiving the multiple two-dimensional video images of the scene from the multiple video imagers and the viewer-specified criterion from the viewer interface, for producing from the multiple two-dimensional video images of the scene a three-dimensional model of the scene, and for synthesizing from the three-dimensional model a particular two-dimensional virtual video image of the scene in accordance with the viewer-specified criterion; and
a video display, receiving the particular two-dimensional video image of the scene from the computer, for displaying the particular two-dimensional video image of the real-world scene to the viewer. - View Dependent Claims (23)
24. A system for presenting video images of a real-world scene in accordance with a predetermined criterion, the system comprising:
multiple video cameras, each having an orientation and a lens parameter and a location that is separate from all other video cameras, for producing multiple raw two-dimensional video images of a real-world scene each at a different spatial perspective;
a camera scene builder computer, receiving the multiple raw video images from the multiple video cameras, for producing, in consideration of the orientation, the lens parameter, and the location of each of the multiple video cameras, multiple two-dimensional video images of a real-world scene that are of a known spatial relationship, as well as at a different spatial perspective, one to the next, wherein the spatial positions of all the multiple two-dimensional video images of a real-world scene are known;
a viewer interface at which a prospective viewer of the scene may specify a criterion relative to which criterion the viewer wishes to view the scene;
a computer, receiving the multiple two-dimensional video images of the scene from the multiple video cameras and the viewer-specified criterion from the viewer interface, for producing from the multiple two-dimensional video images of the scene a three-dimensional model of the scene, and for producing from the three-dimensional model a particular two-dimensional video image of the scene in accordance with the viewer-specified criterion; and
a video display, receiving the particular two-dimensional video image of the scene from the computer, for displaying the particular two-dimensional video image of the real-world scene to the viewer.
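The "camera scene builder" element of claim 24 pairs each raw image with the shooting camera's location, orientation, and lens parameter, so that every two-dimensional image has a known spatial position relative to the others. A minimal sketch, with all names and the pose representation assumed for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CameraPose:
    location: tuple        # (x0, y0, z0) in world coordinates
    orientation: tuple     # e.g. (pan, tilt, roll) angles, representation assumed
    lens: float            # lens parameter (degree of zoom)

def build_scene(raw_images, poses):
    """Tag each raw 2-D image with the pose of the camera that produced it,
    so the spatial position of every image in the scene is known."""
    if len(raw_images) != len(poses):
        raise ValueError("one camera pose is required per raw image")
    return [{"image": img, "pose": pose} for img, pose in zip(raw_images, poses)]

scene = build_scene(
    ["raw0", "raw1"],
    [CameraPose((0, 0, 0), (0, 0, 0), 50.0),
     CameraPose((10, 0, 0), (0, -30, 0), 35.0)],
)
```

Downstream, the model-building computer can read each entry's `pose` to place its image in a common world coordinate system.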
25. A method of building a three-dimensional video model of a three-dimensional real-world scene, and of extracting video information regarding the real-world scene from the model built, the method comprising:
imaging in multiple video cameras multiple frames of two-dimensional video of the three-dimensional real-world scene, the two-dimensional frames from each camera arising from a unique spatial perspective on the scene;
first-analyzing the scene in two dimensions by extracting feature points from the two-dimensional video frames in order to annotate the two-dimensional video frames by certain image information contained therein, thus producing multiple annotated two-dimensional video frames;
second-analyzing in a computer the scene in three dimensions by transforming the multiple annotated two-dimensional video frames into a three-dimensional video model in which model is contained three-dimensional video of the scene, while extracting and correlating information from the annotated two-dimensional video frames so as to annotate the three-dimensional video model of the scene with such information, thus producing an annotated three-dimensional video model;
generating in a computer from the annotated three-dimensional video model (i) a two-dimensional virtual video image (ii) synthesized in accordance with some criterion interpretable and interpreted by reference to the scene image information, thus producing a synthesized virtual two-dimensional video image; and
displaying in a display the synthesized two-dimensional video image;
wherein frames from multiple video cameras were first-analyzed in order to produce the annotated two-dimensional video frames;
wherein the annotated two-dimensional video frames were themselves second-analyzed to produce the annotated three-dimensional video model;
wherein the interpreting, in the generating step, of the criterion by reference to the three-dimensional scene image information is thus, ultimately, an interpretation by reference to scene image information that arose from multiple video cameras; and
wherein the image displayed is selected by reference to scene image information that arose from more than just one video camera, and, indeed, by reference to scene image information that arose from multiple video cameras.
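The "first-analyzing" step of claim 25 extracts feature points from each two-dimensional frame and annotates the frame with them. The sketch below uses a simple gradient-magnitude threshold as a stand-in detector; the detector choice, the threshold, and all names are assumptions for illustration, not the patent's method:

```python
import numpy as np

def extract_feature_points(frame, threshold=1.0):
    """Return (p0, q0) image coordinates of pixels whose brightness-gradient
    magnitude exceeds a threshold (an illustrative stand-in detector)."""
    gy, gx = np.gradient(frame.astype(float))   # row and column derivatives
    mag = np.hypot(gx, gy)
    ys, xs = np.nonzero(mag > threshold)
    return list(zip(xs.tolist(), ys.tolist()))

def annotate(frame):
    """Produce an 'annotated two-dimensional video frame': the pixels
    together with the image information extracted from them."""
    return {"pixels": frame, "feature_points": extract_feature_points(frame)}

# A flat 5x5 frame with one bright pixel at (x=2, y=2): feature points
# appear at the four neighbors where the gradient is nonzero.
frame = np.zeros((5, 5))
frame[2, 2] = 10.0
annotated = annotate(frame)
```

The second-analyzing step would then correlate such feature points across cameras to place them in the three-dimensional model.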
26. A method of building a three-dimensional video model of a three-dimensional real-world scene, and of extracting video information regarding the real-world scene from the model built, the method comprising:
imaging a three-dimensional real-world scene having coordinates (x,y,z) by multiple cameras, each having camera reference frame coordinates (p,q,s) that are different than the camera reference frame coordinates of any other camera, so as to produce multiple frames of two-dimensional video each having coordinates (p,q);
first-analyzing the scene in two dimensions by extracting feature points from the two-dimensional video frames in order to annotate the two-dimensional video frames by certain image information contained therein, thus producing multiple annotated two-dimensional video frames, the first-analyzing serving to extract feature points of coordinates (p0,q0) from the two-dimensional video frames;
second-analyzing in a computer the scene in three dimensions by transforming the multiple annotated two-dimensional video frames into a three-dimensional video model in which model is contained three-dimensional video of the scene, particularly by transforming a point (x,y,z) in the world coordinate system to a point (p,q,s) in the camera coordinate system by
(p, q, s)^T = R (x - x0, y - y0, z - z0)^T
where R is a transformation matrix from the world coordinate system to the camera coordinate system and (x0,y0,z0) is the position of the camera, and by projecting a point (p,q,s) in the camera coordinate system to a point (u,v) on the image plane according to
u = f p / s,  v = f q / s
where f is a camera parameter that determines the degree of zoom in or zoom out, wherein an image coordinate (u,v) that corresponds to world coordinate (x,y,z) is determined depending on the (i) camera position, (ii) camera angle and (iii) camera parameter, while extracting and correlating information from the annotated two-dimensional video frames so as to annotate the three-dimensional video model of the scene with such information, thus producing an annotated three-dimensional video model;
generating in a computer from the annotated three-dimensional video model (i) a two-dimensional video image (ii) selected in accordance with some criterion interpretable and interpreted by reference to the scene image information, thus producing a selected two-dimensional video image; and
displaying in a display the selected two-dimensional video image;
wherein frames from multiple video cameras were first-analyzed in order to produce the annotated two-dimensional video frames;
wherein the annotated two-dimensional video frames were themselves second-analyzed to produce the annotated three-dimensional video model;
wherein the interpreting, in the generating step, of the criterion by reference to the three-dimensional scene image information is thus, ultimately, an interpretation by reference to scene image information that arose from multiple video cameras; and
wherein the image displayed is selected by reference to scene image information that arose from more than just one video camera, and, indeed, by reference to scene image information that arose from multiple video cameras. - View Dependent Claims (27, 28)
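Claim 26's two transformations, world-to-camera and camera-to-image-plane, follow the standard pinhole camera model and can be checked numerically. The helper names below are mine; the math is exactly the claim's, read with R as the world-to-camera rotation and f as the zoom parameter:

```python
import numpy as np

def world_to_camera(xyz, R, cam_pos):
    """Claim 26's first transform: (x,y,z) in world coordinates to (p,q,s)
    in camera coordinates, [p, q, s]^T = R [x - x0, y - y0, z - z0]^T."""
    return R @ (np.asarray(xyz, float) - np.asarray(cam_pos, float))

def camera_to_image(pqs, f):
    """Claim 26's projection: (p,q,s) in camera coordinates to (u,v) on the
    image plane, u = f*p/s and v = f*q/s, where f sets the degree of zoom."""
    p, q, s = pqs
    return f * p / s, f * q / s

# Camera at the world origin with identity orientation (R = I), zoom f = 2:
pqs = world_to_camera((1.0, 2.0, 4.0), np.eye(3), (0.0, 0.0, 0.0))
u, v = camera_to_image(pqs, f=2.0)   # u = 2*1/4 = 0.5, v = 2*2/4 = 1.0
```

Doubling f doubles both (u, v), which is the "zoom in" behavior the claim attributes to the camera parameter; changing R or the camera position moves (u, v), matching the claim's dependence on camera angle and position.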
Specification