Machine dynamic selection of one video camera/image of a scene from multiple video cameras/images of the scene in accordance with a particular perspective on the scene, an object in the scene, or an event in the scene

US 5,729,471 A
Filed: 03/31/1995
Issued: 03/17/1998
Est. Priority Date: 03/31/1995
Status: Expired due to Term

Abstract

Each and any viewer of a video or a television scene is his or her own proactive editor of the scene, having the ability to interactively dictate and select--in advance of the unfolding of the scene and by high-level command--a particular perspective by which the scene will be depicted, as and when the scene unfolds. Video images of the scene are selected, or even synthesized, in response to a viewer-selected (i) spatial perspective on the scene, (ii) static or dynamic object appearing in the scene, or (iii) event depicted in the scene. Multiple video cameras, each at a different spatial location, produce multiple two-dimensional video images of the real-world scene, each at a different spatial perspective. Objects of interest in the scene are identified and classified by computer in these two-dimensional images. The two-dimensional images of the scene, and accompanying information, are then combined in the computer into a three-dimensional video database, or model, of the scene. The computer also receives a user/viewer-specified criterion relative to which criterion the user/viewer wishes to view the scene. From the (i) model and (ii) the criterion, the computer produces a particular two-dimensional image of the scene that is in "best" accordance with the user/viewer-specified criterion. This particular two-dimensional image of the scene is then displayed on a video display. From its knowledge of the scene and of the objects and the events therein, the computer may also answer user/viewer-posed questions regarding the scene and its objects and events.
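
The selection step described in the abstract can be illustrated with a minimal sketch for the case of a viewer-specified spatial perspective. Everything named here is an assumption made for illustration: the `Camera` class, the `select_camera` function, the idea of expressing the requested perspective as a viewing direction, and the cosine-similarity score. The abstract says only that the image in "best" accordance with the criterion is produced; it does not state how "best" is computed.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Camera:
    """Hypothetical camera record: a label and a world-space position."""
    name: str
    position: tuple  # (x, y, z) in world coordinates


def _unit(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0:
        raise ValueError("zero-length vector has no direction")
    return tuple(c / norm for c in v)


def view_direction(camera, target):
    """Unit vector pointing from the camera toward the target point."""
    return _unit(tuple(t - p for t, p in zip(target, camera.position)))


def select_camera(cameras, target, desired_direction):
    """Pick the real camera whose line of sight to the target is closest
    (by cosine similarity) to the direction the viewer asked for."""
    want = _unit(desired_direction)

    def score(cam):
        got = view_direction(cam, target)
        return sum(a * b for a, b in zip(got, want))

    return max(cameras, key=score)


if __name__ == "__main__":
    cameras = [
        Camera("north", (0, 50, 10)),
        Camera("east", (50, 0, 10)),
        Camera("overhead", (0, 0, 60)),
    ]
    # The viewer asks to look straight down at the object at the origin.
    best = select_camera(cameras, target=(0, 0, 0), desired_direction=(0, 0, -1))
    print(best.name)  # overhead
```

The sketch always returns one of the real cameras, which matches the claims that select an actual image rather than synthesizing a virtual one. A different score (for example, one that also penalizes distance or occlusion) could be substituted without changing the structure.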

30 Claims

1. A method of presenting to a viewer a particular two-dimensional video image of a real-world three dimensional scene containing an object comprising:
- imaging in multiple video cameras each at a different spatial location multiple two-dimensional images of a real-world scene each at a different spatial perspective not all of which scene perspectives may always and invariably show the object in the scene;
  
  combining in a computer the multiple two-dimensional images of the scene into a three-dimensional model of the scene so as to generate a three-dimensional model of the scene in which model the object in the scene is identified;
  
  selecting in the computer from the three-dimensional model a particular two-dimensional image of the scene, corresponding to one of the images of the real-world scene that is imaged by one of the multiple video cameras, showing the object; and
  
  displaying in a video display the particular two-dimensional image of the real-world scene showing the object to the viewer.
- View Dependent Claims (2, 3, 4, 5, 6, 7)
- - 2. The method according to claim 1 wherein the combining is so as to generate a three-dimensional model of the scene in which model objects in the scene are identified;
    - wherein the receiving is of the viewer-specified criterion of a selected object that the viewer wishes to particularly view within the scene; and
      
      wherein the selecting in the computer from the three-dimensional model is of a particular two-dimensional image of the selected object in the scene; and
      
      wherein the displaying in the video display is of the particular two-dimensional image of the scene showing the viewer-selected object.
  - 3. The method according to claim 2 wherein the viewer-selected object in the scene is static, and unmoving, in the scene.
  - 4. The method according to claim 2 wherein the viewer-selected object in the scene is dynamic, and moving, in the scene.
  - 5. The method according to claim 2 wherein the viewer selects the object that he or she wishes to particularly view in the scene by act of positioning a cursor on the video display, which cursor unambiguously specifies an object in the scene by an association between the object position and the cursor position in three dimensions and is thus a three-dimensional cursor.
  - 6. The method according to claim 1 performed in real time as television presented to a viewer interactively in accordance with the viewer-specified criterion.
  - 7. The method according to claim 1 applied to a real-world three dimensional scene containing a plurality of objects wherein the imaging in multiple video cameras each at a different spatial location is of multiple two-dimensional images of the real-world scene containing the plurality of objects each at a different spatial perspective;
    - wherein the combining in a computer of the multiple two-dimensional images of the scene into a three-dimensional model of the scene is so as to generate a three-dimensional model of the scene in which model the plurality of objects in the scene are identified;
      
      and wherein, before the selecting, the method further comprises;
      
      receiving in the computer from a prospective viewer of the scene a viewer-specified criterion of a particular one of the plurality of scene objects relative to which particular one object the viewer wishes to view the scene;
      
      wherein the selecting in the computer from the three-dimensional model is of a particular two-dimensional image of the scene, corresponding to one of the images of the real-world scene that is imaged by one of the multiple video cameras, showing the viewer-selected object; and
      
      wherein the displaying in a video display the particular two-dimensional image of the real-world scene showing the viewer-selected object to the viewer.

8. A method of presenting to a viewer a particular two-dimensional video image of a real-world three dimensional scene containing an object, the method comprising:
- imaging in multiple video cameras each at a different spatial location multiple two-dimensional images of the real-world scene containing the object each at a different spatial perspective;
  
  combining in a computer the multiple two-dimensional images of the scene into a three-dimensional model of the scene containing the scene object;
  
  receiving in the computer from a prospective viewer of the scene a viewer-specified particular spatial perspective, relative to which particular spatial perspective the viewer wishes to view the object in the scene;
  
  selecting in the computer from the three-dimensional model a particular two-dimensional image of the scene corresponding to one of the images of the real-world scene that is imaged by one of the multiple video cameras in accordance with the particular spatial perspective received from the viewer, this selected image being an actual image of the scene, out of all the actual images of the scene as were imaged by all the multiple video cameras, that is most closely shows the object in accordance with the particular spatial perspective criterion received from the viewer; and
  
  displaying in a video display the particular two-dimensional image of the real-world scene showing the object at the desired spatial perspective to the viewer.
- View Dependent Claims (9, 10)
- - 9. The method according to claim 8 wherein the selecting is, over time, of plural actual images of the scene as are imaged, over time, by plural ones of the multiple video cameras;
    - wherein the computer does not invariably select from the three-dimensional model an image that arises from one only of the multiple video cameras, but instead selects plural images as arise over time from plural ones of the multiple video cameras.
  - 10. The method of presenting to a viewer a particular two-dimensional video image of a real-world three dimensional scene according to claim 8 applied to a scene containing a moving object wherein the imaging in multiple video cameras each at a different spatial location multiple two-dimensional images is of the real-world scene containing the moving object each at a different spatial perspective;
    - wherein the combining in a computer of the multiple two-dimensional images of the scene is into a three-dimensional model of the scene containing the moving object;
      
      wherein the receiving in the computer from the prospective viewer of the scene is of a viewer-specified particular spatial perspective relative to which particular spatial perspective the viewer wishes to view the moving object in the scene;
      
      wherein the selecting in the computer from the three-dimensional model is of a particular two-dimensional image of the scene corresponding to one of the images of the real-world scene that is imaged by one of the multiple video cameras in accordance with the particular spatial perspective received from the viewer, this selected image being an actual image of the scene, out of all the actual images of the scene as were imaged by all the multiple video cameras, that is most closely shows the moving object in accordance with the particular spatial perspective criterion received from the viewer; and
      
      wherein the displaying in a video display the particular two-dimensional image of the real-world scene showing the moving object at the desired spatial perspective to the viewer.

11. A method of presenting a particular two-dimensional video image of a real-world three dimensional scene to a viewer comprising:
- imaging in multiple video cameras each at a different spatial location multiple two-dimensional images of a real-world scene each at a different spatial perspective;
  
  combining in a computer the multiple two-dimensional images of the scene into a three-dimensional model of the scene so as to generate a three-dimensional model of the scene in which model events in the scene are identified;
  
  receiving in the computer from a prospective viewer of the scene a viewer-specified criterion of a selected event that the viewer wishes to particularly view the scene;
  
  selecting in the computer from the three-dimensional model in accordance with the viewer-specified criterion a particular two-dimensional image of the scene, corresponding to one of the images of the real-world scene that is imaged by one of the multiple video cameras, showing the viewer-selected event; and
  
  displaying in a video display the particular two-dimensional image of the real-world scene showing the viewer-selected event to the viewer.
- View Dependent Claims (12)
- - 12. The method according to claim 11 wherein the viewer selects the event that he or she wishes to particularly view in the scene by act of positioning a cursor on the video display, which cursor unambiguously specifies an event in the scene by an association between the event position and the cursor position in three dimensions and is thus a three-dimensional cursor.

13. A method of selecting a video image showing a one object from multiple real video images obtained by a multiplicity of real video cameras showing a scene containing multiple objects, the method comprising:
- storing in a video image database the real two-dimensional video images of the scene containing multiple objects as the video images arise from each of a multiplicity of real video cameras;
  
  creating in a computer from the multiplicity of stored two-dimensional video images a three-dimensional video database containing a three-dimensional video image of the scene;
  
  selecting in the computer a real two-dimensional video image of the scene showing the one object from the three-dimensional video database; and
  
  displaying the selected real two-dimensional video image.
- View Dependent Claims (14, 15, 16, 17, 18, 19, 20, 21, 22)
- - 14. The method according to claim 13 wherein the generating comprises:
    - synthesizing from the three-dimensional video database a two-dimensional virtual video image of the scene that is without correspondence to any real two-dimensional video image of a scene.
  - 15. The method according to claim 13 further comprising:
    - receiving in the computer a criterion of a spatial perspective, which spatial perspective is not that of any of the multiplicity of real video cameras, on the scene as is imaged within the three-dimensional video database;
      
      wherein the selecting of the two-dimensional virtual video image is so as to best approximate showing the one object in the scene from the received spatial perspective.
  - 16. The method according to claim 15 wherein the received spatial perspective is static, and fixed, during the video of the scene.
  - 17. The method according to claim 15 wherein the received spatial perspective is dynamic, and variable, during the video of the scene.
  - 18. The method according to claim 15 wherein the received spatial perspective is so dynamic and variable dependent upon occurrences in the scene.
  - 19. The method according to claim 13 that, between the creating and the selecting, further comprises:
    - locating a selected object in the scene as is imaged within the three-dimensional video database;
      
      wherein the selecting of the two-dimensional virtual video image is so as to best show the selected object.
  - 20. The method according to claim 13 that, between the creating and the selecting, further comprises:
    - dynamically tracking the scene as is imaged within the three-dimensional video database in order to recognize any occurrence of a predetermined event in the scene;
      
      wherein the selecting of the two-dimensional virtual video image is so as to best show the predetermined event.
  - 21. The method according to claim 13 wherein the selecting of the two-dimensional virtual video image is on demand.
  - 22. The method according to claim 13 wherein the selecting of the two-dimensional video image is in real time on demand, thus interactive television.

23. A system for presenting video images of a real-world scene containing a plurality of objects in accordance with a predetermined criterion, the system comprising:
- multiple video imagers each at a different spatial location for producing multiple two-dimensional video images of the real-world scene each at a different spatial perspective;
  
  a viewer interface at which a prospective viewer of the scene may specify a criterion designating a particular one of the plurality of objects relative to which particular one object in the scene the viewer wishes to view the scene;
  
  a computer, receiving the multiple two-dimensional video images of the scene from the multiple video imagers and the viewer-specified criterion from the viewer interface, for producing from the multiple two-dimensional video images of the scene a three-dimensional model of the scene; and
  
  for selecting from the three-dimensional model a particular two-dimensional video image of the scene showing the viewer-selected object in accordance with the viewer-specified criterion; and
  
  video display, receiving the particular two-dimensional video image of the scene from the computer, for displaying the particular two-dimensional video image of the real-world scene showing the viewer-selected object to the viewer.
- View Dependent Claims (24, 25)
- - 24. The system according to claim 23 wherein the multiple video imagers comprise:
    - multiple video cameras, each having an orientation and a lens parameter and a location that is separate from all other video cameras, each for producing a raw video image; and
      
      a camera scene builder computer, receiving the multiple raw video images from the multiple video cameras, for selecting in consideration of the orientation, the lens parameter, and the location of each of the multiple video cameras, two-dimensional video images of a real-world scene that are of a known spatial relationship, as well as at a different spatial perspective, one to the next;
      
      wherein the spatial positions of all the all the multiple two-dimensional video images of a real-world scene are known.
  - 25. The system according to claim 23 wherein the viewer interface has and presents a three-dimensional cursor manipulatable by a prospective viewer of the scene so as to unambiguously specify any object in the scene even when the specified object is partially obscured by other objects in the scene.

26. A method of building a three-dimensional video model of a three-dimensional real-world scene, and of extracting video information regarding the real world scene from the model built, the method comprising:
- imaging in multiple video cameras multiple frames of two-dimensional video of the three-dimensional real world scene, the two-dimensional frames from each camera arising from a unique spatial perspective on the scene;
  
  first-analyzing the scene in two dimensions by extracting feature points from the two-dimensional video frames in order to annotate the two-dimensional video frames by certain image information contained therein, thus producing multiple annotated two-dimensional video frames;
  
  second-analyzing in a computer the scene in three dimensions by transforming the multiple annotated two-dimensional video frames into a three-dimensional video model in which model is contained three-dimensional video of the scene, while extracting and correlating information from the annotated two-dimensional video frames so as to annotate the three-dimensional video model of the scene with such information, thus producing a three-dimensional video model annotated with scene image information, thus producing an annotated three dimensional video model;
  
  selecting in a computer from the annotated three-dimensional video model (i) a two-dimensional video image (ii) in accordance with some criterion interpretable and interpreted by reference to the scene image information, thus producing a selected two-dimensional video image; and
  
  displaying in a display the selected two-dimensional video image;
  
  wherein frames from multiple video cameras were first-analyzed in order to produce the annotated two-dimensional video frames;
  
  wherein the annotated two-dimensional video frames were themselves second-analyzed to produce the annotated three-dimensional video model;
  
  wherein the interpreting, in the selecting step, of the criterion by reference to the three-dimensional scene image information is thus, ultimately, an interpretation by reference to scene image information that arose from multiple video cameras;
  
  wherein the image displayed is selected by reference to scene image information that is arose from more than just one video camera, and, indeed, is selected by reference to scene image information that arose from multiple video cameras.
- View Dependent Claims (27, 28, 29)
- - 27. The method according to claim 26 wherein the imaging is of the three-dimensional real world scene having coordinates (x,y,z) by multiple cameras each having reference frame coordinates (p,q,s) that are different than are the camera reference frame coordinates of any other camera so as to produce multiple frames of two-dimensional video each having coordinates (p,q);
    - wherein the first-analyzing extracts feature points of coordinates (p₀,q₀) from the two-dimensional video frames;
      
      wherein the second-analyzing serves to produce the three-dimensional video model of the scene by transforming a point (x,y,z) in the world coordinate system to a point (p,q,s) in the camera coordinate system by ##EQU3## where R is a transformation matrix from the world coordinate system to the camera coordinate system, and (x₀,y₀,z₀) is the position of the camera, and by projecting a point (p,q,s) in the camera coordinate system to a point (u,v) on the image plane according by ##EQU4## where f is camera parameter that determines the degree of zoom in or zoom out;
      
      wherein an image coordinate (u,v) that corresponds to world coordinate (x,y,z) is determined depending on the (i) camera position, (ii) camera angle and (iii) camera parameter.
  - 28. The method according to claim 27 that, a first step, further comprises:
    - calibrating each of the multiple cameras by observing a known point, knowing thereby the observed point a pair of image coordinates and corresponding world coordinates, applying this known pair to the equations of claim 28 so as to obtain two equations regarding the seven parameters that determine camera status, repeating the observing, the knowing and the applying for at least four known points so as to, the minimum equations to solve the seven unknown parameters thus being provided, solve the equations and calibrate the camera coordinate system (p,q,s) to the world coordinate system (x,y,z).
  - 29. The method according to claim 27 wherein the transforming a point (x,y,z) in the world coordinate system to a point (p,q,s) in the camera coordinate system, and the projecting of the point (p,q,s) in the camera coordinate system to a point (u,v) on the image plane, assumes, a simplifying assumption, that all points (u,v) are constrained to lie in a plane.

30. A method of presenting to a viewer a particular two-dimensional video image of a real-world three dimensional scene containing a moving object, the method comprising:
- imaging in multiple video cameras each at a different spatial location multiple two-dimensional images of the real-world scene each at a different spatial perspective, not all of which different scene perspectives always and invariably show the object as it moves;
  
  combining in a computer the multiple two-dimensional images of the scene into a three-dimensional model of the scene containing the scene's moving object;
  
  selecting in the computer from the three-dimensional model a particular two-dimensional image of the scene that, out of all the actual images of the scene as were imaged by all the multiple video cameras, most closely shows the moving object; and
  
  displaying in a video display the particular two-dimensional image of the real-world scene showing the moving object.
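
The coordinate transformation recited in claim 27, and the idea in claims 1 and 30 that not every camera shows the object, can be sketched as follows. The equations themselves are elided in this listing (the `##EQU3##` and `##EQU4##` placeholders), so the sketch assumes the common pinhole form: (p,q,s) = R·((x,y,z) − (x₀,y₀,z₀)), then u = f·p/s and v = f·q/s. That form, the `CameraModel` class, the function names, and the rectangular image bounds are all assumptions for illustration and may differ from the patent's actual equations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CameraModel:
    """Hypothetical camera description used only for this sketch."""
    name: str
    R: tuple            # 3x3 world-to-camera rotation, as nested tuples
    position: tuple     # (x0, y0, z0) camera position in world coordinates
    f: float            # zoom parameter
    half_width: float   # image plane extends over [-half_width, half_width]
    half_height: float  # and over [-half_height, half_height]


def world_to_camera(point, cam):
    """Assumed form of the first elided equation: (p, q, s) = R * (X - X0)."""
    d = [point[i] - cam.position[i] for i in range(3)]
    return tuple(sum(cam.R[r][c] * d[c] for c in range(3)) for r in range(3))


def project(pqs, f):
    """Assumed form of the second elided equation: u = f*p/s, v = f*q/s.
    Returns None when the point is not in front of the camera."""
    p, q, s = pqs
    if s <= 0:
        return None
    return (f * p / s, f * q / s)


def shows_object(point, cam):
    """True if the world point projects inside the camera's image bounds."""
    uv = project(world_to_camera(point, cam), cam.f)
    return (
        uv is not None
        and abs(uv[0]) <= cam.half_width
        and abs(uv[1]) <= cam.half_height
    )


def cameras_showing(point, cameras):
    """Names of the cameras whose image contains the point."""
    return [cam.name for cam in cameras if shows_object(point, cam)]


if __name__ == "__main__":
    identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
    turned_around = ((-1, 0, 0), (0, 1, 0), (0, 0, -1))  # 180 degrees about y
    cameras = [
        CameraModel("front", identity, (0, 0, -10), 2.0, 1.0, 1.0),
        CameraModel("back", turned_around, (0, 0, -10), 2.0, 1.0, 1.0),
    ]
    obj = (1, 2, 0)
    print(project(world_to_camera(obj, cameras[0]), cameras[0].f))
    print(cameras_showing(obj, cameras))  # ['front']
```

Under these assumptions each camera contributes a position (three values), an orientation (three angles inside R), and f, which is consistent with the seven parameters mentioned in claim 28. The sketch ignores occlusion by other objects, which a real visibility test would also need to consider.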

Current Assignee
Regents of the University of California (University of California)
Original Assignee
Regents of the University of California (University of California)
Inventors
Wakimoto, Koji; Jain, Ramesh
Primary Examiner(s)
Trammell, James P.
Assistant Examiner(s)
Peeso, Thomas R.

Application Number

US08/414,437
Time in Patent Office

1,082 Days
Field of Search

364/514 A, 364/410, 395/125, 395/129, 395/119, 395/155, 348/13, 348/19, 348/20, 348/39, 348/42, 348/47, 348/51, 273/433, 273/444
US Class Current

725/131
CPC Class Codes

G05B 2219/32014   Augmented reality assists o...

G06T 15/10   Geometric effects

G06T 2207/10012   Stereo images

G06T 2207/10021   Stereoscopic video; Stereos...

H04N 13/117   the virtual viewpoint locat...

H04N 13/139   Format conversion, e.g. of ...

H04N 13/156   Mixing image signals

H04N 13/167   Synchronising or controllin...

H04N 13/189   Recording image signals; Re...

H04N 13/194   Transmission of image signals

H04N 13/243   using three or more 2D imag...

H04N 13/246   Calibration of cameras

H04N 13/257   Colour aspects

H04N 13/279   the virtual viewpoint locat...

H04N 13/289   Switching between monoscopi...

H04N 13/296   Synchronisation thereof; Co...

H04N 13/334   using spectral multiplexing

H04N 13/337   using polarisation multiple...

H04N 13/341   using temporal multiplexing

H04N 13/344   with head-mounted left-righ...

H04N 13/363   using image projection scre...

H04N 19/597   specially adapted for multi...

H04N 2013/0081   Depth or disparity estimati...

H04N 2013/0085   Motion estimation from ster...

H04N 2013/0092   Image segmentation from ste...

H04N 2013/0096   Synchronisation or controll...

H04N 5/222   Studio circuitry; Studio de...

H04N 5/2627   for providing spin image ef...

H04N 5/77   between a recording apparat...
