AUGMENTED REALITY IMAGE DISPLAY SYSTEMS AND METHODS
This invention is an augmented reality device that displays virtual screens at user-specified positions, taking inputs from the electronic medical record, PACS system, and imaging devices. The virtual screens can be displayed in any orientation. Machine learning tools will improve ease of workflow. Collaboration tools will be available. An API will ease interoperability between radiology imaging devices and augmented reality systems. Devices such as catheters/needles impregnated with radiopaque markers and other radiopaque target markers are also discussed.
- 1. A system, comprising:
a visual input sensor, wherein the visual input sensor is operable to capture a live stream video of a patient; an image interface, wherein the image interface is operable to receive image data of the patient from a plurality of different imaging devices; at least one visual display device; at least one processor; and at least one non-transitory computer-readable storage medium, wherein the at least one non-transitory computer-readable storage medium stores one or more processor-executable instructions that, when executed by the at least one processor, cause the at least one processor to: receive the live stream video of the patient, receive the image data of the patient, determine an orientation of the patient from the live video stream, generate a video output of the live video stream for outputting via the at least one visual display device, and overlay the image data of the patient in the video output, wherein the overlaid image data is aligned with the orientation of the patient.
- View Dependent Claims (2, 3, 4, 5)
- 6. A method, comprising:
capturing a live stream video of a patient via at least one visual input sensor; receiving image data of the patient from a plurality of different imaging devices; determining an orientation of the patient from the live video stream; generating a video output of the live video stream; overlaying the image data of the patient in the video output, wherein the overlaid image data is aligned with the orientation of the patient; and outputting, via at least one visual display device, the video output and overlaid image data.
- View Dependent Claims (7, 8, 9, 10)
- 11. A non-transitory computer readable storage medium comprising a set of instructions executable by a computer, the non-transitory computer readable storage medium comprising:
instructions for capturing a live stream video of a patient via at least one visual input sensor; instructions for receiving image data of the patient from a plurality of different imaging devices; instructions for determining an orientation of the patient from the live video stream; instructions for generating a video output of the live video stream; instructions for overlaying the image data of the patient in the video output, wherein the overlaid image data is aligned with the orientation of the patient; and instructions for outputting, via at least one visual display device, the video output and overlaid image data.
- View Dependent Claims (12, 13, 14, 15)
This application claims priority to U.S. Provisional Patent Application Ser. No. 62/655,274, filed on Apr. 10, 2018, entitled “AUGMENTED REALITY IMAGE DISPLAY SYSTEM,” currently pending, the entire disclosure of which is incorporated herein by reference.
Minimally invasive surgical techniques such as laparoscopy, endovascular surgery, and interventional radiology have obvious benefits in minimizing tissue damage. The main problems arise from disruption of the feedback loops to the operator: visual feedback is obstructed by the patient's tissue, so the operator is unable to directly view the surgical site.
Current methods for visualizing the surgical site include cameras or noninvasive imaging modalities, most commonly ultrasound and fluoroscopy. These devices output visual information to a screen that is either physically attached to the imaging device (e.g., ultrasound), or is displayed on a screen attached to a boom. While the resolution of these screens can be satisfactory there are numerous limitations to these screens.
First, the screens are not in the same orientation as the patient. In endovascular procedures the patient lies horizontally on the procedure table while the screen is upright, so the operator views an upright screen while intervening on a horizontal patient. This requires increased mental effort to geometrically map the image onto the patient, and the operator must constantly reorient toward the patient, causing context switching that in and of itself increases task time and decreases accuracy. Moreover, the large distance between the screen and the patient forces repeated eye accommodation and convergence, leading to visual fatigue. The combination of these factors adversely impacts task performance.
A second issue is ergonomics. Work-related musculoskeletal disorders receive little attention, but can cause significant morbidity. When screens are attached to the machines themselves, they are constrained by limited space in the operating room. This can lead to poor placement of the screens. When screens are attached to a boom, often they cannot be lowered sufficiently. In addition, with multiple people in the room, not everyone's preferences can be accommodated. Musculoskeletal conditions for operators can lead to significant morbidity and economic cost from lost workdays. These can in turn impact patient welfare.
Surgical operating theaters are sterile environments to limit the risk of infection and harm to the patient. The process of interacting with the imaging information however is not sterile. Thus, operators are unable to access crucial patient and imaging information. They must either ask another person in the room to display the information or they must un-glove, break the sterile field, and then re-glove. This process is time consuming, costly, and can be ultimately harmful to the patient. Another unintended consequence of the mandate to maintain sterility is that collaboration between surgeons and surgical staff is often limited. Since there is no way to interact with the imaging devices/information without breaking sterility, operators using conventional techniques require workarounds that cause distraction and context switching, but more seriously can risk inadvertent voiding of sterility which can cause increased infection.
Categories of conventional systems include projector-based, boom- and/or gantry-based, and headset displays. Conventional projector-based systems use a mounted video projector that displays images onto the patient. Conventional boom- and/or gantry-based devices use a monitor that is positioned over the patient that can be used for surgical planning or intraoperative guidance. Conventional headset-based devices use a video see-through display. However, these conventional systems are mostly static and still require patient information and surgical data to be displayed on other screens. A barrier to widespread adoption of the above techniques is that their operation is not intuitively integrated into the end user's workflow. Image registration and manipulation must be accomplished preoperatively or by a non-sterile technologist intraoperatively. Projectors and screens mounted to gantries are cumbersome enough to prevent regular use.
An aspect of the invention includes an augmented reality (AR) system that creates interactive virtual screens for surgical collaboration. The system interfaces with the image output from the imaging device and displays the information on a virtual screen. The system is operable to receive multiple inputs from different imaging devices and integrate them into an augmented workspace for the operator. The system is operable to be manipulated (e.g., by an operator) via techniques that do not require touch. In exemplary embodiments, the system is manipulated using one or more of eye, gestures, speech, and other methods of control. In further exemplary embodiments, the system enables collaboration (e.g., when multiple operators are in the same room) via annotations. In yet further exemplary embodiments, the system includes machine-learning models and/or other artificial intelligence techniques to guide the operator, such as medical registration and segmentation, and image processing, for example.
In the accompanying drawings, which form a part of the specification and are to be read in conjunction therewith in which like reference numerals are used to indicate like or similar parts in the various views:
Appendix A provides one exemplary embodiment of details regarding image processing.
In the following detailed description of example embodiments, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration specific example embodiments in which the inventive subject matter may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the inventive subject matter, and it is to be understood that other embodiments may be utilized and that logical, mechanical, electrical and other changes may be made without departing from the scope of the inventive subject matter.
Some portions of the detailed descriptions which follow are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the ways used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like. It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussions, terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, refer to the action and processes of a computer system, or similar computing device, that manipulates and transforms data represented as physical (e.g., electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
The description of the various embodiments is to be construed as describing examples only and does not describe every possible instance of the inventive subject matter. Numerous alternatives could be implemented, using combinations of current or future technologies, which would still fall within the scope of the claims. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the inventive subject matter is defined only by the appended claims.
In some aspects, the systems, methods, and techniques described herein provide virtual interactive and dynamic screens that display vital information and can be placed at any user-selected position. In an exemplary embodiment, the systems include a touchless user interface that may include eye tracking, gesture control, and/or voice control (e.g., to maintain sterility). The systems, methods, and techniques described herein include a control system that does not interfere with the operator's ability to perform the procedure or surgery. In a further exemplary embodiment, the systems include one or more interactive virtual screens that can be shared among users/operators in a procedural suite or operating room, since these procedures are performed by teams. In yet further exemplary embodiments, the systems, methods, and techniques described herein include collaboration tools (e.g., for preoperative and/or intraoperative planning, etc.).
Conventional systems are specific to a certain procedure and cannot be generalized. For specialties such as interventional radiology, however, operators perform procedures in multiple anatomical locations and it is necessary to have an AR system that can be adapted to use in other areas. In some exemplary embodiments, the systems, methods, and techniques described herein are adaptable to usage in a plurality of procedures, locations, areas, and the like.
Additionally, specialties such as interventional radiology often use multiple imaging modalities from multiple vendors to accomplish a specific procedure. In some exemplary embodiments, the AR systems, methods, and techniques described herein are configured to accommodate multiple inputs simultaneously and allow users to manipulate them in a seamless way.
In further embodiments, the systems, methods, and techniques described herein enable interoperability for medical apps. In an exemplary embodiment, the systems, methods, and techniques described herein include an application-programming interface (API) that enables an application for a specific headset or medical device to be ported to another type of headset or medical device.
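One way such an API could enable porting is through a small adapter interface that each headset or medical device implements. The sketch below is hypothetical and not part of the disclosed implementation: the `HeadsetAdapter` interface, `LoggingHeadset` backend, and `stream_to_headset` helper are illustrative names introduced here, assuming the app only ever talks to the abstract interface.

```python
from abc import ABC, abstractmethod

class HeadsetAdapter(ABC):
    """Hypothetical per-vendor adapter: one subclass per headset type."""

    @abstractmethod
    def show_virtual_screen(self, screen_id: str, frame: bytes) -> None:
        """Render one video frame onto the named virtual screen."""

class LoggingHeadset(HeadsetAdapter):
    """Stand-in backend that records calls instead of rendering."""
    def __init__(self):
        self.calls = []

    def show_virtual_screen(self, screen_id, frame):
        self.calls.append((screen_id, len(frame)))

def stream_to_headset(headset: HeadsetAdapter, frames, screen_id="fluoro"):
    # The application codes against the interface only, so the same
    # app can be pointed at a different headset by swapping adapters.
    for frame in frames:
        headset.show_virtual_screen(screen_id, frame)

hs = LoggingHeadset()
stream_to_headset(hs, [b"\x00" * 100, b"\x01" * 100])
```

Porting an app to a new headset would then mean writing one new adapter subclass rather than rewriting the app.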
In yet further embodiments, the systems, methods, and techniques described herein are operable to analyze information recorded during a procedure (e.g., images, radiation exposure data, power output data, etc.) in clinically and/or operationally useful ways.
Exemplary sensors 102 include sensors that are operable to input visual feedback (e.g., a camera, etc.), haptic feedback (e.g., a touchscreen device, etc.), as well as imaging data into the processor(s) 106 (e.g., centralized processing server, etc.). A sensor is a broad term that is intended to encompass its plain and ordinary meaning, including without limitation cameras, infrared cameras, stereoscopic cameras, heartbeat sensors, touch sensors, humidity sensors, gas sensors, smoke sensors, thermistors, ultrasonic sensors, etc.
The image data interface 104 is operable to receive imaging data from one or more imaging devices (e.g., X-ray devices, magnetic resonance imaging (MRI) devices, etc.) and/or computer-readable media devices on which imaging data is stored, and to communicate the imaging data to the processor(s) 106. Imaging data is also intended to be used in its plain and ordinary meaning, including raw source data directly from the machine as well as video output obtained through a frame grabber or streamed via camera uplink.
In some embodiments, the processor(s) 106 comprise the visual display devices 110. In other embodiments, the visual display device(s) 110 are tethered to the processor(s) 106 in a wired and/or wireless fashion. The central processing unit may have multiple graphical processing unit cores or any other configuration. Visual display is a broad term that is intended to encompass its plain and ordinary meaning. In some embodiments the visual display may be a head mounted device (camera pass through device, optical see through device), or a display system in some other form.
In an embodiment, image registration occurs using an image target (also referred to as a target marker in some embodiments) that functions as a fiducial marker for emplacement of the image or video. Emplacement may refer to pose, position, orientation, or any other appropriate locational information. There can be a single image target or multiple image targets. In some embodiments, this image target may include but is not limited to a radiopaque sticker, lenticular array, or other object. In other embodiments, the image target may include anatomical markers, which include but are not limited to the nipples, umbilicus, spine bones, ribs, shoulder, hips, pelvis, femur, or any other appropriate anatomical marker. Additionally, an image target is not necessary for emplacement, and emplacement may occur by other means, including but not limited to electromagnetic field tracking. Other embodiments include, but are not limited to, a video feed streamed to a web server (e.g., through a WebRTC protocol, etc.) that is then processed on the device used by the operator 34 (e.g., processor(s) 106 and/or headset 6, etc.) to display the images. The video feed may be streamed to a local server computing device or fed directly into the device used by the operator 34. The inputs 8 may be wireless or physical wired connections. The tether connecting headset 6 and processor(s) 106 may be wired or wireless.
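One standard way to compute an emplacement from detected fiducials is rigid point-set alignment (the Kabsch algorithm): given the markers' known model-space coordinates and their observed positions, recover the rotation and translation relating them. This is a minimal NumPy sketch, not the disclosed implementation; the marker coordinates below are invented for the demonstration.

```python
import numpy as np

def rigid_align(markers_model, markers_observed):
    """Kabsch algorithm: find R, t with R @ p + t ~= q for marker pairs."""
    P = np.asarray(markers_model, float)
    Q = np.asarray(markers_observed, float)
    Pc, Qc = P.mean(axis=0), Q.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (P - Pc).T @ (Q - Qc)
    U, _, Vt = np.linalg.svd(H)
    # d guards against a reflection solution.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = Qc - R @ Pc
    return R, t

# Demo: four markers rotated 90 degrees about z and shifted.
P = np.array([[0., 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
R_true = np.array([[0., -1, 0], [1, 0, 0], [0, 0, 1]])
t_true = np.array([10., 20, 5])
Q = P @ R_true.T + t_true
R, t = rigid_align(P, Q)
```

The recovered pose can then be used to place the overlaid image or video relative to the patient; in practice a detection step (and possibly electromagnetic tracking, as noted above) would supply the observed positions.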
In this embodiment the virtual screen 9 live streams the image output from the imaging machine 5. In an embodiment, a virtual screen is an augmented reality object that displays information. The streaming may be via methods including, but not limited to, a real-time streaming protocol (RTSP), WebRTC, HTTP Live Streaming, or any other image and/or video streaming protocol. Additionally, the stream may be hardwired to the processor(s) 106 and then streamed to the virtual object. Virtual screens 10 and 11 are exemplary embodiments of other uses for screens, including but not limited to the electronic medical record and/or prior imaging. The differences between virtual screens 9, 10, and 11 are the type of content within the screens as well as the orientation angle.
In an embodiment, the position-orientation determination program is implemented in Python. A video capture card is used to stream the X-ray video from the imaging device to the computer. The streamed X-ray frames are read into Python and analyzed by the ML model through the Python interface of TensorFlow. The results are merged with the original frame for visualization. TensorFlow is not necessary for this specific model; nevertheless, it makes it trivial to import other readily available ML models for real-time analysis.
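The capture-infer-merge loop described above can be sketched as follows. This is a runnable stand-in, not the disclosed program: `run_model` is a stub for the ML model (a TensorFlow model in the text), the frame is synthetic, and in practice the frame would come from the capture card (e.g., via `cv2.VideoCapture`).

```python
import numpy as np

def run_model(frame):
    """Stub for the ML model; marks a fixed region so the pipeline
    runs end to end. A real system would call TensorFlow here."""
    mask = np.zeros(frame.shape[:2], dtype=bool)
    mask[8:24, 8:24] = True
    return mask

def merge_overlay(frame, mask, color=(0, 255, 0), alpha=0.4):
    """Alpha-blend the model output onto the original frame."""
    out = frame.astype(float)
    out[mask] = (1 - alpha) * out[mask] + alpha * np.array(color, float)
    return out.astype(np.uint8)

# One iteration of the loop: capture -> infer -> merge for display.
frame = np.zeros((32, 32, 3), dtype=np.uint8)
annotated = merge_overlay(frame, run_model(frame))
```

The annotated frame would then be streamed to the virtual screen for visualization.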
An example of an ML algorithm based on the framework in
In an example of a method of determining the orientation from such a marker,
Parallel projection is used as an example, as shown in
Parallel projection is simple but loses the depth information. Non-parallel projections can also be used to encode depth (z) information; then (φ, ω) should be replaced by (z, φ, ω). More markers can be used, and the same simulation can be done to build the mapping from spatial information to areas. The mapping is then used to recover spatial information in practice.
This process can be aided by taking other projections or two-dimensional images from known angles and correlating the data. Other ways to calculate the orientation of the device include but are not limited to means of measuring section widths, heights, volumes, areas, identifying special features on the markers, or through comparing multiple projection combinations of such listed means.
With reference to
The example computer system 2000 may include at least one processor 2002 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), or both), a main memory 2004 and a static memory 2006, which communicate with each other via a bus 2008. In some embodiments, processor 2002 may comprise, in whole or in part, processor(s) 106. In some embodiments, main memory 2004 may comprise, in whole or in part, data recording and storage device(s) 108. The computer system 2000 may further include a touchscreen display unit 2010. In example embodiments, the computer system 2000 also includes a network interface device 2020.
The persistent storage unit 2016 includes a machine-readable medium 2022 on which is stored one or more sets of instructions 2024 and data structures (e.g., software instructions) embodying or used by any one or more of the methodologies or functions described herein. The instructions 2024 may also reside, completely or at least partially, within the main memory 2004 or within the processor 2002 during execution thereof by the computer system 2000, the main memory 2004 and the processor 2002 also constituting machine-readable media. In some embodiments, instructions 2024 comprise, in whole or in part, the ML algorithm of
While the machine-readable medium 2022 is shown in an example embodiment to be a single medium, the term “machine-readable medium” may include a single medium or multiple media (e.g., a centralized or distributed database, or associated caches and servers) that store the one or more instructions. The term “machine-readable medium” shall also be taken to include any tangible medium that is capable of storing, encoding, or carrying instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of embodiments of the present invention, or that is capable of storing, encoding, or carrying data structures used by or associated with such instructions. The term “machine-readable storage medium” shall accordingly be taken to include, but not be limited to, solid-state memories and optical and magnetic media that can store information in a non-transitory manner, i.e., media that is able to store information. Specific examples of machine-readable storage media include non-volatile memory, including by way of example semiconductor memory devices (e.g., Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), and flash memory devices); magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. A machine-readable storage medium does not include signals.
The instructions 2024 may further be transmitted or received over a communications network 2026 using a signal transmission medium via the network interface device 2020 and utilizing any one of a number of well-known transfer protocols (e.g., FTP, HTTP). Examples of communication networks include a local area network (LAN), a wide area network (WAN), personal area network (PAN), wireless personal area network (WPAN), the Internet, mobile telephone networks, Plain Old Telephone (POTS) networks, and wireless data networks (e.g., WiFi and WiMax networks). The term “machine-readable signal medium” shall be taken to include any transitory intangible medium that is capable of storing, encoding, or carrying instructions for execution by the machine, and includes digital or analog communications signals or other intangible medium to facilitate communication of such software. In some embodiments, communications network 2026 comprises, in whole or in part, the electrical and/or communicative couplings described herein.
From the foregoing, it will be seen that this invention is one well adapted to attain all the ends and objects hereinabove set forth together with other advantages which are obvious and which are inherent to the structure. It will be understood that certain features and sub combinations are of utility and may be employed without reference to other features and sub combinations. This is contemplated by and is within the scope of the claims. Since many possible embodiments of the invention may be made without departing from the scope thereof, it is also to be understood that all matters herein set forth or shown in the accompanying drawings are to be interpreted as illustrative and not limiting.
The constructions described above and illustrated in the drawings are presented by way of example only and are not intended to limit the concepts and principles of the present invention. Thus, there has been shown and described several embodiments of a novel invention. As is evident from the foregoing description, certain aspects of the present invention are not limited by the particular details of the examples illustrated herein, and it is therefore contemplated that other modifications and applications, or equivalents thereof, will occur to those skilled in the art. The terms “having” and “including” and similar terms as used in the foregoing specification are used in the sense of “optional” or “may include” and not as “required”. Many changes, modifications, variations and other uses and applications of the present construction will, however, become apparent to those skilled in the art after considering the specification and the accompanying drawings. All such changes, modifications, variations and other uses and applications which do not depart from the spirit and scope of the invention are deemed to be covered by the invention which is limited only by the claims which follow.