Voice-responsive annotation of video generated by an endoscopic camera
First Claim
1. A method comprising:
receiving a video stream generated by an endoscopic video camera;
receiving and automatically recognizing, by a voice-responsive control system, a spoken utterance of a user while the video stream is being received, wherein the spoken utterance includes a predefined command and additional speech;
looking up, by the voice-responsive control system and in response to recognizing the predefined command, a non-text annotation corresponding to the additional speech;
sending, from the voice-responsive control system to an image capture device, a control packet including an indication that the annotation is a non-text visual object, an index of the annotation, and display coordinates for the annotation;
providing, by the image capture device, the video stream and the annotation to a display device for display, such that the annotation is overlaid on a frame of the video stream displayed on the display device at the display coordinates specified by the control packet to point to or outline an anatomical feature; and
associating, by the image capture device, the annotation with the video stream.
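The control-packet step of the claim can be sketched in code. This is a minimal illustrative sketch, not the patent's implementation: the command word "annotate", the dictionary entries, and all names (`ControlPacket`, `build_control_packet`, `ANNOTATION_DICTIONARY`) are hypothetical; only the packet's three claimed fields (non-text indication, annotation index, display coordinates) come from the claim language.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical annotation dictionary mapping recognized "additional speech"
# to indices of stored non-text visual objects (e.g., an arrow, an outline).
ANNOTATION_DICTIONARY = {
    "arrow": 0,   # pointer annotation
    "circle": 1,  # outline annotation
}

@dataclass
class ControlPacket:
    """The three fields the claim requires in the control packet sent from
    the voice-responsive control system to the image capture device."""
    is_non_text_visual_object: bool        # indication that the annotation is non-text
    annotation_index: int                  # index of the annotation
    display_coordinates: Tuple[int, int]   # where to overlay it on the frame

def build_control_packet(utterance: str,
                         coordinates: Tuple[int, int]) -> Optional[ControlPacket]:
    """Recognize the predefined command, look up the annotation for the
    additional speech, and build the control packet; return None if the
    utterance does not start with the command or the speech is unknown."""
    command, _, additional_speech = utterance.partition(" ")
    if command != "annotate":  # "annotate" stands in for the predefined command
        return None
    index = ANNOTATION_DICTIONARY.get(additional_speech)
    if index is None:
        return None
    return ControlPacket(True, index, coordinates)
```

For example, the utterance "annotate arrow" would yield a packet carrying index 0 and the supplied coordinates, while an utterance lacking the predefined command yields no packet at all.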
Abstract
An image capture device in an endoscopic imaging system receives a video stream generated by an endoscopic video camera. In response to automatic recognition of a spoken utterance while the video stream is being received from the endoscopic video camera, the image capture device associates with the video stream an annotation that corresponds to the spoken utterance. The image capture device provides the video stream to a display device for display, such that the annotation can be overlaid on one or more frames of the video stream displayed on the display device.
28 Claims
1. A method comprising:

receiving a video stream generated by an endoscopic video camera;
receiving and automatically recognizing, by a voice-responsive control system, a spoken utterance of a user while the video stream is being received, wherein the spoken utterance includes a predefined command and additional speech;
looking up, by the voice-responsive control system and in response to recognizing the predefined command, a non-text annotation corresponding to the additional speech;
sending, from the voice-responsive control system to an image capture device, a control packet including an indication that the annotation is a non-text visual object, an index of the annotation, and display coordinates for the annotation;
providing, by the image capture device, the video stream and the annotation to a display device for display, such that the annotation is overlaid on a frame of the video stream displayed on the display device at the display coordinates specified by the control packet to point to or outline an anatomical feature; and
associating, by the image capture device, the annotation with the video stream.

Dependent claims: 2-17.
18. An apparatus comprising:

a voice-responsive control system to:
receive a video stream generated by an endoscopic video camera;
receive and automatically recognize a spoken utterance of a user while the video stream is being received, wherein the spoken utterance includes a predefined command and additional speech, the voice-responsive control system including an annotation dictionary to store a set of annotations;
look up, in the annotation dictionary, a non-text annotation corresponding to the additional speech in response to recognizing the predefined command; and
generate a control packet including an indication that the annotation is a non-text visual object, an index of the annotation, and display coordinates for the annotation; and

an image capture device to:
receive the control packet and the video stream from the voice-responsive control system;
provide the video stream and the annotation to a display device, such that the annotation is overlaid on a frame of the video stream displayed on the display device at the display coordinates specified by the control packet to point to or outline an anatomical feature; and
associate the annotation with at least a portion of the video stream.

Dependent claims: 19-28.
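The image capture device's side of claim 18, overlaying the indexed annotation at the packet's display coordinates and associating it with the stream, can be sketched as follows. This is an illustrative sketch only: the frame is modeled as a mutable 2D grid, the glyph table and the function and variable names (`overlay_annotation`, `associate_annotation`) are hypothetical, and no rendering pipeline from the patent is implied.

```python
def overlay_annotation(frame, annotation_index, coordinates):
    """Overlay a non-text annotation onto one video frame at the display
    coordinates taken from the control packet. The frame is modeled as a
    mutable 2D grid (rows of cells) purely for illustration."""
    markers = {0: "→", 1: "○"}  # hypothetical rendered glyphs per annotation index
    x, y = coordinates
    frame[y][x] = markers[annotation_index]
    return frame

def associate_annotation(stream_metadata, annotation_index, frame_number):
    """Associate the annotation with a portion of the video stream by
    recording it against the frame number in a metadata store, so the
    annotation can be recalled with the captured video."""
    stream_metadata.setdefault(frame_number, []).append(annotation_index)
    return stream_metadata
```

Separating the overlay (display path) from the association (capture/metadata path) mirrors the claim's two distinct limitations: the annotation is both shown on the display device and tied to the recorded stream.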