System and method for direct multi-modal annotation of objects
Abstract
The system includes an image display system, a direct annotation creation module, an annotation display module, a vocabulary comparison module and a dynamic updating module. These modules are coupled together by a bus and provide for the direct multi-modal annotation of media objects. The direct annotation creation module creates annotation objects. The annotation display module works in cooperation with the image display system to display the annotations themselves or graphic representations of the annotations positioned relative to the images of the objects. The system automatically creates the annotation, associates it with the selected images, and displays either a graphic representation of the annotation or a text translation of the audio input.
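As an illustration of an annotation object that is kept independent of the image it refers to, the following is a minimal Python sketch (Python 3.9 or later). The names `Annotation` and `AnnotationStore`, the field layout, and the sample values are hypothetical and are not taken from the specification; the sketch only shows one way the association of audio, location and visual notation could be recorded.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation record, stored separately from the image data."""
    image_ref: str              # image reference field (here, a file name)
    location: tuple[int, int]   # annotation location field: (x, y) within the image
    audio: bytes                # audio input field: the recorded signal
    notation: str               # visual notation: text or an icon identifier
    text: Optional[str] = None  # text annotation field, if a text translation exists


@dataclass
class AnnotationStore:
    """Holds annotations apart from the images and indexes them by image reference."""
    _by_image: dict[str, list[Annotation]] = field(default_factory=dict)

    def add(self, annotation: Annotation) -> None:
        self._by_image.setdefault(annotation.image_ref, []).append(annotation)

    def for_image(self, image_ref: str) -> list[Annotation]:
        # Return a copy so callers cannot mutate the index by accident.
        return list(self._by_image.get(image_ref, []))


# Illustrative values only.
store = AnnotationStore()
store.add(Annotation("beach.jpg", (120, 80), b"\x00\x01", "star-icon", "sunset"))
print([a.notation for a in store.for_image("beach.jpg")])
```

Because the image itself is never modified, deleting or editing an annotation in this sketch leaves the underlying media untouched.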
26 Claims
1. An apparatus for direct annotation of objects, the apparatus comprising:
- a display device for displaying one or more images;
- an audio input device for receiving an audio signal;
- a storage device for storing a plurality of different visual notations each comprising a text or a graphic image and for storing a plurality of corresponding audio signals;
- a direct annotation creation module coupled to receive the audio signal from the audio input device and to receive a reference to a location within an image on the display device, the direct annotation creation module, in response to receiving the audio signal and the reference to the location within the image, automatically creating an annotation object, independent from the image, that associates the input audio signal, the location and one of the plurality of different visual notations; and
- an audio vocabulary comparison module coupled to the audio input device, the storage device and the direct annotation creation module, the audio vocabulary comparison module receiving audio input and finding a corresponding one of the plurality of different visual notations that matches content of the audio input.
Dependent claims: 2, 3, 4, 5
6. A computer program product having a computer-readable storage medium storing computer-executable code for direct annotation of objects, the code comprising:
- a media object storage for storing media, annotation objects, a plurality of different visual notations each comprising a text or a graphic image and a plurality of corresponding audio signals;
- a direct annotation creation module coupled to receive an audio signal, a selected visual notation from the plurality of different visual notations and a reference to a location within an image, the direct annotation creation module, in response to receiving the audio signal or the reference to the location within the image, automatically creating an annotation object, independent of the image, that associates the audio signal, the selected visual notation and the location, and the direct annotation creation module storing the audio annotation in the media object storage;
- an audio vocabulary comparison module coupled to the media object storage and the direct annotation creation module, the audio vocabulary comparison module receiving audio input and finding a corresponding one of the plurality of different visual notations that matches content of the audio input; and
- an annotation output module coupled to the direct annotation creation module, the annotation output module generating audio or visual output in response to user selection of an annotation symbol representing the annotation object.
7. A computer implemented method for direct annotation of objects, the method comprising the steps of:
- displaying an image;
- receiving audio input;
- detecting selection of a location within the image;
- comparing the audio input to a vocabulary;
- finding a corresponding one of a plurality of different visual notations that matches content of the audio input; and
- creating an annotation object, independent of the selected image, that provides an association between the image, the audio input, the selected location, the found corresponding one of a plurality of different visual notations comprising text or a graphic image, the annotation object including at least a text annotation field, an image reference field, and an annotation location field, the creating step occurring automatically in response to the receiving or detecting.
Dependent claims: 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
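The comparing, finding and creating steps of the method above can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the audio input is assumed to have already been reduced to a numeric feature vector, the vocabulary is assumed to be a list of (stored feature vector, visual notation) pairs, and cosine similarity with a fixed threshold stands in for whatever comparison the vocabulary module actually performs. The names `find_notation`, `create_annotation` and the dictionary keys are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_notation(features, vocabulary, threshold=0.9):
    """Return the notation whose stored features best match, or None below threshold."""
    best_notation, best_score = None, threshold
    for stored_features, notation in vocabulary:
        score = cosine(features, stored_features)
        if score >= best_score:
            best_notation, best_score = notation, score
    return best_notation


def create_annotation(image_ref, location, features, vocabulary):
    """Build an annotation record once both audio and a location are available."""
    notation = find_notation(features, vocabulary)
    return {
        "image_ref": image_ref,        # image reference field
        "location": location,          # annotation location field
        "audio_features": features,    # stands in for the recorded audio input
        "text_annotation": notation,   # text annotation field (None if no match)
    }


# Hypothetical two-word vocabulary with toy three-dimensional features.
vocabulary = [([1.0, 0.0, 0.0], "dog"), ([0.0, 1.0, 0.0], "cat")]
annotation = create_annotation("park.jpg", (40, 60), [0.9, 0.1, 0.0], vocabulary)
print(annotation["text_annotation"])
```

When nothing in the vocabulary scores above the threshold, the sketch leaves the text field empty rather than guessing, so the audio is still associated with the location.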
20. A computer implemented method for displaying objects with annotations, the method comprising the steps of:
- receiving audio input;
- finding a corresponding annotation object comprising one of a plurality of different visual notations, the plurality of different visual notations referencing a close match to content of the audio input;
- retrieving an image associated with the corresponding annotation object;
- displaying the image with one of the plurality of different visual notations to indicate that an annotation exists;
- receiving user selection of the one visual notation;
- generating the annotation automatically, in response to user input of a location within the image and an audio input;
- outputting the annotation associated with the selected visual notation;
- determining whether the annotation includes text;
- retrieving a text annotation for the selected visual notation; and
- displaying the retrieved text with the image.
Dependent claims: 21, 22, 23
24. A computer implemented method for retrieving images, the method comprising the steps of:
- receiving audio input;
- finding corresponding annotation objects comprising one of a plurality of different visual notations, the plurality of different visual notations referencing a close match to content of the audio input, each corresponding annotation object generated automatically in response to user input of a location within an image and an audio signal, where a recording of the audio signal is terminated automatically based on a predetermined audio level;
- retrieving the images that are referenced by the found annotation objects; and
- displaying the retrieved images, the plurality of different visual notations for the found corresponding annotation objects and wherein each of the found corresponding annotation objects include at least an audio input field, an image reference field, and an annotation location field.
Dependent claims: 25, 26
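The automatic termination of a recording based on a predetermined audio level might look like the sketch below. The claim does not say how the level is applied, so this example assumes that recording stops once the root-mean-square amplitude of several consecutive frames falls below the level. The function names, the `level` value and the `quiet_frames` count are all hypothetical.

```python
import math


def rms(frame):
    """Root-mean-square amplitude of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0


def record_until_quiet(frames, level=0.05, quiet_frames=3):
    """Collect frames until `quiet_frames` consecutive frames fall below `level`."""
    recorded = []
    quiet = 0
    for frame in frames:
        recorded.append(frame)
        # Count consecutive quiet frames; any loud frame resets the count.
        quiet = quiet + 1 if rms(frame) < level else 0
        if quiet >= quiet_frames:
            break  # assumed stop condition: sustained drop below the level
    return recorded


# Toy input: two loud frames, three silent frames, then speech that is never recorded.
loud, silent = [0.5, -0.5], [0.0, 0.0]
captured = record_until_quiet([loud, loud, silent, silent, silent, loud])
print(len(captured))
```

Requiring several quiet frames in a row, rather than one, is a design choice in this sketch so that a brief pause between words does not end the recording.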
Specification