Visual annotation using tagging sessions
First Claim
1. A method comprising:
on a mobile device including a processor, a memory, a camera, a plurality of sensors, a microphone, a display and a touch screen sensor, receiving via an input interface on the mobile device a request to generate a multi-view interactive digital media representation (MVIDMR) of an object;
recording a first plurality of frames from the camera on the mobile device from a live video stream as the mobile device moves along a trajectory such that different views of the object are captured in the first plurality of frames;
generating the MVIDMR of the object including a second plurality of frames from the first plurality of frames wherein the different views of the object are included in each of the second plurality of frames;
outputting a first frame from the MVIDMR including a selector rendered over the first frame to the display;
receiving, via the touch screen sensor and the selector, a selection of a location on the object in the first frame;
removing the selector and rendering a first selectable tag at the location selected in the first frame;
outputting the first frame including the first selectable tag to the display;
for each remaining frame in the second plurality of frames of the MVIDMR, determining a first location where the location on the object appears in each remaining frame, including determining whether the location on the object appears in each remaining frame;
for each remaining frame where the location on the object appears, rendering the first selectable tag into each remaining frame at the first location to generate a third plurality of frames to form a tagged MVIDMR;
outputting to the display the tagged MVIDMR;
receiving media content associated with the first selectable tag;
outputting a first frame from the third plurality of frames of the tagged MVIDMR that includes the first selectable tag;
receiving input from the touch screen sensor indicating the first selectable tag is selected in the first frame from the tagged MVIDMR; and
in response, outputting the media content associated with the first selectable tag to the display.
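The tag-propagation steps recited above can be sketched in Python. This is a minimal illustration, not the patented implementation: it assumes the selected location is recovered as a 3D point on the object and that a camera pose (rotation `R`, translation `t`) and a shared intrinsic matrix `K` are known for each frame of the MVIDMR; all function and parameter names are hypothetical.

```python
import numpy as np

def project_tag(point_3d, rotation, translation, K, frame_size):
    """Project a 3D tag location into one frame.

    Returns pixel coordinates (x, y) if the point lies in front of the
    camera and inside the frame bounds, otherwise None (the tag is not
    rendered for that view).
    """
    p_cam = rotation @ point_3d + translation      # world -> camera coords
    if p_cam[2] <= 0:                              # behind the camera plane
        return None
    u, v, w = K @ p_cam
    x, y = u / w, v / w                            # perspective divide
    width, height = frame_size
    if 0 <= x < width and 0 <= y < height:
        return (x, y)
    return None

def propagate_tag(point_3d, poses, K, frame_size):
    """For each remaining frame, decide whether the tagged object
    location appears and, if so, where to render the selectable tag."""
    return [project_tag(point_3d, R, t, K, frame_size) for R, t in poses]
```

Frames for which `propagate_tag` returns `None` simply omit the tag, matching the claim's step of first determining whether the location on the object appears in each remaining frame.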
1 Assignment
0 Petitions
Abstract
Various embodiments of the present invention relate generally to systems and methods for analyzing and manipulating images and video. In particular, a multi-view interactive digital media representation (MVIDMR) of an object can be generated from live images of an object captured from a camera. After the MVIDMR of the object is generated, a tag can be placed at a location on the object in the MVIDMR. The locations of the tag in the frames of the MVIDMR can vary from frame to frame as the view of the object changes. When the tag is selected, media content can be output which shows details of the object at the location where the tag is placed. In one embodiment, the object can be a car and tags can be used to link to media content showing details of the car at the locations where the tags are placed.
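The selection behavior the abstract describes (tapping a tag to output its linked media content) can be illustrated with a simple per-frame hit test. This is a hypothetical sketch, not the patent's implementation; the tag positions, touch radius, and media keys below are invented for the example.

```python
# Hypothetical sketch: each rendered tag carries a per-frame pixel
# position and a link to its media content. A touch landing within
# `radius` pixels of a tag selects it and returns its media key.
def select_tag(touch, tags, radius=24):
    """Return the media key of the tag under the touch, or None."""
    tx, ty = touch
    for tag in tags:
        x, y = tag["pos"]
        if (x - tx) ** 2 + (y - ty) ** 2 <= radius ** 2:
            return tag["media"]
    return None
```

For example, with `tags = [{"pos": (100, 100), "media": "wheel_closeup"}]`, a touch at `(110, 105)` falls inside the 24-pixel radius and selects `"wheel_closeup"`, triggering output of that media content.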
Citations
25 Claims
View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25)
Specification