Method and apparatus for 3-D auto tagging

  • US 10,592,747 B2
  • Filed: 11/02/2018
  • Issued: 03/17/2020
  • Est. Priority Date: 04/26/2018
  • Status: Active Grant
First Claim

1. A method comprising:

  • on a mobile device including a processor, a memory, a camera, a plurality of sensors, a microphone and a display and a touch screen sensor, receiving via an input interface on the mobile device a request to generate a multi-view interactive digital media representation (MVIDMR) of an object;

    recording a first plurality of frames from the camera on the mobile device from a live video stream as the mobile device moves along a trajectory such that different views of the object are captured in the first plurality of frames;

    generating the MVIDMR of the object including a second plurality of frames from the first plurality of frames wherein the different views of the object are included in each of the second plurality of frames;

    using a machine learning algorithm on the second plurality of frames to generate heatmaps and part affinity fields associated with possible 2-D pixel locations of a plurality of landmarks on the object wherein the machine learning algorithm is trained to recognize the plurality of landmarks;

    based upon the heatmaps and part affinity fields, determining a skeleton for the object wherein the plurality of landmarks form joints of the skeleton and wherein determining the skeleton includes determining the 2-D pixel locations of the joints;

    rendering a first selectable tag into the second plurality of frames to form a third plurality of frames associated with a tagged MVIDMR wherein the first selectable tag is associated with a first landmark positioned at a first joint within the skeleton and wherein the first selectable tag is rendered into the second plurality of frames relative to first 2-D pixel locations determined for the first joint in the second plurality of frames;

    receiving media content associated with the first selectable tag;

    outputting a first frame from the third plurality of frames of the tagged MVIDMR that includes the first selectable tag;

    receiving input from the touch screen sensor indicating the first selectable tag is selected in the first frame from the tagged MVIDMR; and

    in response, outputting the media content associated with the first selectable tag to the display.
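
The tagging and selection steps recited above can be illustrated with a minimal sketch. It is not the patented implementation. It assumes each frame already has one heatmap for the landmark of interest, stored as a 2-D list of scores, and it takes the highest-scoring pixel as the joint's 2-D pixel location. The part affinity fields and the skeleton assembly are omitted entirely. The function names (`heatmap_peak`, `locate_joint`, `tag_hit`, `on_touch`) and the `radius` parameter are hypothetical and do not come from the claim.

```python
from math import hypot


def heatmap_peak(heatmap):
    """Return the (x, y) pixel with the highest score in a 2-D list of scores.

    Simplifying assumption: a single object and a single landmark per heatmap,
    so the global maximum stands in for the joint location.
    """
    best_score, best_xy = float("-inf"), None
    for y, row in enumerate(heatmap):
        for x, score in enumerate(row):
            if score > best_score:
                best_score, best_xy = score, (x, y)
    return best_xy


def locate_joint(frames_heatmaps):
    """Return one (x, y) joint location per frame."""
    return [heatmap_peak(heatmap) for heatmap in frames_heatmaps]


def tag_hit(touch_xy, tag_xy, radius=20.0):
    """True if the touch point lies within `radius` pixels of the tag."""
    dx = touch_xy[0] - tag_xy[0]
    dy = touch_xy[1] - tag_xy[1]
    return hypot(dx, dy) <= radius


def on_touch(frame_index, touch_xy, joint_xy_per_frame, media, radius=20.0):
    """Return the media content if the tag in the given frame was touched."""
    tag_xy = joint_xy_per_frame[frame_index]
    return media if tag_hit(touch_xy, tag_xy, radius) else None


# Two tiny 3x3 heatmaps standing in for two frames (hypothetical data).
heatmaps = [
    [[0.0, 0.1, 0.0], [0.2, 0.9, 0.1], [0.0, 0.1, 0.0]],
    [[0.0, 0.0, 0.8], [0.1, 0.2, 0.1], [0.0, 0.0, 0.0]],
]
joints = locate_joint(heatmaps)
selected = on_touch(0, (1.5, 1.0), joints, "wheel_closeup.jpg", radius=1.0)
missed = on_touch(1, (0.0, 2.0), joints, "wheel_closeup.jpg", radius=1.0)
```

Because the joint location is recomputed for every frame, the tag follows the landmark as the view changes. The touch test only needs the tag position in the frame that is currently displayed.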
