Image processing with recurrent attention

  • US 10,223,617 B1
  • Filed: 06/04/2015
  • Issued: 03/05/2019
  • Est. Priority Date: 06/06/2014
  • Status: Active Grant
First Claim

1. A method for processing an image sequence, wherein the image sequence comprises a plurality of first images, wherein each of the plurality of first images are the same, and wherein the method comprises, for each first image:

  • determining a location in the first image, comprising:

    determining the location based on an output of a location neural network for the first image if the first image is after an initial first image in the image sequence;

    extracting a glimpse from the first image using the location;

    updating a current internal state of a recurrent neural network using the glimpse extracted from the first image to generate a new internal state, comprising:

    generating a glimpse representation of the extracted glimpse, and processing the glimpse representation using the recurrent neural network to update the current internal state of the recurrent neural network to generate a new internal state;

    processing, using the location neural network, the new internal state of the recurrent neural network generated using the glimpse extracted from the first image to generate an output of the location neural network for a next image in the image sequence after the first image;

    selecting an action from a predetermined set of possible actions, wherein each possible action in the predetermined set of possible actions defines a respective object category, including:

    processing, using an action neural network, the new internal state of the recurrent neural network to generate an action neural network output comprising a respective action score for each of the possible actions, wherein for each of the possible actions, the respective action score for the possible action represents a likelihood that the first image includes an image of an object belonging to the respective object category defined by the possible action, and selecting the action based on the action neural network output;

    wherein the location neural network, the recurrent neural network, and the action neural network have been trained by an end-to-end optimization procedure.
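
The per-image loop recited in the claim can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation. All names (`extract_glimpse`, `RecurrentAttentionSketch`, `run`), the single-layer networks, the centre starting location, the zero-padded square glimpse, and the random untrained weights are assumptions made for illustration. The claim does not say how the location for the initial image is chosen, and the end-to-end training it requires is not shown.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters (assumption)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def softmax(xs):
    """Numerically stable softmax, turning raw outputs into action scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def extract_glimpse(image, loc, size):
    """Crop a size x size patch centred at loc = (row, col), each in [-1, 1].

    Pixels falling outside the image are zero-padded (assumption).
    """
    h, w = len(image), len(image[0])
    centre_r = int(round((loc[0] + 1) / 2 * (h - 1)))
    centre_c = int(round((loc[1] + 1) / 2 * (w - 1)))
    half = size // 2
    patch = []
    for r in range(centre_r - half, centre_r - half + size):
        for c in range(centre_c - half, centre_c - half + size):
            inside = 0 <= r < h and 0 <= c < w
            patch.append(image[r][c] if inside else 0.0)
    return patch


class RecurrentAttentionSketch:
    """Hypothetical single-layer stand-ins for the three claimed networks."""

    def __init__(self, glimpse_size, hidden, num_actions, seed=0):
        rng = random.Random(seed)
        self.glimpse_size = glimpse_size
        self.hidden = hidden
        self.W_glimpse = rand_matrix(hidden, glimpse_size ** 2, rng)
        self.W_recurrent = rand_matrix(hidden, hidden, rng)
        self.W_location = rand_matrix(2, hidden, rng)        # location network
        self.W_action = rand_matrix(num_actions, hidden, rng)  # action network

    def run(self, image, num_steps, initial_loc=(0.0, 0.0)):
        """Process a sequence made of the same image repeated num_steps times."""
        state = [0.0] * self.hidden
        loc = initial_loc  # location for the initial image (assumed: centre)
        actions, locs, scores = [], [], []
        for _ in range(num_steps):
            locs.append(loc)
            # Extract a glimpse from the image using the location.
            patch = extract_glimpse(image, loc, self.glimpse_size)
            # Generate a glimpse representation of the extracted glimpse.
            rep = [max(0.0, v) for v in matvec(self.W_glimpse, patch)]
            # Update the recurrent internal state to a new internal state.
            rec = matvec(self.W_recurrent, state)
            state = [math.tanh(a + b) for a, b in zip(rep, rec)]
            # Location network output, used for the next image in the sequence.
            loc = tuple(math.tanh(v) for v in matvec(self.W_location, state))
            # Action network: one score per object category, then select.
            scores = softmax(matvec(self.W_action, state))
            actions.append(max(range(len(scores)), key=scores.__getitem__))
        return actions, scores, locs
```

With untrained weights the selected categories are meaningless. The sketch only shows the data flow: each new internal state feeds both the location network (for the next image) and the action network (for the current image).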
