SENTINEL LONG SHORT-TERM MEMORY (Sn-LSTM)
First Claim
1. A recurrent neural network system (abbreviated RNN) running on numerous parallel processors, comprising:
- a sentinel long short-term memory network (abbreviated Sn-LSTM) that receives inputs at each of a plurality of timesteps, the inputs including at least an input for a current timestep, a hidden state from a previous timestep, and an auxiliary input for the current timestep;
generates outputs at each of the plurality of timesteps by processing the inputs through gates of the Sn-LSTM, the gates including at least an input gate, a forget gate, an output gate, and an auxiliary sentinel gate;
stores in a memory cell of the Sn-LSTM auxiliary information accumulated over time from processing of the inputs by the input gate, the forget gate, and the output gate, and updating of the memory cell with gate outputs produced by the input gate, the forget gate, and the output gate; and
the auxiliary sentinel gate modulates the stored auxiliary information from the memory cell for next prediction, with the modulation conditioned on the input for the current timestep, the hidden state from the previous timestep, and the auxiliary input for the current timestep.
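For orientation, the claimed cell can be sketched in a few lines of PyTorch. This is a minimal reading of claim 1, not the patented implementation: the single fused projection, the sigmoid sentinel gate, and the concatenation of the auxiliary input with the other inputs are assumptions drawn from the standard LSTM formulation.

```python
import torch
import torch.nn as nn

class SnLSTMCell(nn.Module):
    """Minimal sketch of one Sn-LSTM step per claim 1: a standard LSTM
    cell plus an auxiliary sentinel gate. Layer shapes and the way the
    auxiliary input is combined are assumptions, not claim text."""

    def __init__(self, input_size, aux_size, hidden_size):
        super().__init__()
        # One linear map yields all five gate pre-activations at once.
        self.gates = nn.Linear(input_size + aux_size + hidden_size,
                               5 * hidden_size)
        self.hidden_size = hidden_size

    def forward(self, x_t, a_t, h_prev, c_prev):
        # Inputs per claim 1: current input x_t, auxiliary input a_t,
        # previous hidden state h_prev; c_prev is the memory cell.
        z = self.gates(torch.cat([x_t, a_t, h_prev], dim=-1))
        i, f, o, g, c_hat = z.chunk(5, dim=-1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.sigmoid(g)                       # auxiliary sentinel gate
        c_t = f * c_prev + i * torch.tanh(c_hat)   # memory cell update
        h_t = o * torch.tanh(c_t)                  # hidden state (output gate)
        s_t = g * torch.tanh(c_t)                  # sentinel state: the gate
                                                   # modulating stored auxiliary
                                                   # information for prediction
        return h_t, c_t, s_t
```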
Abstract
The technology disclosed presents a novel spatial attention model that uses current hidden state information of a decoder long short-term memory (LSTM) to guide attention and to extract spatial image features for use in image captioning. The technology disclosed also presents a novel adaptive attention model for image captioning that mixes visual information from a convolutional neural network (CNN) and linguistic information from an LSTM. At each timestep, the adaptive attention model automatically decides how heavily to rely on the image, as opposed to the linguistic model, to emit the next caption word. The technology disclosed further adds a new auxiliary sentinel gate to an LSTM architecture and produces a sentinel LSTM (Sn-LSTM). The sentinel gate produces a visual sentinel at each timestep, which is an additional representation, derived from the LSTM's memory, of long and short-term visual and linguistic information.
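A sketch of the adaptive mixing step the abstract describes, assuming the visual-sentinel formulation in which the sentinel competes with the spatial image features for attention; the single scoring vector `w_att` is a hypothetical simplification, not the disclosed attention network.

```python
import torch
import torch.nn.functional as F

def adaptive_context(spatial_feats, s_t, h_t, w_att):
    """Sketch of adaptive attention: mix CNN spatial features with the
    visual sentinel s_t. spatial_feats: (k, d) image regions; s_t, h_t: (d,);
    w_att: (d,) scoring vector -- a simplification of a learned scorer."""
    # Score the k image regions plus the sentinel as a (k+1)-th candidate.
    candidates = torch.cat([spatial_feats, s_t.unsqueeze(0)], dim=0)  # (k+1, d)
    scores = torch.tanh(candidates + h_t) @ w_att                     # (k+1,)
    alpha = F.softmax(scores, dim=0)
    # beta is the weight on the sentinel: how much to rely on the language
    # model's memory instead of the image when emitting the next word.
    beta = alpha[-1]
    c_t = (alpha[:-1].unsqueeze(1) * spatial_feats).sum(0)  # visual context
    return beta * s_t + (1.0 - beta) * c_t                  # adaptive context
```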
29 Claims
1. A recurrent neural network system (abbreviated RNN) running on numerous parallel processors, comprising:
a sentinel long short-term memory network (abbreviated Sn-LSTM) that receives inputs at each of a plurality of timesteps, the inputs including at least an input for a current timestep, a hidden state from a previous timestep, and an auxiliary input for the current timestep;
generates outputs at each of the plurality of timesteps by processing the inputs through gates of the Sn-LSTM, the gates including at least an input gate, a forget gate, an output gate, and an auxiliary sentinel gate;
stores in a memory cell of the Sn-LSTM auxiliary information accumulated over time from processing of the inputs by the input gate, the forget gate, and the output gate, and updating of the memory cell with gate outputs produced by the input gate, the forget gate, and the output gate; and
the auxiliary sentinel gate modulates the stored auxiliary information from the memory cell for next prediction, with the modulation conditioned on the input for the current timestep, the hidden state from the previous timestep, and the auxiliary input for the current timestep.
Dependent claims: 2-18, 25.
19. A sentinel long short-term memory network (abbreviated Sn-LSTM), running on numerous parallel processors, that processes auxiliary input combined with input and previous hidden state, comprising:
an auxiliary sentinel gate that applies on a memory cell of the Sn-LSTM, and modulates use of auxiliary information during next prediction, the auxiliary information accumulated over time in the memory cell at least from the processing of the auxiliary input combined with the input and the previous hidden state.
Dependent claims: 20, 21, 26.
22. A method, including:
extending a long short-term memory network (abbreviated LSTM) to include an auxiliary sentinel gate that applies on a memory cell of the LSTM, and modulates use of auxiliary information during next prediction, the auxiliary information accumulated over time in the memory cell at least from the processing of auxiliary input combined with current input and previous hidden state.
Dependent claims: 23, 24, 27, 28.
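Claim 22 frames the sentinel gate as an extension of an existing LSTM. A minimal sketch of that extension, wrapping PyTorch's stock nn.LSTMCell; the extra linear layer and the choice to condition the gate on the current input and previous hidden state are assumptions.

```python
import torch
import torch.nn as nn

class SentinelWrapper(nn.Module):
    """Sketch of claim 22: extend a stock LSTM cell with an auxiliary
    sentinel gate, leaving the original cell untouched."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)  # unmodified LSTM
        # Extra gate conditioned on the current input and previous hidden
        # state; per claim 22, x_t would carry the auxiliary input combined
        # (e.g. concatenated) with the current input.
        self.sentinel_gate = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, x_t, state):
        h_prev, c_prev = state
        h_t, c_t = self.cell(x_t, (h_prev, c_prev))
        g = torch.sigmoid(self.sentinel_gate(torch.cat([x_t, h_prev], dim=-1)))
        s_t = g * torch.tanh(c_t)  # modulated auxiliary information
        return h_t, (h_t, c_t), s_t
```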
29. A recurrent neural network system (abbreviated RNN) running on numerous parallel processors for machine generation of a natural language caption for an image, comprising:
an input provider for providing a plurality of inputs to a sentinel long short-term memory network (abbreviated Sn-LSTM) over successive timesteps, wherein the inputs include at least an input for a current timestep, a hidden state from a previous timestep, and an auxiliary input for the current timestep;
a gate processor for processing the inputs through each gate in a plurality of gates of the Sn-LSTM, wherein the gates include at least an input gate, a forget gate, an output gate, and an auxiliary sentinel gate;
a memory cell of the Sn-LSTM for storing auxiliary information accumulated over time from processing of the inputs by the gate processor;
a memory cell updater for updating the memory cell with gate outputs produced by the input gate, the forget gate, and the output gate;
the auxiliary sentinel gate for modulating the stored auxiliary information from the memory cell to produce a sentinel state at each timestep, with the modulation conditioned on the input for the current timestep, the hidden state from the previous timestep, and the auxiliary input for the current timestep; and
an emitter for generating the natural language caption for the image based on the sentinel states produced over successive timesteps by the auxiliary sentinel gate.
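Putting claim 29's elements together (input provider, gate processor, memory cell updater, emitter) as a greedy decoding loop; the emitter head, the <start> token id, and the use of a global image feature as the auxiliary input are illustrative assumptions, not the disclosed system.

```python
import torch
import torch.nn as nn

def generate_caption(sn_lstm, emitter, embed, aux_feat, max_len=20):
    """Greedy decoding sketch of claim 29's pipeline. Assumptions:
    sn_lstm is the SnLSTMCell sketched above; embed is an nn.Embedding;
    emitter is a Linear layer mapping [hidden ; sentinel] to vocab logits;
    aux_feat (1, aux_size) is e.g. a global CNN image feature serving as
    the auxiliary input at every timestep."""
    h = torch.zeros(1, sn_lstm.hidden_size)
    c = torch.zeros(1, sn_lstm.hidden_size)
    word = torch.tensor([0])  # hypothetical <start> token id
    caption = []
    for _ in range(max_len):
        x_t = embed(word)                          # input provider: current word
        h, c, s_t = sn_lstm(x_t, aux_feat, h, c)   # gate processor + cell updater
        logits = emitter(torch.cat([h, s_t], dim=-1))  # emitter over vocabulary,
        word = logits.argmax(dim=-1)               # conditioned on the sentinel
        caption.append(word.item())                # greedy choice; no <end>
    return caption                                 # handling, for illustration
```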
Specification