Augmenting Neural Networks with External Memory
First Claim
1. An augmented neural network system for processing a sequence of system inputs to generate a sequence of system outputs, the augmented neural network system comprising:
- a controller neural network configured to receive a neural network input at each of a plurality of time steps and to process the neural network input to generate a neural network output for the time step, wherein each neural network output includes:
  - a read key, and
  - a write vector;
- an external memory; and
- a Least Recently Used Access (LRUA) subsystem that is configured to:
  - maintain a respective usage weight for each of a plurality of locations in the external memory that represents a strength with which the location has recently been written to or read from by the LRUA subsystem, and
  - for each of the plurality of time steps:
    - generate a respective reading weight for each of the plurality of locations in the external memory using the read key,
    - read data from the plurality of locations in the external memory in accordance with the reading weights,
    - generate a respective writing weight for each of the plurality of locations in the external memory from a respective reading weight for the location from a preceding time step and the respective usage weight for the location,
    - write the write vector to the plurality of locations in the external memory in accordance with the writing weights, and
    - update the respective usage weight for each of the plurality of locations in the external memory from the respective reading weight for the location and the respective writing weight for the location.
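The claim says that reading weights are generated "using the read key" but does not fix a formula. As a minimal sketch, assuming content-based addressing (cosine similarity between the read key and each memory location, normalized with a softmax), the read step could look like the following. The function names and the example memory contents are hypothetical and chosen only for illustration.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two equal-length vectors; 0.0 if either is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def reading_weights(memory, read_key):
    # Assumption: one reading weight per location, from a softmax over similarities.
    scores = [cosine_similarity(row, read_key) for row in memory]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def read_memory(memory, weights):
    # Read "in accordance with the reading weights": a weighted sum of locations.
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]

# Hypothetical 3-location memory with 2-dimensional contents.
memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
read_key = [1.0, 0.0]
w_r = reading_weights(memory, read_key)
r = read_memory(memory, w_r)
```

Under this assumption the location whose contents best match the read key receives the largest reading weight, and the weights sum to one.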
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks with an external memory. One of the systems includes a controller neural network that includes a Least Recently Used Access (LRUA) subsystem configured to: maintain a respective usage weight for each of a plurality of locations in the external memory, and for each of the plurality of time steps: generate a respective reading weight for each location using a read key, read data from the locations in accordance with the reading weights, generate a respective writing weight for each of the locations from a respective reading weight from a preceding time step and the respective usage weight for the location, write a write vector to the locations in accordance with the writing weights, and update the respective usage weight from the respective reading weight and the respective writing weight.
20 Claims
1. An augmented neural network system for processing a sequence of system inputs to generate a sequence of system outputs, the augmented neural network system comprising:
- a controller neural network configured to receive a neural network input at each of a plurality of time steps and to process the neural network input to generate a neural network output for the time step, wherein each neural network output includes:
  - a read key, and
  - a write vector;
- an external memory; and
- a Least Recently Used Access (LRUA) subsystem that is configured to:
  - maintain a respective usage weight for each of a plurality of locations in the external memory that represents a strength with which the location has recently been written to or read from by the LRUA subsystem, and
  - for each of the plurality of time steps:
    - generate a respective reading weight for each of the plurality of locations in the external memory using the read key,
    - read data from the plurality of locations in the external memory in accordance with the reading weights,
    - generate a respective writing weight for each of the plurality of locations in the external memory from a respective reading weight for the location from a preceding time step and the respective usage weight for the location,
    - write the write vector to the plurality of locations in the external memory in accordance with the writing weights, and
    - update the respective usage weight for each of the plurality of locations in the external memory from the respective reading weight for the location and the respective writing weight for the location.

Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
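Claim 1 states only that writing weights are generated from the preceding time step's reading weights and the usage weights, and that usage weights are updated from the current reading and writing weights. It does not fix formulas. The sketch below shows one possible instantiation in Python. The sigmoid gate (`gate_logit`), the decay factor (`decay`), the least-used mask, and the additive write are all assumptions made for illustration, not details taken from the claim.

```python
import math

def least_used_mask(usage, n=1):
    # Assumption: mark the n locations with the smallest usage weight with 1.0.
    order = sorted(range(len(usage)), key=lambda i: usage[i])
    chosen = set(order[:n])
    return [1.0 if i in chosen else 0.0 for i in range(len(usage))]

def writing_weights(prev_read_w, usage, gate_logit):
    # Assumption: interpolate between the previous reading weights and the
    # least-used mask with a sigmoid gate.
    g = 1.0 / (1.0 + math.exp(-gate_logit))
    mask = least_used_mask(usage)
    return [g * r + (1.0 - g) * m for r, m in zip(prev_read_w, mask)]

def write_memory(memory, write_w, write_vector):
    # Assumption: additive write, scaled per location by its writing weight.
    return [[cell + w * v for cell, v in zip(row, write_vector)]
            for row, w in zip(memory, write_w)]

def update_usage(usage, read_w, write_w, decay=0.95):
    # Assumption: decayed previous usage plus current reading and writing weights.
    return [decay * u + r + w for u, r, w in zip(usage, read_w, write_w)]

# Hypothetical state for a 3-location memory.
usage = [0.9, 0.1, 0.5]
prev_read_w = [0.7, 0.2, 0.1]   # reading weights from the preceding time step
read_w = [0.6, 0.3, 0.1]        # reading weights from the current time step
memory = [[0.0, 0.0] for _ in range(3)]

write_w = writing_weights(prev_read_w, usage, gate_logit=0.0)
new_memory = write_memory(memory, write_w, write_vector=[1.0, 2.0])
new_usage = update_usage(usage, read_w, write_w)
```

With a gate of 0.5, half of the write goes to the locations read at the preceding step and half to the least-used location (index 1 here), so a write can either refresh recently read content or reuse a stale slot.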
14. A computer-implemented method for processing a sequence of system inputs to generate a sequence of system outputs, the method comprising:
- maintaining a respective usage weight for each of a plurality of locations in an external memory that represents a strength with which the location has recently been written to or read from;
- receiving a neural network input;
- processing the neural network input using a controller neural network to generate a neural network output, wherein the neural network output includes a read key and a write vector;
- generating a respective reading weight for each of the plurality of locations in the external memory using the read key;
- reading data from the plurality of locations in the external memory in accordance with the reading weights;
- generating a respective writing weight for each of the plurality of locations in the external memory from a respective reading weight for the location from a preceding time step and the respective usage weight for the location;
- writing the write vector to the plurality of locations in the external memory in accordance with the writing weights; and
- updating the respective usage weight for each of the plurality of locations in the external memory from the respective reading weight for the location and the respective writing weight for the location.

Dependent claims: 15, 16, 17, 18, 19
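The order of the method steps, and the state carried from one time step to the next, can be sketched as a single loop. This is a rough illustration under stated assumptions, not the claimed implementation: `stub_controller` is a hypothetical stand-in for a trained controller network, and the cosine/softmax addressing, fixed `gate`, `decay` factor, and additive write are illustrative choices that the claim does not specify.

```python
import math

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def stub_controller(x):
    # Hypothetical controller: emits the input as both read key and write vector.
    return list(x), list(x)

def run_sequence(inputs, num_locations=4, gate=0.5, decay=0.95):
    width = len(inputs[0])
    memory = [[0.0] * width for _ in range(num_locations)]
    usage = [0.0] * num_locations                          # maintained usage weights
    prev_read_w = [1.0 / num_locations] * num_locations    # assumed initial value
    reads = []
    for x in inputs:                                       # receive an input
        read_key, write_vector = stub_controller(x)        # controller output
        # Reading weights from the read key, then a weighted read.
        read_w = softmax([cosine(row, read_key) for row in memory])
        reads.append([sum(w * row[j] for w, row in zip(read_w, memory))
                      for j in range(width)])
        # Writing weights from the previous reading weights and the usage weights.
        least_used = min(range(num_locations), key=lambda i: usage[i])
        write_w = [gate * prev_read_w[i]
                   + (1.0 - gate) * (1.0 if i == least_used else 0.0)
                   for i in range(num_locations)]
        # Write the write vector in accordance with the writing weights.
        memory = [[c + w * v for c, v in zip(row, write_vector)]
                  for row, w in zip(memory, write_w)]
        # Update usage from the current reading and writing weights.
        usage = [decay * u + r + w for u, r, w in zip(usage, read_w, write_w)]
        prev_read_w = read_w
    return memory, usage, reads

inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
memory, usage, reads = run_sequence(inputs)
```

Only two pieces of state cross time steps in this sketch: the usage weights and the reading weights from the preceding step. The first read returns zeros because the memory starts empty.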
20. One or more non-transitory computer-readable media storing instructions that, when executed by one or more computers, cause the one or more computers to perform operations for generating a sequence of outputs from a sequence of inputs, the operations comprising:
- maintaining a respective usage weight for each of a plurality of locations in an external memory that represents a strength with which the location has recently been written to or read from;
- receiving a neural network input;
- processing the neural network input using a controller neural network to generate a neural network output, wherein the neural network output includes a read key and a write vector;
- generating a respective reading weight for each of the plurality of locations in the external memory using the read key;
- reading data from the plurality of locations in the external memory in accordance with the reading weights;
- generating a respective writing weight for each of the plurality of locations in the external memory from a respective reading weight for the location from a preceding time step and the respective usage weight for the location;
- writing the write vector to the plurality of locations in the external memory in accordance with the writing weights; and
- updating the respective usage weight for each of the plurality of locations in the external memory from the respective reading weight for the location and the respective writing weight for the location.
Specification