Compressed recurrent neural network models

  • US 10,878,319 B2
  • Filed: 12/29/2016
  • Issued: 12/29/2020
  • Est. Priority Date: 02/03/2016
  • Status: Active Grant
First Claim

1. A system comprising:

  • data processing hardware; and

    memory hardware in communication with the data processing hardware and storing instructions that when executed on the data processing hardware cause the data processing hardware to perform operations comprising:

    training an uncompressed version of a recurrent neural network (RNN) on training data to learn a respective recurrent weight matrix, Wh, and a respective inter-layer weight matrix, Wx, for each of a plurality of uncompressed recurrent layers of the uncompressed version of the RNN, each recurrent layer of the plurality of uncompressed recurrent layers configured to, for each of a plurality of time steps:

    receive a respective layer input for the time step; and

    process the respective layer input for the time step to generate a respective layer output for the time step;

    re-configuring the trained RNN by, for at least one recurrent layer of the plurality of uncompressed recurrent layers of the uncompressed version of the trained RNN, compressing the recurrent layer by:

    determining a respective singular value decomposition (SVD) of the respective recurrent weight matrix, Wh, for the recurrent layer;

    generating a first compressed weight matrix, Zhl, and a projection matrix, Pl, based on the respective SVD of the respective recurrent weight matrix, Wh, for the recurrent layer;

    generating a second compressed weight matrix, Zxl, based on the first compressed weight matrix, Zhl, and the projection matrix, Pl;

    replacing the respective recurrent weight matrix, Wh, with the product of the first compressed weight matrix, Zhl, and the projection matrix, Pl; and

    replacing the respective inter-layer weight matrix, Wx, with the product of the second compressed weight matrix, Zxl, and the projection matrix, Pl; and

    transmitting the re-configured trained RNN having the at least one compressed recurrent layer to a mobile device in communication with the data processing hardware, the re-configured trained RNN having the at least one compressed recurrent layer configured to receive a respective neural network input at each of multiple time steps and generate a respective neural network output at each of the multiple time steps.
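
The compression steps recited in the claim (SVD of Wh, then the matrices Zhl, Pl and Zxl) can be illustrated with a minimal NumPy sketch. This is one possible reading, not the claimed implementation. Assumptions not stated in the claim: the SVD is truncated to a chosen `rank`; Zhl is taken as the left singular vectors scaled by the singular values and Pl as the leading right singular vectors; Zxl is obtained by a least-squares fit of Zxl·Pl to Wx (the claim says only that Zxl is "based on" Zhl and Pl); and Wx has the same number of columns as Wh. The function name `compress_recurrent_layer` and its parameters are hypothetical.

```python
import numpy as np

def compress_recurrent_layer(W_h, W_x, rank):
    """Hypothetical helper: low-rank compression of one recurrent layer.

    W_h  : (n, n) recurrent weight matrix.
    W_x  : (m, n) inter-layer weight matrix (assumed to share W_h's column size).
    rank : number of singular values to keep (an assumed hyperparameter).
    Returns Z_h (n, rank), Z_x (m, rank), P (rank, n).
    """
    # SVD of the recurrent weight matrix: W_h = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(W_h, full_matrices=False)

    # First compressed matrix and projection matrix from the truncated SVD,
    # so that Z_h @ P is the best rank-`rank` approximation of W_h.
    Z_h = U[:, :rank] * s[:rank]   # scale each kept column by its singular value
    P = Vt[:rank, :]

    # Second compressed matrix (assumption): least-squares solution of
    # Z_x @ P ~= W_x, solved in transposed form P.T @ Z_x.T ~= W_x.T.
    Z_x = np.linalg.lstsq(P.T, W_x.T, rcond=None)[0].T
    return Z_h, Z_x, P

# Usage: W_h is replaced by Z_h @ P and W_x by Z_x @ P.
rng = np.random.default_rng(0)
W_h = rng.standard_normal((8, 8))
W_x = rng.standard_normal((6, 8))
Z_h, Z_x, P = compress_recurrent_layer(W_h, W_x, rank=3)
params_before = W_h.size + W_x.size
params_after = Z_h.size + Z_x.size + P.size
```

Because Pl is shared by both products, the layer stores three thin matrices in place of two full ones; in this toy setting that is 66 parameters instead of 112, at the cost of an approximation error that shrinks as `rank` grows.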
