Compressed recurrent neural network models
Abstract
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing long short-term memory layers with compressed gating functions. One of the systems includes a first long short-term memory (LSTM) layer, wherein the first LSTM layer is configured to, for each of the plurality of time steps, generate a new layer state and a new layer output by applying a plurality of gates to a current layer input, a current layer state, and a current layer output, each of the plurality of gates being configured to, for each of the plurality of time steps, generate a respective intermediate gate output vector by multiplying a gate input vector and a gate parameter matrix. The gate parameter matrix for at least one of the plurality of gates is a structured matrix or is defined by a compressed parameter matrix and a projection matrix.
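For context, the gate computation the abstract refers to can be sketched as an ordinary (uncompressed) LSTM step: each gate multiplies its gate input vector by a gate parameter matrix to produce an intermediate gate output vector, and the gates together produce the new layer state and new layer output. This is a minimal NumPy sketch, not the patent's implementation; the stacked-weight layout and all names are assumptions made here for illustration:

```python
import numpy as np

def lstm_step(x, h, c, W, b):
    """One LSTM time step.

    x: current layer input; h: current layer output; c: current layer state.
    W stacks the four gate parameter matrices (input, forget, cell, output)
    row-wise, so a single product yields all intermediate gate output vectors.
    """
    n = h.shape[0]
    v = np.concatenate([x, h])            # gate input vector
    z = W @ v + b                         # stacked intermediate gate output vectors
    sig = lambda a: 1.0 / (1.0 + np.exp(-a))
    i = sig(z[:n])                        # input gate
    f = sig(z[n:2 * n])                   # forget gate
    g = np.tanh(z[2 * n:3 * n])           # candidate state update
    o = sig(z[3 * n:])                    # output gate
    c_new = f * c + i * g                 # new layer state
    h_new = o * np.tanh(c_new)            # new layer output
    return h_new, c_new

# Tiny illustrative call (all-zero weights, so the outputs stay zero).
x = np.zeros(3)
h0, c0 = np.zeros(4), np.zeros(4)
W, b = np.zeros((16, 7)), np.zeros(16)
h1, c1 = lstm_step(x, h0, c0, W, b)
```

With a dense `W`, each gate parameter matrix costs a full block of `n * (len(x) + n)` entries; the two compression schemes in the claims below reduce exactly that cost.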
Claims (20)
1. A system for receiving a system input comprising a respective neural network input at each of a plurality of time steps and providing, in response to the received system input, a system output comprising a respective neural network output at each of the plurality of time steps, the system comprising:
a recurrent neural network implemented by one or more computers, wherein the recurrent neural network is configured to receive the respective neural network input at each of the plurality of time steps and to generate the respective neural network output at each of the plurality of time steps, and wherein the recurrent neural network comprises:

a first long short-term memory (LSTM) layer, wherein the first LSTM layer is configured to, for each of the plurality of time steps, generate a new layer state and a new layer output by applying a plurality of gates to a current layer input, a current layer state, and a current layer output,

each of the plurality of gates being configured to, for each of the plurality of time steps, generate a respective intermediate gate output vector by multiplying a gate input vector and a gate parameter matrix, and

wherein the gate parameter matrix for at least one of the plurality of gates is a Toeplitz-like structured matrix.

Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9.
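Claim 1's compression comes from the structure of the gate parameter matrix. A plain Toeplitz matrix is the simplest member of the Toeplitz-like family: it is constant along every diagonal, so an n x m matrix is determined by n + m - 1 values rather than n * m. The sketch below is illustrative only (NumPy assumed; the helper name is invented here), showing the construction and the gate product:

```python
import numpy as np

def toeplitz_gate_matrix(first_col, first_row):
    """Build an n x m Toeplitz matrix from its first column and first row.

    T[i, j] = first_col[i - j] when i >= j, else first_row[j - i],
    so only n + m - 1 parameters are stored instead of n * m.
    (first_col[0] and first_row[0] must agree.)
    """
    c, r = np.asarray(first_col), np.asarray(first_row)
    n, m = len(c), len(r)
    vals = np.concatenate([r[::-1], c[1:]])   # one stored value per diagonal
    # idx[i, j] selects the diagonal i - j (offset so indices are nonnegative).
    idx = (m - 1) + np.arange(n)[:, None] - np.arange(m)[None, :]
    return vals[idx]

T = toeplitz_gate_matrix([1, 2, 3], [1, 4, 5])
# The gate then multiplies its gate input vector by this structured matrix.
z = T @ np.ones(3)                            # intermediate gate output vector
```

"Toeplitz-like" in the structured-matrix literature is broader (low displacement rank, covering sums and products of Toeplitz factors), but the storage saving shown here is the same idea.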
10. A system for receiving a system input comprising a respective neural network input at each of a plurality of time steps and providing, in response to the received system input, a system output comprising a respective neural network output at each of the plurality of time steps, the system comprising:
a recurrent neural network implemented by one or more computers, wherein the recurrent neural network is configured to receive the respective neural network input at each of the plurality of time steps and to generate the respective neural network output at each of the plurality of time steps, and wherein the recurrent neural network comprises:

a first long short-term memory (LSTM) layer, wherein the first LSTM layer is configured to, for each of the plurality of time steps, generate a new layer state and a new layer output by applying a plurality of gates to a current layer input, a current layer state, and a current layer output,

each of the plurality of gates being configured to, for each of the plurality of time steps, generate a respective intermediate gate output vector by multiplying a gate input vector and a gate parameter matrix, and

wherein the gate parameter matrix for at least one of the plurality of gates is defined by a compressed parameter matrix and a projection matrix.

Dependent claims: 11, 12, 13, 14, 15, 16, 17, 18.
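Claim 10's alternative replaces the dense gate parameter matrix with the product of a compressed parameter matrix and a projection matrix, i.e. a low-rank factorization: the gate input vector is first projected to a smaller dimension and then expanded. A minimal NumPy sketch under that reading (all names and sizes are illustrative assumptions, not the patent's notation):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 2                                  # layer width, projection rank (k < n)

compressed = rng.standard_normal((n, k))     # compressed parameter matrix
projection = rng.standard_normal((k, n))     # projection matrix

h = rng.standard_normal(n)                   # gate input vector
z = compressed @ (projection @ h)            # intermediate gate output vector

# The factorization trades n*n stored entries for n*k + k*n.
dense_params = n * n                         # 64 for an unstructured gate matrix
factored_params = n * k + k * n              # 32 here
```

Because the projected vector `projection @ h` does not depend on which gate is being computed, one projection matrix can in principle be shared across several gates, amortizing its cost further.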
19. One or more non-transitory computer storage media encoded with a computer program product, the computer program product comprising instructions that when executed by one or more computers cause the one or more computers to implement a system for receiving a system input comprising a respective neural network input at each of a plurality of time steps and providing, in response to the received system input, a system output comprising a respective neural network output at each of the plurality of time steps, the system comprising:
a recurrent neural network, wherein the recurrent neural network is configured to receive the respective neural network input at each of the plurality of time steps and to generate the respective neural network output at each of the plurality of time steps, and wherein the recurrent neural network comprises:

a first long short-term memory (LSTM) layer, wherein the first LSTM layer is configured to, for each of the plurality of time steps, generate a new layer state and a new layer output by applying a plurality of gates to a current layer input, a current layer state, and a current layer output,

each of the plurality of gates being configured to, for each of the plurality of time steps, generate a respective intermediate gate output vector by multiplying a gate input vector and a gate parameter matrix, and

wherein the gate parameter matrix for at least one of the plurality of gates is a Toeplitz-like structured matrix.
20. One or more non-transitory computer storage media encoded with a computer program product, the computer program product comprising instructions that when executed by one or more computers cause the one or more computers to implement a system for receiving a system input comprising a respective neural network input at each of a plurality of time steps and providing, in response to the received system input, a system output comprising a respective neural network output at each of the plurality of time steps, the system comprising:
a recurrent neural network, wherein the recurrent neural network is configured to receive the respective neural network input at each of the plurality of time steps and to generate the respective neural network output at each of the plurality of time steps, and wherein the recurrent neural network comprises:

a first long short-term memory (LSTM) layer, wherein the first LSTM layer is configured to, for each of the plurality of time steps, generate a new layer state and a new layer output by applying a plurality of gates to a current layer input, a current layer state, and a current layer output,

each of the plurality of gates being configured to, for each of the plurality of time steps, generate a respective intermediate gate output vector by multiplying a gate input vector and a gate parameter matrix, and

wherein the gate parameter matrix for at least one of the plurality of gates is defined by a compressed parameter matrix and a projection matrix.
Specification