Processing text sequences using neural networks
1 Assignment
0 Petitions
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language modeling. In one aspect, a system comprises: a masked convolutional decoder neural network that comprises a plurality of masked convolutional neural network layers and is configured to generate a respective probability distribution over a set of possible target embeddings at each of a plurality of time steps; and a modeling engine that is configured to use the respective probability distribution generated by the decoder neural network at each of the plurality of time steps to estimate a probability that a string represented by the target embeddings corresponding to the plurality of time steps belongs to the natural language.
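The masked convolutional decoder described in the abstract can be sketched as a stack of causal ("masked") 1-D convolutions, where the output at each time step depends only on target embeddings from earlier steps. The following is a minimal NumPy sketch of that idea, not the patented implementation; the layer sizes, kernel width, tanh nonlinearity, and random weights are all illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def causal_conv1d(x, w):
    """Masked (causal) 1-D convolution: the output at step t depends only on
    inputs at steps <= t.  x: (T, d_in); w: (k, d_in, d_out), kernel size k."""
    k, _, d_out = w.shape
    # Left-pad with zeros so each output position sees only current and earlier inputs.
    xp = np.concatenate([np.zeros((k - 1, x.shape[1])), x], axis=0)
    T = x.shape[0]
    out = np.empty((T, d_out))
    for t in range(T):
        window = xp[t:t + k]                     # inputs at steps t-k+1 .. t
        out[t] = np.einsum('ki,kio->o', window, w)
    return out

rng = np.random.default_rng(0)
T, d, V = 5, 8, 12                               # time steps, embedding dim, vocab size
# Target embeddings for previous time steps (the input sequence is assumed
# already shifted so step t's distribution conditions only on earlier embeddings).
embeddings = rng.normal(size=(T, d))
w1 = rng.normal(size=(3, d, d)) * 0.1            # one masked convolutional layer
w_out = rng.normal(size=(1, d, V)) * 0.1         # 1x1 projection to vocabulary logits

h = np.tanh(causal_conv1d(embeddings, w1))
probs = softmax(causal_conv1d(h, w_out))         # (T, V): one distribution per step
```

Because of the causal padding, perturbing the embedding at the final step leaves the distributions at all earlier steps unchanged, which is the property the claimed masking provides.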
36 Citations
17 Claims
1. A language modeling system implemented by one or more computers, the language modeling system comprising:

a masked convolutional decoder neural network that comprises a plurality of masked convolutional neural network layers and is configured to generate a respective probability distribution over a set of possible target embeddings at each of a plurality of time steps, comprising, at each time step of the plurality of time steps:

processing target embeddings corresponding to previous time steps using the plurality of masked convolutional neural network layers of the masked convolutional decoder neural network to generate a current probability distribution over the set of possible target embeddings;

wherein each target embedding in the set of possible target embeddings corresponds to a respective character or word in a natural language; and

instructions that, when executed by the one or more computers, cause the one or more computers to perform operations comprising using the respective probability distribution generated by the decoder neural network at each of the plurality of time steps to estimate a probability that a string represented by the target embeddings corresponding to the plurality of time steps belongs to the natural language. - View Dependent Claims (2, 3, 4, 5, 6, 7)
8. One or more non-transitory computer readable storage media storing instructions executable by a data processing apparatus and that upon such execution cause the data processing apparatus to perform language modeling operations comprising:

using a masked convolutional decoder neural network that comprises a plurality of masked convolutional neural network layers to generate a respective probability distribution over a set of possible target embeddings at each of a plurality of time steps, comprising, at each time step of the plurality of time steps:

processing target embeddings corresponding to previous time steps using the plurality of masked convolutional neural network layers of the masked convolutional decoder neural network to generate a current probability distribution over the set of possible target embeddings;

wherein each target embedding in the set of possible target embeddings corresponds to a respective character or word in a natural language; and

using the respective probability distribution generated by the decoder neural network at each of the plurality of time steps to estimate a probability that a string represented by the target embeddings corresponding to the plurality of time steps belongs to the natural language. - View Dependent Claims (9, 10, 11, 12, 13, 14)
15. A language modeling method performed by one or more data processing apparatus, the language modeling method comprising:

using a masked convolutional decoder neural network that comprises a plurality of masked convolutional neural network layers to generate a respective probability distribution over a set of possible target embeddings at each of a plurality of time steps, comprising, at each time step of the plurality of time steps:

processing target embeddings corresponding to previous time steps using the plurality of masked convolutional neural network layers of the masked convolutional decoder neural network to generate a current probability distribution over the set of possible target embeddings;

wherein each target embedding in the set of possible target embeddings corresponds to a respective character or word in a natural language; and

using the respective probability distribution generated by the decoder neural network at each of the plurality of time steps to estimate a probability that a string represented by the target embeddings corresponding to the plurality of time steps belongs to the natural language. - View Dependent Claims (16, 17)
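The final element shared by the independent claims — using the per-step distributions to estimate the probability that a string belongs to the natural language — is the chain rule of probability, typically computed in log space for numerical stability. A minimal sketch with hypothetical toy distributions (the values and vocabulary size are illustrative, not from the patent):

```python
import math

def string_log_probability(step_distributions, token_ids):
    """Chain rule: log p(string) = sum over t of log p(token_t | tokens_<t).
    step_distributions[t] is the decoder's distribution at time step t;
    token_ids[t] is the index of the embedding actually observed at step t."""
    return sum(math.log(dist[tok])
               for dist, tok in zip(step_distributions, token_ids))

# Toy per-step distributions over a 3-symbol vocabulary (hypothetical values).
dists = [
    [0.70, 0.20, 0.10],
    [0.10, 0.80, 0.10],
    [0.25, 0.25, 0.50],
]
tokens = [0, 1, 2]
logp = string_log_probability(dists, tokens)  # log(0.7 * 0.8 * 0.5)
```

Exponentiating `logp` recovers the estimated probability of the whole string; summing logs rather than multiplying probabilities avoids underflow on long sequences.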
Specification