Processing text sequences using neural networks
1 Assignment
0 Petitions
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language modeling. In one aspect, a system comprises: a masked convolutional decoder neural network that comprises a plurality of masked convolutional neural network layers and is configured to generate a respective probability distribution over a set of possible target embeddings at each of a plurality of time steps; and a modeling engine that is configured to use the respective probability distribution generated by the decoder neural network at each of the plurality of time steps to estimate a probability that a string represented by the target embeddings corresponding to the plurality of time steps belongs to the natural language.
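The masked convolutional decoder described in the abstract can be sketched as a stack of causal ("masked") 1-D convolutions, where the output at each time step depends only on target embeddings from earlier steps. The following is a minimal NumPy sketch of that idea, not the patented implementation; the layer sizes, kernel width, tanh nonlinearity, and random weights are all illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def causal_conv1d(x, w):
    """Masked (causal) 1-D convolution: the output at step t depends only on
    inputs at steps <= t.  x: (T, d_in); w: (k, d_in, d_out), kernel size k."""
    k, _, d_out = w.shape
    # Left-pad with zeros so each output position sees only current and earlier inputs.
    xp = np.concatenate([np.zeros((k - 1, x.shape[1])), x], axis=0)
    T = x.shape[0]
    out = np.empty((T, d_out))
    for t in range(T):
        window = xp[t:t + k]                     # inputs at steps t-k+1 .. t
        out[t] = np.einsum('ki,kio->o', window, w)
    return out

rng = np.random.default_rng(0)
T, d, V = 5, 8, 12                               # time steps, embedding dim, vocab size
# Target embeddings for previous time steps (the input sequence is assumed
# already shifted so step t's distribution conditions only on earlier embeddings).
embeddings = rng.normal(size=(T, d))
w1 = rng.normal(size=(3, d, d)) * 0.1            # one masked convolutional layer
w_out = rng.normal(size=(1, d, V)) * 0.1         # 1x1 projection to vocabulary logits

h = np.tanh(causal_conv1d(embeddings, w1))
probs = softmax(causal_conv1d(h, w_out))         # (T, V): one distribution per step
```

Because of the causal padding, perturbing the embedding at the final step leaves the distributions at all earlier steps unchanged, which is the property the claimed masking provides.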
36 Citations
17 Claims
1. A language modeling system implemented by one or more computers, the language modeling system comprising:

a masked convolutional decoder neural network that comprises a plurality of masked convolutional neural network layers and is configured to generate a respective probability distribution over a set of possible target embeddings at each of a plurality of time steps, comprising, at each time step of the plurality of time steps:

processing target embeddings corresponding to previous time steps using the plurality of masked convolutional neural network layers of the masked convolutional decoder neural network to generate a current probability distribution over the set of possible target embeddings;

wherein each target embedding in the set of possible target embeddings corresponds to a respective character or word in a natural language; and

instructions that, when executed by the one or more computers, cause the one or more computers to perform operations comprising using the respective probability distribution generated by the decoder neural network at each of the plurality of time steps to estimate a probability that a string represented by the target embeddings corresponding to the plurality of time steps belongs to the natural language. - View Dependent Claims (2, 3, 4, 5, 6, 7)
8. One or more non-transitory computer readable storage media storing instructions executable by a data processing apparatus and that upon such execution cause the data processing apparatus to perform language modeling operations comprising:

using a masked convolutional decoder neural network that comprises a plurality of masked convolutional neural network layers to generate a respective probability distribution over a set of possible target embeddings at each of a plurality of time steps, comprising, at each time step of the plurality of time steps:

processing target embeddings corresponding to previous time steps using the plurality of masked convolutional neural network layers of the masked convolutional decoder neural network to generate a current probability distribution over the set of possible target embeddings;

wherein each target embedding in the set of possible target embeddings corresponds to a respective character or word in a natural language; and

using the respective probability distribution generated by the decoder neural network at each of the plurality of time steps to estimate a probability that a string represented by the target embeddings corresponding to the plurality of time steps belongs to the natural language. - View Dependent Claims (9, 10, 11, 12, 13, 14)
15. A language modeling method performed by one or more data processing apparatus, the language modeling method comprising:

using a masked convolutional decoder neural network that comprises a plurality of masked convolutional neural network layers to generate a respective probability distribution over a set of possible target embeddings at each of a plurality of time steps, comprising, at each time step of the plurality of time steps:

processing target embeddings corresponding to previous time steps using the plurality of masked convolutional neural network layers of the masked convolutional decoder neural network to generate a current probability distribution over the set of possible target embeddings;

wherein each target embedding in the set of possible target embeddings corresponds to a respective character or word in a natural language; and

using the respective probability distribution generated by the decoder neural network at each of the plurality of time steps to estimate a probability that a string represented by the target embeddings corresponding to the plurality of time steps belongs to the natural language. - View Dependent Claims (16, 17)
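The final element shared by the independent claims — using the per-step distributions to estimate the probability that a string belongs to the natural language — is the chain rule of probability, typically computed in log space for numerical stability. A minimal sketch with hypothetical toy distributions (the values and vocabulary size are illustrative, not from the patent):

```python
import math

def string_log_probability(step_distributions, token_ids):
    """Chain rule: log p(string) = sum over t of log p(token_t | tokens_<t).
    step_distributions[t] is the decoder's distribution at time step t;
    token_ids[t] is the index of the embedding actually observed at step t."""
    return sum(math.log(dist[tok])
               for dist, tok in zip(step_distributions, token_ids))

# Toy per-step distributions over a 3-symbol vocabulary (hypothetical values).
dists = [
    [0.70, 0.20, 0.10],
    [0.10, 0.80, 0.10],
    [0.25, 0.25, 0.50],
]
tokens = [0, 1, 2]
logp = string_log_probability(dists, tokens)  # log(0.7 * 0.8 * 0.5)
```

Exponentiating `logp` recovers the estimated probability of the whole string; summing logs rather than multiplying probabilities avoids underflow on long sequences.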
Specification