System and method for learning latent representations for natural language tasks
Abstract
Disclosed herein are systems, methods, and non-transitory computer-readable storage media for learning latent representations for natural language tasks. A system configured to practice the method analyzes, for a first natural language processing task, a first natural language corpus to generate a latent representation for words in the first corpus. Then the system analyzes, for a second natural language processing task, a second natural language corpus having a target word, and predicts a label for the target word based on the latent representation. In one variation, the target word is one or more words, such as a rare word and/or a word not encountered in the first natural language corpus. The system can optionally assign the label to the target word. The system can operate according to a connectionist model that includes a learnable linear mapping that maps each word in the first corpus to a low dimensional latent space.
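As an illustration of the learnable linear mapping mentioned above, the following minimal sketch shows how multiplying a one-hot word vector by a parameter matrix yields that word's low dimensional latent vector. The toy vocabulary, the latent dimension of 2, the random initial values, and the helper names `one_hot`, `linear_map`, and `embed` are all assumptions made for illustration; they are not taken from the disclosure.

```python
import random


def one_hot(index, size):
    """Return a list of length `size` with 1.0 at `index` and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def linear_map(vec, matrix):
    """Multiply a row vector of length V by a V x d matrix (list of rows)."""
    dim = len(matrix[0])
    return [sum(vec[i] * matrix[i][j] for i in range(len(vec))) for j in range(dim)]


# Hypothetical toy vocabulary and latent dimension (assumptions, not from the source).
vocab = {"the": 0, "bank": 1, "river": 2, "loan": 3}
latent_dim = 2

# The learnable parameters: one row of latent coordinates per vocabulary word.
# A real system would adjust these during training; here they are only
# randomly initialised with a fixed seed.
rng = random.Random(0)
embedding = [[rng.uniform(-1.0, 1.0) for _ in range(latent_dim)] for _ in vocab]


def embed(word):
    """Map a vocabulary word to its low dimensional latent vector."""
    return linear_map(one_hot(vocab[word], len(vocab)), embedding)


bank_vec = embed("bank")
```

Because the input is one-hot, the product simply selects one row of the matrix, which is why such a mapping is commonly implemented as a table lookup rather than an explicit multiplication.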
20 Claims
1. A method comprising:
analyzing, for a first natural language processing task, a first natural language corpus to generate a latent representation for words in the first natural language corpus;
calculating, for each word in the latent representation, a Euclidian distance between a left context of the each word and a right context of the each word, to yield a centroid of latent vectors for each word in the latent representation;
analyzing, for a second natural language processing task, a second natural language corpus having a target word, the target word being a word that is not in the first natural language corpus; and
predicting, via a processor, a label for the target word based on the latent representation and the centroid of latent vectors for each word in the latent representation, wherein the predicting comprises iteratively executing an alternating gradient descent algorithm until convergence, the alternating gradient descent algorithm comprising, for each iteration, computing a low dimensional continuous embedding and passing the low dimensional continuous embedding through a multi-layer perceptron.
(Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9)
10. A system comprising:
a processor; and
a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising;
analyzing, for a first natural language processing task, a first natural language corpus to generate a latent representation for words in the first natural language corpus;
calculating, for each word in the latent representation, a Euclidian distance between a left context of the each word and a right context of the each word, to yield a centroid of latent vectors for each word in the latent representation;
analyzing, for a second natural language processing task, a second natural language corpus having a target word, the target word being a word that is not in the first natural language corpus; and
predicting, via a processor, a label for the target word based on the latent representation and the centroid of latent vectors for each word in the latent representation, wherein the predicting comprises iteratively executing an alternating gradient descent algorithm until convergence, the alternating gradient descent algorithm comprising, for each iteration, computing a low dimensional continuous embedding and passing the low dimensional continuous embedding through a multi-layer perceptron.
(Dependent claims: 11, 12, 13, 14, 15)
16. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
analyzing, for a first natural language processing task, a first natural language corpus to generate a latent representation for words in the first natural language corpus;
calculating, for each word in the latent representation, a Euclidian distance between a left context of the each word and a right context of the each word, to yield a centroid of latent vectors for each word in the latent representation;
analyzing, for a second natural language processing task, a second natural language corpus having a target word, the target word being a word that is not in the first natural language corpus; and
predicting, via a processor, a label for the target word based on the latent representation and the centroid of latent vectors for each word in the latent representation, wherein the predicting comprises iteratively executing an alternating gradient descent algorithm until convergence, the alternating gradient descent algorithm comprising, for each iteration, computing a low dimensional continuous embedding and passing the low dimensional continuous embedding through a multi-layer perceptron.
(Dependent claims: 17, 18, 19, 20)
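The calculating step recited in the independent claims can be pictured with a small sketch. The claims do not spell out how the Euclidian distance and the centroid are combined, so the code below is only one possible reading: it averages the latent vectors of the words to the left and to the right of a target position, measures the distance between those two averages, and also averages all context vectors into a single centroid that could stand in for a target word absent from the first corpus. The latent values, the window size of 2, the made-up word "zorblat", and the helper names `centroid`, `euclidean`, and `context_summary` are hypothetical.

```python
import math

# Hypothetical 2-dimensional latent vectors for words seen in the first corpus.
latent = {
    "pay": [1.0, -1.0],
    "into": [-1.0, 1.0],
    "account": [2.0, 4.0],
    "today": [4.0, 4.0],
}


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def context_summary(tokens, position, window=2):
    """Return (left/right centroid distance, centroid of all context vectors).

    Only context words that have a latent vector are used; the word at
    `position` itself is never looked up, so it may be unseen.
    """
    left = [latent[w] for w in tokens[max(0, position - window):position] if w in latent]
    right = [latent[w] for w in tokens[position + 1:position + 1 + window] if w in latent]
    if not left or not right:
        raise ValueError("need at least one known word on each side")
    distance = euclidean(centroid(left), centroid(right))
    return distance, centroid(left + right)


# "zorblat" plays the role of a target word that is not in the first corpus.
tokens = ["pay", "into", "zorblat", "account", "today"]
distance, context_centroid = context_summary(tokens, 2)
print(distance, context_centroid)  # 5.0 [1.5, 2.0]
```

In this reading, the context centroid gives the unseen word a position in the latent space that a downstream classifier, such as the multi-layer perceptron named in the claims, could consume. The classifier and the alternating gradient descent training loop are not shown here.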
Specification