System and method for learning latent representations for natural language tasks

US 9,135,241 B2
Filed: 12/08/2010
Issued: 09/15/2015
Est. Priority Date: 12/08/2010
Status: Active Grant

Abstract

Disclosed herein are systems, methods, and non-transitory computer-readable storage media for learning latent representations for natural language tasks. A system configured to practice the method analyzes, for a first natural language processing task, a first natural language corpus to generate a latent representation for words in the first corpus. Then the system analyzes, for a second natural language processing task, a second natural language corpus having a target word, and predicts a label for the target word based on the latent representation. In one variation, the target word is one or more words, such as a rare word and/or a word not encountered in the first natural language corpus. The system can optionally assign the label to the target word. The system can operate according to a connectionist model that includes a learnable linear mapping that maps each word in the first corpus to a low dimensional latent space.
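
The connectionist model described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name `LatentTagger`, the layer sizes, the tanh hidden layer, and the softmax output are all assumptions made for illustration. The sketch shows only that a learnable linear mapping applied to a one-hot word vector amounts to a row lookup in an embedding table, whose output a small classifier then scores. The weights here are random and untrained.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Build a rows x cols matrix of small random weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class LatentTagger:
    """Hypothetical model: embedding lookup followed by a one-hidden-layer classifier."""

    def __init__(self, vocab, labels, dim=4, hidden=6, seed=0):
        rng = random.Random(seed)
        self.index = {word: i for i, word in enumerate(vocab)}
        self.labels = list(labels)
        # Row i is the latent vector of word i. Multiplying a one-hot vector
        # by this matrix selects that row, so the lookup is a linear mapping.
        self.embedding = make_matrix(len(vocab), dim, rng)
        self.w_hidden = make_matrix(hidden, dim, rng)
        self.w_out = make_matrix(len(self.labels), hidden, rng)

    def embed(self, word):
        """Map a known word to its low dimensional latent vector."""
        return self.embedding[self.index[word]]

    def predict_proba(self, word):
        """Score every label for the word (weights are untrained here)."""
        hidden = [math.tanh(a) for a in matvec(self.w_hidden, self.embed(word))]
        return softmax(matvec(self.w_out, hidden))

    def predict(self, word):
        """Return the highest-scoring label."""
        probs = self.predict_proba(word)
        return self.labels[probs.index(max(probs))]
```

Calling `LatentTagger(["the", "dog", "barks"], ["DET", "NOUN", "VERB"]).predict("dog")` returns one of the three labels. Which one it returns carries no meaning until the weights are trained.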

20 Claims

1. A method comprising:
- analyzing, for a first natural language processing task, a first natural language corpus to generate a latent representation for words in the first natural language corpus;
  
  calculating, for each word in the latent representation, a Euclidian distance between a left context of the each word and a right context of the each word, to yield a centroid of latent vectors for each word in the latent representation;
  
  analyzing, for a second natural language processing task, a second natural language corpus having a target word, the target word being a word that is not in the first natural language corpus; and
  
  predicting, via a processor, a label for the target word based on the latent representation and the centroid of latent vectors for each word in the latent representation, wherein the predicting comprises iteratively executing an alternating gradient descent algorithm until convergence, the alternating gradient descent algorithm comprising, for each iteration, computing a low dimensional continuous embedding and passing the low dimensional continuous embedding through a multi-layer perceptron.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9)
  - 2. The method of claim 1, wherein the target word is one of a rare word and a word not encountered in the first natural language corpus.
  - 3. The method of claim 1, wherein predicting the label for the target word is further based on a connectionist model.
  - 4. The method of claim 3, wherein the connectionist model comprises a learnable linear mapping which maps each word in the first natural language corpus to a low dimensional latent space.
  - 5. The method of claim 3, wherein the connectionist model comprises a classifier that classifies low dimensional representations of words.
  - 6. The method of claim 1, further comprising assigning the label to the target word.
  - 7. The method of claim 1, wherein the second natural language corpus comprises an input sentence, and wherein the method further comprises performing the predicting of the label for each word in the input sentence in parallel.
  - 8. The method of claim 1, wherein the second natural language processing task is a supertagging task, the supertagging task comprising assigning a lexical entry to the target word.
  - 9. The method of claim 1, wherein the target word is a collection of target words.
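
The distance and centroid step recited in claim 1 can be sketched as follows. The claim does not say how a left or right context is represented, so this sketch makes two assumptions: each context is summarized by the mean of its words' latent vectors, and the centroid for the word is the midpoint of those two summaries. The function names and the `latent` dictionary are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


def context_summary(latent, left_words, right_words):
    """Summarize a word's surroundings from the latent vectors of its context.

    Assumption: each context is represented by the mean of its word vectors.
    """
    left = centroid([latent[w] for w in left_words])
    right = centroid([latent[w] for w in right_words])
    return {
        "distance": euclidean_distance(left, right),
        "centroid": centroid([left, right]),
    }
```

Under these assumptions, a target word that is absent from the first corpus could be given the returned centroid as a stand-in latent vector, since that vector depends only on words whose latent vectors are already known.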

10. A system comprising:
- a processor; and
  
  a computer-readable storage medium having instructions stored which, when executed by the processor, cause the processor to perform operations comprising:
  
  analyzing, for a first natural language processing task, a first natural language corpus to generate a latent representation for words in the first natural language corpus;
  
  calculating, for each word in the latent representation, a Euclidian distance between a left context of the each word and a right context of the each word, to yield a centroid of latent vectors for each word in the latent representation;
  
  analyzing, for a second natural language processing task, a second natural language corpus having a target word, the target word being a word that is not in the first natural language corpus; and
  
  predicting, via a processor, a label for the target word based on the latent representation and the centroid of latent vectors for each word in the latent representation, wherein the predicting comprises iteratively executing an alternating gradient descent algorithm until convergence, the alternating gradient descent algorithm comprising, for each iteration, computing a low dimensional continuous embedding and passing the low dimensional continuous embedding through a multi-layer perceptron.
- Dependent claims (11, 12, 13, 14, 15)
  - 11. The system of claim 10, wherein the target word is one of a rare word and a word not encountered in the first natural language corpus.
  - 12. The system of claim 10, the computer-readable storage medium having additional instructions stored which, when executed by the processor, result in operations comprising:
    - predicting the label for the target word based on a connectionist model.
  - 13. The system of claim 12, wherein the connectionist model comprises a learnable linear mapping which maps each word in the first natural language corpus to a low dimensional latent space.
  - 14. The system of claim 12, wherein the connectionist model comprises a classifier that classifies low dimensional representations of words.
  - 15. The system of claim 10, the computer-readable storage medium having additional instructions stored which, when executed by the processor, result in operations comprising:
    - assigning the label to the target word.
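
The alternating gradient descent recited in the independent claims can be sketched as below. This is one possible reading, not the patented procedure. Assumptions: a binary label, a one-hidden-layer perceptron with tanh units and a sigmoid output, a fixed learning rate, and convergence defined as the mean loss changing by less than `tol`. Each iteration first updates the word embeddings with the perceptron frozen, then updates the perceptron with the embeddings frozen. All names are hypothetical.

```python
import math
import random


def forward(e, W, v):
    """Pass embedding e through the perceptron; return hidden units and P(label=1)."""
    h = [math.tanh(sum(w * x for w, x in zip(row, e))) for row in W]
    z = sum(vj * hj for vj, hj in zip(v, h))
    return h, 1.0 / (1.0 + math.exp(-z))


def mean_loss(data, E, W, v):
    """Mean binary cross-entropy over (word, label) pairs."""
    total = 0.0
    for word, y in data:
        _, p = forward(E[word], W, v)
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)


def train(data, dim=2, hidden=3, lr=0.5, tol=1e-6, max_iters=500, seed=0):
    """Alternate between embedding updates and perceptron updates."""
    rng = random.Random(seed)
    E = {w: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for w, _ in data}
    W = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(hidden)]
    v = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    previous = mean_loss(data, E, W, v)
    for _ in range(max_iters):
        # Phase 1: move the embeddings while the perceptron stays fixed.
        for word, y in data:
            e = E[word]
            h, p = forward(e, W, v)
            da = [(p - y) * vj * (1 - hj * hj) for vj, hj in zip(v, h)]
            grad_e = [sum(da[j] * W[j][k] for j in range(hidden)) for k in range(dim)]
            E[word] = [x - lr * g for x, g in zip(e, grad_e)]
        # Phase 2: move the perceptron while the embeddings stay fixed.
        for word, y in data:
            e = E[word]
            h, p = forward(e, W, v)
            da = [(p - y) * vj * (1 - hj * hj) for vj, hj in zip(v, h)]
            v = [vj - lr * (p - y) * hj for vj, hj in zip(v, h)]
            W = [[w - lr * da[j] * x for w, x in zip(W[j], e)] for j in range(hidden)]
        current = mean_loss(data, E, W, v)
        if abs(previous - current) < tol:  # assumed convergence test
            break
        previous = current
    return E, W, v
```

A call such as `train([("dog", 1), ("cat", 1), ("runs", 0), ("eats", 0)])` returns the learned embeddings and perceptron weights. Freezing one block of parameters per phase is what makes the descent "alternating"; a joint update of both blocks in a single pass would be a different design.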

16. A computer-readable storage device having instructions stored which, when executed by a computing device, cause the computing device to perform operations comprising:
- analyzing, for a first natural language processing task, a first natural language corpus to generate a latent representation for words in the first natural language corpus;
  
  calculating, for each word in the latent representation, a Euclidian distance between a left context of the each word and a right context of the each word, to yield a centroid of latent vectors for each word in the latent representation;
  
  analyzing, for a second natural language processing task, a second natural language corpus having a target word, the target word being a word that is not in the first natural language corpus; and
  
  predicting, via a processor, a label for the target word based on the latent representation and the centroid of latent vectors for each word in the latent representation, wherein the predicting comprises iteratively executing an alternating gradient descent algorithm until convergence, the alternating gradient descent algorithm comprising, for each iteration, computing a low dimensional continuous embedding and passing the low dimensional continuous embedding through a multi-layer perceptron.
- Dependent claims (17, 18, 19, 20)
  - 17. The computer-readable storage device of claim 16, wherein predicting the label for the target word is further based on a connectionist model, and wherein the connectionist model comprises a classifier that classifies low dimensional representations of words.
  - 18. The computer-readable storage device of claim 16, having additional instructions stored which, when executed by the computing device, result in operations comprising assigning the label to the target word.
  - 19. The computer-readable storage device of claim 16, wherein the second natural language corpus comprises an input sentence;
    - and the computer-readable storage device has additional instructions stored which, when executed by the computing device, result in operations comprising performing the predicting of the label for each word in the input sentence in parallel.
  - 20. The computer-readable storage device of claim 16, wherein the second natural language processing task is a supertagging task, the supertagging task comprising assigning a lexical entry to the target word.
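
Claims 7 and 19 recite predicting a label for each word of an input sentence in parallel. A minimal sketch of that structure follows. `predict_label` is a hypothetical stand-in for a trained per-word predictor, whitespace tokenization is an assumption, and in standard CPython a thread pool illustrates the independent per-word structure rather than guaranteeing a speed-up for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def predict_label(word):
    """Hypothetical stand-in for a trained predictor: labels by capitalization."""
    return "CAP" if word[:1].isupper() else "LOWER"


def label_sentence(sentence, predictor=predict_label, workers=4):
    """Predict a label for every word independently, using a pool of workers."""
    words = sentence.split()  # assumed whitespace tokenization
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so labels line up with words.
        labels = list(pool.map(predictor, words))
    return list(zip(words, labels))
```

Because each prediction depends only on its own word, no coordination between workers is needed beyond collecting the results in order.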

Current Assignee: Microsoft Technology Licensing LLC (Microsoft Corporation)
Original Assignee: AT&T Intellectual Property I LP (AT&T, Inc.)
Inventors: Bangalore, Srinivas; Chopra, Sumit
Primary Examiner(s): SPOONER, LAMONT M
Application Number: US12/963,126
Publication Number: US 20120150531A1
Time in Patent Office: 1,742 Days
Field of Search: 704/1, 704/9, 704/10
US Class Current: 1/1
CPC Class Codes: G06F 40/40 Processing or translation o...