Cross-lingual discriminative learning of sequence models with posterior regularization
Abstract
A computer-implemented method can include obtaining (i) an aligned bi-text for a source language and a target language, and (ii) a supervised sequence model for the source language. The method can include labeling a source side of the aligned bi-text using the supervised sequence model and projecting labels from the labeled source side to a target side of the aligned bi-text to obtain a labeled target side of the aligned bi-text. The method can include filtering the labeled target side based on a task of a natural language processing (NLP) system configured to utilize a sequence model for the target language to obtain a filtered target side of the aligned bi-text. The method can also include training the sequence model for the target language using posterior regularization with soft constraints on the filtered target side to obtain a trained sequence model for the target language.
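The label-projection step summarized in the abstract can be sketched minimally: source-side named-entity tags are copied to aligned target tokens. The function name, tag strings, and alignment representation below are illustrative assumptions, not part of the patent.

```python
def project_labels(source_tags, alignment, target_len):
    """Project source-side NE tags to target tokens via word alignments.

    source_tags: one tag per source token (e.g. "PER", "LOC", "O").
    alignment:   list of (src_idx, tgt_idx) aligned token pairs.
    target_len:  number of target tokens.
    Unaligned target tokens receive the outside tag "O".
    """
    target_tags = ["O"] * target_len
    for src_idx, tgt_idx in alignment:
        target_tags[tgt_idx] = source_tags[src_idx]
    return target_tags

# Illustrative pair: a 4-token source sentence aligned to a 4-token
# target sentence with tokens 1 and 2 swapped in order.
src_tags = ["PER", "O", "O", "LOC"]
align = [(0, 0), (1, 2), (2, 1), (3, 3)]
print(project_labels(src_tags, align, 4))  # ['PER', 'O', 'O', 'LOC']
```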
14 Claims
1. A computer-implemented method, comprising:
obtaining, at a computing device having one or more processors, (i) an aligned bi-text for a source language and a target language, the aligned bi-text comprising a plurality of source-target sentence pairs, and (ii) a supervised sequence model for the source language;
labeling, at the computing device, each word of a source side of the aligned bi-text using the supervised sequence model to obtain a labeled source side of the aligned bi-text;
projecting, at the computing device, labels from the labeled source side to a target side of the aligned bi-text to obtain a labeled target side of the aligned bi-text, wherein each label of the labeled source and target sides of the aligned bi-text is a named entity type tag for a particular word;
filtering, at the computing device, the labeled target side of the aligned bi-text for the target language to obtain a filtered target side of the aligned bi-text for training a sequence model for the target language for a named entity segmentation system, wherein the filtering comprises discarding any particular source-target sentence pair when (i) a threshold amount of tokens of the particular source-target sentence pair are unaligned or (ii) a source named entity of the particular source-target sentence pair is not aligned with a target sentence token;
training, at the computing device, the sequence model for the target language using posterior regularization with soft constraints on the filtered target side to learn a set of parameters for the target language;
obtaining, at the computing device, a trained sequence model for the target language using the set of parameters for the target language, the trained sequence model being configured to model a probability distribution over possible labels for text in the target language;
receiving, at the computing device, an input text in the target language;
analyzing, at the computing device, the input text using the trained sequence model for the target language; and
generating, at the computing device, an output based on the analyzing of the input text using the trained sequence model.
Dependent claims: 2, 3, 4, 5, 6, 7, 8.
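The filtering limb of claim 1 discards a sentence pair when too many tokens are unaligned or a source named entity has no aligned target token. The sketch below illustrates that rule; the threshold value, data shapes, and function name are assumptions for illustration and are not fixed by the claim.

```python
def keep_pair(source_tags, alignment, n_source, n_target, max_unaligned=0.3):
    """Return True if a sentence pair survives the claim-1 style filter."""
    aligned_src = {s for s, _ in alignment}
    aligned_tgt = {t for _, t in alignment}
    # (i) discard when a threshold fraction of tokens is unaligned
    unaligned = (n_source - len(aligned_src)) + (n_target - len(aligned_tgt))
    if unaligned / (n_source + n_target) >= max_unaligned:
        return False
    # (ii) discard when a source named-entity token has no aligned target token
    for i, tag in enumerate(source_tags):
        if tag != "O" and i not in aligned_src:
            return False
    return True

# A fully aligned pair is kept; a pair whose entity token is unaligned is not.
print(keep_pair(["PER", "O"], [(0, 0), (1, 1)], 2, 2))  # True
print(keep_pair(["PER", "O", "O", "O"], [(1, 0), (2, 1), (3, 2), (3, 3)], 4, 4))  # False
```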
9. A computing device comprising:
a non-transitory computer-readable medium having a set of instructions stored thereon; and
one or more processors configured to execute the set of instructions, which causes the computing device to perform operations comprising:
obtaining (i) an aligned bi-text for a source language and a target language, the aligned bi-text comprising a plurality of source-target sentence pairs, and (ii) a supervised sequence model for the source language;
labeling each word of a source side of the aligned bi-text using the supervised sequence model to obtain a labeled source side of the aligned bi-text;
projecting labels from the labeled source side to a target side of the aligned bi-text to obtain a labeled target side of the aligned bi-text, wherein each label of the labeled source and target sides of the aligned bi-text is a named entity type tag for a particular word;
filtering the labeled target side to obtain a filtered target side of the aligned bi-text for training a sequence model for the target language for a named entity segmentation system, wherein the filtering comprises discarding any particular source-target sentence pair when (i) the particular source-target sentence pair comprises a named entity having a confidence level less than a confidence threshold or (ii) the particular source-target sentence pair comprises no named entities;
training the sequence model for the target language using posterior regularization with soft constraints on the filtered target side to learn a set of parameters for the target language;
using the set of parameters for the target language, obtaining a trained sequence model for the target language, the trained sequence model being configured to model a probability distribution over possible labels for text in the target language;
receiving an input text in the target language;
analyzing the input text using the trained sequence model for the target language; and
generating an output based on the analyzing of the input text using the trained sequence model.
Dependent claims: 10, 11, 12.
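Claim 9 states a different filtering rule than claim 1: a pair is discarded when any named entity falls below a confidence threshold, or when the pair contains no named entities at all. A minimal sketch, assuming per-entity confidence scores from the source-side tagger; the function name and default threshold are illustrative assumptions.

```python
def keep_pair_by_confidence(entity_confidences, threshold=0.5):
    """Claim-9 style filter: keep a sentence pair only if it contains at
    least one named entity and every entity clears the confidence threshold.

    entity_confidences: tagger confidences, one per source named entity.
    """
    if not entity_confidences:  # (ii) pair comprises no named entities
        return False
    # (i) discard if any entity's confidence is below the threshold
    return all(c >= threshold for c in entity_confidences)

print(keep_pair_by_confidence([]))          # False: no entities
print(keep_pair_by_confidence([0.9, 0.4]))  # False: low-confidence entity
print(keep_pair_by_confidence([0.9, 0.8]))  # True
```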
13. A non-transitory, computer-readable medium having instructions stored thereon that, when executed by one or more processors of a computing device, cause the computing device to perform operations comprising:
obtaining (i) an aligned bi-text for a source language and a target language, the aligned bi-text comprising a plurality of source-target sentence pairs, and (ii) a supervised sequence model for the source language, the source language being a resource-rich language having greater than an amount of labeled training data required to train the supervised sequence model, the target language being a resource-poor language having less than an amount of labeled training data required to train the sequence model for the target language;
labeling a source side of the aligned bi-text using the supervised sequence model to obtain a labeled source side of the aligned bi-text;
projecting labels from the labeled source side to a target side of the aligned bi-text to obtain a labeled target side of the aligned bi-text, wherein every label in the labeled source and target sides of the aligned bi-text is a named entity type tag for a particular word;
filtering the labeled target side to obtain a filtered target side of the aligned bi-text for training a sequence model for the target language for a named entity segmentation system, wherein the filtering comprises discarding any particular source-target sentence pair when (i) a threshold amount of tokens of the particular source-target sentence pair are unaligned or (ii) a source named entity of the particular source-target sentence pair is not aligned with a target sentence token;
training the sequence model for the target language using posterior regularization with soft constraints on the filtered target side to learn a set of parameters for the target language;
obtaining a trained sequence model for the target language using the set of parameters for the target language, the trained sequence model being configured to model a probability distribution over possible labels for text in the target language;
receiving an input text in the target language;
analyzing the input text using the trained sequence model for the target language; and
generating an output based on the analyzing of the input text using the trained sequence model.
Dependent claims: 14.
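The training limb recited in each independent claim uses posterior regularization: the model's label posteriors are softly pushed toward constraints derived from the projected labels rather than treating projections as hard gold labels. The single-token reweighting below is a minimal illustration of that idea only, not the patent's training procedure; the closed form q(y) ∝ p(y)·exp(−strength·[y ≠ projected]), the function name, and the tag set are assumptions.

```python
import math

def constrain_posterior(p, projected_label, strength=1.0):
    """Minimal posterior-regularization sketch for one token: re-weight the
    label posterior p (dict: label -> probability) toward the projected
    label with a soft exponentiated penalty, then renormalize. With
    strength=0 the posterior is unchanged; larger strength pulls more
    probability mass onto the projected label without forcing it to 1.
    """
    q = {y: pr * math.exp(0.0 if y == projected_label else -strength)
         for y, pr in p.items()}
    z = sum(q.values())
    return {y: v / z for y, v in q.items()}

p = {"PER": 0.4, "O": 0.6}
q = constrain_posterior(p, "PER", strength=2.0)
# Probability mass shifts toward the projected tag "PER",
# but the alternative tag keeps nonzero probability (a soft constraint).
print(q["PER"] > p["PER"], q["O"] > 0.0)  # True True
```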
Specification