Weakly supervised part-of-speech tagging with coupled token and type constraints

  • US 9,311,299 B1
  • Filed: 07/31/2013
  • Issued: 04/12/2016
  • Est. Priority Date: 07/31/2013
  • Status: Active Grant
First Claim

1. A computer-implemented method comprising:

  • obtaining a word in a first language;

    selecting a first, token-level set of one or more parts-of-speech tags to associate with the word in the first language, comprising:

    identifying a translation of the word in a second language, and

    selecting, as the first, token-level set of one or more parts-of-speech tags to associate with the word in the first language, a set of one or more parts-of-speech tags that are associated with the translation of the word in the second language;

    selecting a second, token-level set of one or more parts-of-speech tags to associate with the word in the first language, comprising:

    when the word in the first language has no associated part-of-speech tag indicated for the word in the first language in a tag dictionary, selecting, as the second, token-level set of the one or more parts of speech tags, all of one or more of the parts-of-speech tags that (i) are in the first, token-level set of one or more parts-of-speech tags, and (ii) are associated as parts-of-speech tags with words in the tag dictionary, or

    when the word in the first language has one or more associated parts-of-speech tags indicated for the word in the first language in the tag dictionary, selecting, as the second, token-level set of the one or more parts-of-speech-tags, the one or more parts-of-speech tags that (I) are in the first, token-level set of one or more parts-of-speech tags, and (II) are indicated in the tag dictionary as associated with the word in the first language; and

    providing the word and the second, token-level set of the one or more parts-of-speech tags as training data for training a machine-based part-of-speech tagger.
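
The tag-selection logic recited in claim 1 can be illustrated with a short Python sketch. It is a simplified reading of the claim language, not the patented implementation: the function names (`projected_tags`, `constrained_tags`, `build_training_example`), the one-to-one `translations` mapping, and the plain-dictionary data layout are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple


def projected_tags(
    word: str,
    translations: Dict[str, str],
    second_language_tags: Dict[str, Set[str]],
) -> Set[str]:
    """First set: the tags associated with the word's translation.

    Assumes a hypothetical one-to-one word translation table.
    """
    translation = translations.get(word)
    if translation is None:
        return set()
    return set(second_language_tags.get(translation, set()))


def constrained_tags(
    word: str,
    first_set: Set[str],
    tag_dictionary: Dict[str, Set[str]],
) -> Set[str]:
    """Second set: the first set filtered by the tag dictionary."""
    entry = tag_dictionary.get(word)
    if not entry:
        # Word has no dictionary entry: keep the projected tags that
        # appear as a tag for at least one word in the dictionary.
        known_tags = set().union(*tag_dictionary.values())
        return first_set & known_tags
    # Word has an entry: keep only the projected tags that the
    # dictionary lists for this particular word.
    return first_set & entry


def build_training_example(
    word: str,
    translations: Dict[str, str],
    second_language_tags: Dict[str, Set[str]],
    tag_dictionary: Dict[str, Set[str]],
) -> Tuple[str, Set[str]]:
    """Pair the word with its second tag set, for use as training data."""
    first_set = projected_tags(word, translations, second_language_tags)
    return word, constrained_tags(word, first_set, tag_dictionary)


# Toy data, invented for this example only.
translations = {"Buch": "book", "laufen": "run"}
second_language_tags = {"book": {"NOUN", "VERB"}, "run": {"NOUN", "VERB"}}
tag_dictionary = {"Buch": {"NOUN"}, "gehen": {"VERB"}, "Haus": {"NOUN"}}

for w in ("Buch", "laufen"):
    print(build_training_example(w, translations, second_language_tags, tag_dictionary))
```

In this toy data, "Buch" has a dictionary entry, so its projected tags are intersected with that entry; "laufen" has none, so its projected tags are only checked against the tags that occur anywhere in the dictionary.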
