Automatic clustering of tokens from a corpus for grammar acquisition

  • US 6,317,707 B1
  • Filed: 12/07/1998
  • Issued: 11/13/2001
  • Est. Priority Date: 12/07/1998
  • Status: Expired due to Term
First Claim

1. A grammar learning method from a corpus, comprising:

  • identifying context tokens within the corpus,
  • for each non-context token in the corpus, counting occurrences of predetermined relationships of the non-context token to a context token,
  • generating frequency vectors for each non-context token based upon the counted occurrences, and
  • clustering non-context tokens based upon the frequency vectors,
  • whereby the clusters of non-context tokens form a grammatical model of the corpus.
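
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not say how context tokens are identified, which relationships are counted, or which clustering procedure is used. The sketch assumes that context tokens are simply the most frequent tokens, that the "predetermined relationships" are immediate left and right adjacency, and that clustering is single-link grouping under a Euclidean distance threshold. The function names, the toy corpus, and the threshold value are all hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def context_tokens(corpus, n_context):
    """Assumption: treat the n_context most frequent tokens as context tokens."""
    counts = Counter(tok for sentence in corpus for tok in sentence)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    return sorted(counts, key=lambda t: (-counts[t], t))[:n_context]


def frequency_vectors(corpus, context, offsets=(-1, 1)):
    """Count, per non-context token, how often each context token appears at each offset."""
    dims = [(c, o) for c in context for o in offsets]
    index = {d: i for i, d in enumerate(dims)}
    ctx = set(context)
    vectors = {}
    for sentence in corpus:
        for pos, tok in enumerate(sentence):
            if tok in ctx:
                continue
            vec = vectors.setdefault(tok, [0] * len(dims))
            for o in offsets:
                j = pos + o
                if 0 <= j < len(sentence) and sentence[j] in ctx:
                    vec[index[(sentence[j], o)]] += 1
    return vectors


def normalise(vec):
    """Convert raw counts to relative frequencies (all-zero vectors are left as-is)."""
    total = sum(vec)
    return [v / total for v in vec] if total else list(vec)


def cluster(vectors, threshold):
    """Assumption: single-link grouping of tokens whose vectors are closer than threshold."""
    tokens = sorted(vectors)
    norm = {t: normalise(vectors[t]) for t in tokens}
    parent = {t: t for t in tokens}

    def find(t):
        # Union-find lookup with path halving.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, b in combinations(tokens, 2):
        if math.dist(norm[a], norm[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for t in tokens:
        groups.setdefault(find(t), []).append(t)
    return sorted(groups.values())


# Hypothetical toy corpus, already split into tokens.
corpus = [
    "fly to boston from denver on monday".split(),
    "fly to denver from dallas on friday".split(),
    "fly to dallas from boston on sunday".split(),
]

context = context_tokens(corpus, n_context=4)   # ['fly', 'from', 'on', 'to']
vectors = frequency_vectors(corpus, context)
clusters = cluster(vectors, threshold=0.5)
print(clusters)
# [['boston', 'dallas', 'denver'], ['friday', 'monday', 'sunday']]
```

On this toy corpus the city names share one adjacency profile and the day names share another, so the two groups fall out as separate clusters. That is the sense in which clusters of non-context tokens can act as word classes in a grammatical model.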

  • 5 Assignments