Method for categorizing documents by multilevel feature selection and hierarchical clustering based on parts of speech tagging

  • US 20030236659A1
  • Filed: 06/20/2002
  • Published: 12/25/2003
  • Est. Priority Date: 06/20/2002
  • Status: Active Grant
First Claim

1. A method for categorizing documents comprising:

    tagging parts of speech of words comprising said documents;

    selecting a first set of features based on a first one of said parts of speech;

    grouping said documents into clusters according to their semantic affinity to said first set of features and to each other; and

    refining said clusters into a hierarchy of progressively refined clusters wherein subsequent sets of features are selected based on corresponding said parts of speech.
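
The four steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and every name in it is hypothetical: `TOY_LEXICON` stands in for a real part-of-speech tagger, and "semantic affinity" is approximated, as an assumption, by grouping each document under its most corpus-frequent feature of the current part of speech. Nouns are used for the first level and verbs for the second purely for illustration; the claim does not fix that order.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicon standing in for a real part-of-speech tagger.
TOY_LEXICON = {
    "bank": "NOUN", "loan": "NOUN", "river": "NOUN", "fish": "NOUN",
    "approves": "VERB", "rejects": "VERB", "floods": "VERB", "swim": "VERB",
}

def tag(doc):
    """Step 1: return (word, tag) pairs; unknown words are tagged 'OTHER'."""
    return [(w, TOY_LEXICON.get(w, "OTHER")) for w in doc.lower().split()]

def features(doc, pos):
    """Step 2: the set of words in doc carrying the given part of speech."""
    return {w for w, t in tag(doc) if t == pos}

def cluster(docs, pos):
    """Step 3 (assumed affinity measure): group docs by their most
    corpus-frequent feature of the given part of speech."""
    counts = Counter(w for d in docs for w in features(d, pos))
    groups = defaultdict(list)
    for d in docs:
        feats = features(d, pos)
        # Most frequent feature wins; ties break alphabetically.
        key = min(feats, key=lambda w: (-counts[w], w)) if feats else None
        groups[key].append(d)
    return dict(groups)

def hierarchy(docs, pos_order):
    """Step 4: refine each cluster using the next part of speech in pos_order."""
    if not pos_order or len(docs) <= 1:
        return docs
    return {key: hierarchy(group, pos_order[1:])
            for key, group in cluster(docs, pos_order[0]).items()}

docs = [
    "the bank approves the loan",
    "the bank rejects the loan",
    "the river floods",
    "fish swim in the river",
]
tree = hierarchy(docs, ["NOUN", "VERB"])
```

With these four toy documents, the noun level separates the "bank" documents from the "river" documents, and the verb level then splits each of those clusters further. A realistic system would replace the lexicon with a trained tagger and the grouping rule with a proper similarity measure.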

  • 3 Assignments