Automatic labeling of unlabeled text data

  • US 6,697,998 B1
  • Filed: 06/12/2000
  • Issued: 02/24/2004
  • Est. Priority Date: 06/12/2000
  • Status: Expired due to Term
First Claim

1. A method of automated labeling of unlabeled text data comprising the steps of:

  • establishing a document collection as a reference answer set;

  • converting members of the answer set to vectors representing centroids of unknown groups of unlabeled text data;

  • clustering unlabeled text data relative to said centroids by a nearest neighbor algorithm;

  • assigning an ID to each said centroid; and

  • labeling each of the unlabeled text data documents with said ID of the answer in the cluster to which the unlabeled text data document has been assigned by said clustering step.
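
The steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the bag-of-words vectors, the cosine similarity measure, the function names (`to_vector`, `cosine`, `label_documents`), and the sample answers and documents are all assumptions chosen for illustration, since the claim does not specify a vector representation or a distance measure.

```python
import math
from collections import Counter


def to_vector(text):
    """Turn text into a term-count vector (assumed bag-of-words representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def label_documents(answer_set, unlabeled):
    """Label each unlabeled document with the ID of its nearest answer centroid."""
    # Each reference answer becomes one centroid vector, keyed by its ID.
    centroids = {answer_id: to_vector(text) for answer_id, text in answer_set.items()}
    labels = []
    for doc in unlabeled:
        vec = to_vector(doc)
        # Nearest-neighbor step: pick the centroid most similar to the document.
        best_id = max(centroids, key=lambda cid: cosine(vec, centroids[cid]))
        labels.append(best_id)
    return labels


# Hypothetical reference answer set and unlabeled documents.
answers = {
    "A1": "reset your password from the login page",
    "A2": "install the printer driver before connecting the printer",
}
docs = [
    "how do i reset my password",
    "the printer driver will not install",
]
print(label_documents(answers, docs))  # ['A1', 'A2']
```

In this sketch the centroids stay fixed at the answer vectors, so each document is assigned in a single pass rather than through iterative re-clustering.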
