Method and apparatus for incorporating metadata in data clustering

  • US 7,809,718 B2
  • Filed: 01/15/2008
  • Issued: 10/05/2010
  • Est. Priority Date: 01/29/2007
  • Status: Expired due to Fees
First Claim

1. A method of clustering a plurality of documents from a data stream comprising:

  • identifying, by a processor, metadata in the plurality of documents;

  • emphasizing, by the processor, one or more words corresponding to the metadata;

  • generating, by the processor, a single feature vector for each of the plurality of documents based at least in part on the emphasized words by determining a numerical value for each word in one or more of the plurality of documents by determining a Term Frequency Inverse Document Frequency (TFIDF);
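
As a rough illustration only, the steps recited above (identify metadata, emphasize the corresponding words, build one TFIDF feature vector per document) might be sketched in Python as follows. The claim excerpt does not say how documents are represented or how emphasis is applied, so everything specific here is an assumption: the `metadata` and `text` dictionary fields, the `METADATA_WEIGHT` multiplier, the whitespace tokenizer, and the plain `tf * log(N / df)` weighting are hypothetical choices, not the patented implementation.

```python
import math
from collections import Counter

# Hypothetical emphasis factor: each metadata word is counted this many
# extra times. The claim does not specify how emphasis is applied.
METADATA_WEIGHT = 3.0


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def weighted_term_counts(doc):
    """Count body words once each and add extra weight for metadata words.

    `doc` is assumed to be a dict with "metadata" and "text" string fields.
    """
    counts = Counter()
    for word in tokenize(doc["text"]):
        counts[word] += 1.0
    for word in tokenize(doc["metadata"]):
        counts[word] += METADATA_WEIGHT
    return counts


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) with one TFIDF vector per document."""
    counts = [weighted_term_counts(d) for d in docs]
    vocab = sorted(set().union(*counts))
    n_docs = len(docs)
    # Document frequency: number of documents containing each word.
    df = {w: sum(1 for c in counts if w in c) for w in vocab}
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append([
            (c[w] / total) * math.log(n_docs / df[w]) if w in c else 0.0
            for w in vocab
        ])
    return vocab, vectors
```

Calling `tfidf_vectors` on a list of such dictionaries yields one vector per document over a shared vocabulary. With the emphasis applied, a metadata word can outweigh a rarer body word that it would otherwise rank below.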

  • 3 Assignments