
Taxonomy discovery

  • US 20070156665A1
  • Filed: 07/06/2004
  • Published: 07/05/2007
  • Est. Priority Date: 12/05/2001
  • Status: Abandoned Application
First Claim

1. A computer-based method for generating a taxonomy of a collection of documents, comprising:

  • generating a term-by-document matrix for the collection of documents;

  • generating a vector for each document in the collection of documents based on the term-by-document matrix;

  • identifying document clusters based on similarity comparisons between pairs of the vectors;

  • identifying labels for the document clusters based on generalized entities included in documents of the document clusters; and

  • storing the labels in an electronic format accessible to a user.
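
The five steps of the claim can be illustrated with a minimal sketch. It is not the applicant's implementation, and the claim does not specify any of the choices made here. The sketch assumes whitespace tokenization, raw term counts, cosine similarity, and a single-link grouping with an arbitrary threshold. The claim's "generalized entities" are not defined in this excerpt, so the sketch substitutes the most frequent terms in each cluster as a stand-in. All function names (`term_by_document_matrix`, `cluster`, `label_cluster`, `build_taxonomy`, `save_taxonomy`) are hypothetical.

```python
import json
import math
from collections import Counter


def term_by_document_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts term i in document j."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    counts = [Counter(tokens) for tokens in tokenized]
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix


def document_vectors(matrix):
    """One vector per document: the columns of the term-by-document matrix."""
    return [list(column) for column in zip(*matrix)]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(vectors, threshold=0.3):
    """Group documents whose pairwise similarity meets the (assumed) threshold."""
    parent = list(range(len(vectors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair of vectors and merge the groups of similar pairs.
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def label_cluster(doc_ids, vocab, vectors, top_n=2):
    """Stand-in labeling: the most frequent terms across the cluster."""
    totals = [sum(vectors[d][t] for d in doc_ids) for t in range(len(vocab))]
    ranked = sorted(range(len(vocab)), key=lambda t: (-totals[t], vocab[t]))
    return [vocab[t] for t in ranked[:top_n]]


def build_taxonomy(docs, threshold=0.3):
    vocab, matrix = term_by_document_matrix(docs)
    vectors = document_vectors(matrix)
    return [
        {"documents": ids, "labels": label_cluster(ids, vocab, vectors)}
        for ids in cluster(vectors, threshold)
    ]


def save_taxonomy(taxonomy, path):
    """Store the labels in a user-readable electronic format (JSON here)."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(taxonomy, handle, indent=2)


docs = ["cat dog cat", "dog cat", "stock market stock", "market stock bond"]
taxonomy = build_taxonomy(docs)
```

With the four toy documents above, the first two share the terms "cat" and "dog" and the last two share "stock" and "market", so the sketch groups them into two clusters. Calling `save_taxonomy(taxonomy, "taxonomy.json")` would write the result to disk; the file name is only an example.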

  • 8 Assignments