Automated taxonomy generation

  • US 20060004747A1
  • Filed: 06/30/2004
  • Published: 01/05/2006
  • Est. Priority Date: 06/30/2004
  • Status: Active Grant
First Claim

1. A computer readable medium having computer-executable components comprising:

    (a) a node generator constructed to receive a list of training terms based on a set of training documents, and to generate a first sibling node comprising a first set of probabilities, and to generate a second sibling node comprising a second set of probabilities, the first set of probabilities comprising, for each term in the list of training terms, a probability of the term appearing in a document, and the second set of probabilities comprising, for each term in the list of training terms, a probability of the term appearing in a document;

    (b) a document assigner constructed to associate, based on the first and second set of probabilities, each document of the set of training documents to at least one of a group consisting of the first sibling node, the second sibling node, and a null set, the documents associated with the first sibling node forming a first document set and the documents associated with the second sibling node forming a second document set; and

    (c) a tree manager constructed to communicate at least one of the first document set and the second document set to the node generator to create a binary tree data structure comprising a hierarchy of a plurality of sibling nodes based on recursive performance of the node generator and the document assigner.
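
As an illustration only, the three components recited in claim 1 can be sketched in Python. Nothing below is taken from the patent's specification: the names `Node`, `term_probs`, `generate_siblings`, `assign`, and `build_tree` are hypothetical, and so are the seeding rule (split the document list in half), the add-one smoothing, the likelihood-ratio `margin` that sends ambiguous documents to the null set, and the `max_depth` and `min_docs` stopping rules. The claim also allows a document to be associated with more than one member of the group; this sketch assumes exclusive assignment for simplicity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

Doc = Set[str]  # a document reduced to the set of training terms it contains


@dataclass
class Node:
    probs: Dict[str, float]  # for each training term, P(term appears in a document)
    docs: List[Doc] = field(default_factory=list)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def term_probs(terms: List[str], docs: List[Doc]) -> Dict[str, float]:
    # Assumption: add-one smoothed fraction of documents containing each term.
    n = len(docs)
    return {t: (sum(t in d for d in docs) + 1) / (n + 2) for t in terms}


def generate_siblings(terms: List[str], docs: List[Doc]) -> Tuple[Node, Node]:
    # Node generator (a). Assumption: seed the two siblings from the two
    # halves of the document list; the claim does not say how they are seeded.
    half = len(docs) // 2
    return Node(term_probs(terms, docs[:half])), Node(term_probs(terms, docs[half:]))


def likelihood(doc: Doc, probs: Dict[str, float]) -> float:
    # Probability of the document's term pattern, treating terms as independent.
    p = 1.0
    for term, pt in probs.items():
        p *= pt if term in doc else 1.0 - pt
    return p


def assign(docs: List[Doc], a: Node, b: Node, margin: float = 1.5) -> List[Doc]:
    # Document assigner (b). A document goes to the sibling whose probabilities
    # explain it at least `margin` times better; otherwise it joins the null set.
    null_set: List[Doc] = []
    for d in docs:
        la, lb = likelihood(d, a.probs), likelihood(d, b.probs)
        if la >= margin * lb:
            a.docs.append(d)
        elif lb >= margin * la:
            b.docs.append(d)
        else:
            null_set.append(d)
    return null_set


def build_tree(terms: List[str], node: Node, depth: int = 0,
               max_depth: int = 3, min_docs: int = 2) -> Node:
    # Tree manager (c): hand each resulting document set back to the node
    # generator and document assigner, recursively, to grow a binary tree.
    if depth >= max_depth or len(node.docs) < min_docs:
        return node
    a, b = generate_siblings(terms, node.docs)
    assign(node.docs, a, b)
    if not a.docs or not b.docs:
        return node  # no useful split; keep this node as a leaf
    node.left = build_tree(terms, a, depth + 1, max_depth, min_docs)
    node.right = build_tree(terms, b, depth + 1, max_depth, min_docs)
    return node


# Toy usage: four documents over four training terms.
terms = ["cat", "dog", "stock", "bond"]
docs = [{"cat", "dog"}, {"cat"}, {"stock", "bond"}, {"bond"}]
root = build_tree(terms, Node(term_probs(terms, docs), list(docs)))
```

In this toy run the first split separates the two "cat" documents from the two "stock"/"bond" documents. A production system would likely replace the half-split seeding with an iterative refinement of the sibling probabilities, but the claim language quoted above does not require any particular method.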
