Automated taxonomy generation

  • US 7,266,548 B2
  • Filed: 06/30/2004
  • Issued: 09/04/2007
  • Est. Priority Date: 06/30/2004
  • Status: Expired due to Fees
First Claim

1. A computer readable medium having computer-executable components comprising:

  • (a) a node generator constructed to receive a list of training terms based on a set of training documents, and to generate a first sibling node comprising a first set of probabilities, and to generate a second sibling node comprising a second set of probabilities, the first set of probabilities comprising, for each term in the list of training terms, a probability of the term appearing in a document, and the second set of probabilities comprising, for each term in the list of training terms, a probability of the term appearing in a document, wherein the first sibling node and the second sibling node are generated by dividing from a parent node;

  • (b) a document assigner constructed to associate, based on the first and second set of probabilities, each document of the set of training documents to at least one of a group consisting of the first sibling node, the second sibling node, and a null set, the documents associated with the first sibling node forming a first document set and the documents associated with the second sibling node forming a second document set; and

  • (c) a tree manager constructed to communicate at least one of the first document set and the second document set to the node generator to create a binary tree data structure comprising a hierarchy of a plurality of sibling nodes based on recursive performance of the node generator and the document assigner, and

  • (d) a document sorter constructed to associate a new document to at least one node of the plurality of sibling nodes based on respective sets of probabilities associated with the nodes, and

wherein the tree manager stores the binary tree data structure for access by the document sorter.
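The four components recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not say how the two sets of probabilities are derived, so the sketch assumes a simple seed-and-refine heuristic with Laplace smoothing and a Bernoulli log-likelihood score. All names (`Node`, `estimate`, `generate_siblings`, `assign`, `build_tree`, `sort_document`) and the `margin`, `rounds`, and `min_docs` parameters are hypothetical, and the six training documents are made up for illustration.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    probs: dict                                   # term -> P(term appears in a document)
    docs: list = field(default_factory=list)      # training documents at this node
    children: list = field(default_factory=list)  # empty, or two sibling nodes

def estimate(terms, docs):
    # Laplace-smoothed probability of each term appearing in a document (assumption).
    n = len(docs)
    return {t: (sum(t in d for d in docs) + 1) / (n + 2) for t in terms}

def log_likelihood(doc, probs):
    # Bernoulli score: each term is either present in the document or absent.
    return sum(math.log(p if t in doc else 1 - p) for t, p in probs.items())

def assign(docs, left, right, margin=0.0):
    # Document assigner: first sibling, second sibling, or null set
    # (a document whose two scores differ by no more than margin is dropped).
    first, second = [], []
    for d in docs:
        diff = log_likelihood(d, left) - log_likelihood(d, right)
        if diff > margin:
            first.append(d)
        elif diff < -margin:
            second.append(d)
    return first, second

def generate_siblings(terms, docs, rounds=5):
    # Node generator: seed from two dissimilar documents, then refine (assumed heuristic).
    seed_a = docs[0]
    seed_b = max(docs, key=lambda d: len(d ^ seed_a))
    left, right = estimate(terms, [seed_a]), estimate(terms, [seed_b])
    for _ in range(rounds):
        first, second = assign(docs, left, right)
        if not first or not second:
            break
        left, right = estimate(terms, first), estimate(terms, second)
    return left, right

def build_tree(terms, docs, probs=None, min_docs=4):
    # Tree manager: recursively feeds each document set back to the node generator.
    node = Node(probs if probs is not None else estimate(terms, docs), docs)
    if len(docs) >= min_docs:
        left, right = generate_siblings(terms, docs)
        first, second = assign(docs, left, right)
        if first and second:
            node.children = [build_tree(terms, first, left, min_docs),
                             build_tree(terms, second, right, min_docs)]
    return node

def sort_document(root, doc):
    # Document sorter: descend to the sibling whose probabilities fit the document best.
    node = root
    while node.children:
        node = max(node.children, key=lambda c: log_likelihood(doc, c.probs))
    return node

# Made-up training documents, each reduced to the set of training terms it contains.
docs = [frozenset(s) for s in (
    {"cat", "dog"}, {"cat", "pet"}, {"dog", "pet"},
    {"stock", "bond"}, {"stock", "market"}, {"bond", "market"},
)]
terms = sorted(set().union(*docs))
root = build_tree(terms, docs)
leaf = sort_document(root, frozenset({"cat", "pet"}))
```

In this sketch the stored tree (`root`) is the only state the sorter needs, which mirrors the claim's requirement that the tree manager store the binary tree for access by the document sorter. Raising `margin` sends ambiguous documents to the null set instead of forcing them into a sibling.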
