Taxonomy generation for electronic documents

  • US 7,243,092 B2
  • Filed: 08/29/2002
  • Issued: 07/10/2007
  • Est. Priority Date: 12/28/2001
  • Status: Active Grant
First Claim

1. A computer-implemented method comprising:

    extracting terms from a plurality of electronic documents;

    ranking the extracted terms using two or more term ranking algorithms;

    aggregating rankings of the ranked extracted terms to produce first aggregate rankings, each of the rankings resulting from one of the two or more ranking algorithms;

    selecting terms from the extracted terms, the selected terms having the first aggregate rankings above a pre-determined threshold;

    generating term pairs from the selected terms;

    ranking terms in each term pair based on a relative specificity of the selected terms using two or more term pair ranking algorithms;

    aggregating the ranks of the terms in each term pair to produce second aggregate rankings, each of the ranks resulting from the two or more term pair ranking algorithms;

    selecting term pairs having the second aggregate rankings above a pre-determined threshold;

    generating a term hierarchy from the selected term pairs;

    assigning documents to nodes of the term hierarchy based on a number of terms within a branch of the term hierarchy associated with each node that match terms extracted from each document; and

    storing assignments of the documents to the nodes to a memory for retrieval of one or more documents responsive to a search query.
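
The steps recited in the claim can be illustrated with a small sketch. This is not the patented implementation; it is one possible reading under several assumptions that the claim does not specify. The two term ranking algorithms are assumed to be total term frequency and document frequency. The two term pair ranking algorithms are assumed to be a co-occurrence (subsumption) score and a document-frequency ratio, and their scores are averaged directly rather than converted to ranks first. The "branch" of a node is assumed to be the path from that node up to its root. All function names, parameters, and default thresholds (build_taxonomy, term_threshold, pair_threshold, and so on) are hypothetical.

```python
import re
from collections import Counter
from itertools import permutations


def extract_terms(doc):
    """Assumed extractor: lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", doc.lower())


def rank_scores(values):
    """Rank-based score in (0, 1]; tied values receive the same score."""
    n = len(values)
    return {k: sum(v <= values[k] for v in values.values()) / n for k in values}


def aggregate(*score_maps):
    """Average the scores produced by each ranking algorithm."""
    return {k: sum(m[k] for m in score_maps) / len(score_maps)
            for k in score_maps[0]}


def branch(node, parent_of):
    """Assumed meaning of 'branch': the path from a node up to its root."""
    path = [node]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


def build_taxonomy(docs, term_threshold=0.5, pair_threshold=0.5):
    doc_terms = [extract_terms(d) for d in docs]
    doc_sets = [set(terms) for terms in doc_terms]

    # Two assumed term ranking algorithms: term frequency, document frequency.
    tf = Counter(t for terms in doc_terms for t in terms)
    df = Counter(t for terms in doc_sets for t in terms)
    first_aggregate = aggregate(rank_scores(tf), rank_scores(df))
    selected = sorted(t for t, s in first_aggregate.items() if s > term_threshold)

    # Score ordered (parent, child) pairs; the parent is the less specific term.
    parent_of, best = {}, {}
    for parent, child in permutations(selected, 2):
        if df[parent] <= df[child]:
            continue  # strict ordering by document frequency prevents cycles
        both = sum(1 for s in doc_sets if parent in s and child in s)
        subsumption = both / df[child]                      # P(parent | child)
        generality = df[parent] / (df[parent] + df[child])  # frequency ratio
        score = (subsumption + generality) / 2              # second aggregate
        if score > pair_threshold and score > best.get(child, 0.0):
            best[child] = score
            parent_of[child] = parent

    # Assign each document to the node whose branch matches the most terms.
    assignments = {}
    for i, terms in enumerate(doc_sets):
        def matches(node):
            return sum(1 for t in branch(node, parent_of) if t in terms)
        top = max(selected, key=matches) if selected else None
        assignments[i] = top if top is not None and matches(top) > 0 else None

    # The returned mappings stand in for "storing assignments to a memory".
    return parent_of, assignments
```

Called on a handful of short documents, build_taxonomy returns a child-to-parent mapping (the term hierarchy) and a mapping from document index to assigned node, with None for documents that match no node. Ties between nodes are broken alphabetically, which is an arbitrary choice made for this sketch.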
