System and method for document categorization

  • US 7,496,567 B1
  • Filed: 09/28/2005
  • Issued: 02/24/2009
  • Est. Priority Date: 10/01/2004
  • Status: Expired due to Fees
First Claim

1. A document categorization method implemented by a computer processor for creating associations between one or more documents with a predefined topic, wherein each said predefined topic comprising:

  • a topic name, a topic threshold and a topic query, wherein said topic threshold comprising a percentage value ranging from zero to one hundred, and wherein said topic query comprising one or more terms, each of said terms comprising;

    a word or a phrase, logical and grouping operators that define relationships between said terms, said method comprising;

    identifying matching documents using said topic query;

    screening said matching documents to produce document-topic associations for each of said documents that also match said topic threshold and said topic query, said screening comprising;

    computing a score for each of said matching documents, wherein said score value equals the similarity of each of said document with said topic;

    sorting said matching documents in order of said computed score; and

    selecting a subset of said matching documents wherein said subset being defined by said topic threshold, wherein said method implements a bimodal classifier that produces document-topic associations that are consistently accurate in terms of precision and recall over document collections that change over time in size and composition.
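
The steps recited in the claim (match, score, sort, select) can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions made only for illustration: the `Topic` class and the function names are hypothetical; the topic query is encoded as an OR of AND-groups of terms; the similarity score is a placeholder (the fraction of distinct query terms found in the document), since the claim does not say how similarity is computed; and the topic threshold is read as "keep the top N percent of the sorted matches", which is one possible reading of the claim language.

```python
import math
from dataclasses import dataclass


@dataclass
class Topic:
    name: str
    threshold: float        # percentage value from 0 to 100
    query: list[list[str]]  # assumed encoding: OR of AND-groups of words/phrases


def matches(text: str, query: list[list[str]]) -> bool:
    """True if every term of at least one AND-group occurs in the text."""
    lowered = text.lower()
    return any(all(term.lower() in lowered for term in group) for group in query)


def score(text: str, query: list[list[str]]) -> float:
    """Placeholder similarity (assumption): share of distinct query terms present."""
    lowered = text.lower()
    terms = {term.lower() for group in query for term in group}
    if not terms:
        return 0.0
    return sum(term in lowered for term in terms) / len(terms)


def categorize(docs: dict[str, str], topic: Topic) -> list[str]:
    """Return ids of documents associated with the topic."""
    # Step 1: identify matching documents using the topic query.
    matching = [doc_id for doc_id, text in docs.items() if matches(text, topic.query)]
    # Steps 2 and 3: compute a score per match and sort by it, best first.
    ranked = sorted(matching, key=lambda d: score(docs[d], topic.query), reverse=True)
    # Step 4: keep the subset defined by the threshold (assumed: top N percent).
    keep = math.ceil(len(ranked) * topic.threshold / 100)
    return ranked[:keep]


docs = {
    "a": "Machine learning with neural networks",
    "b": "Machine learning basics",
    "c": "Cooking recipes",
    "d": "Neural networks explained",
}
topic = Topic(name="ML", threshold=50, query=[["machine learning"], ["neural networks"]])
selected = categorize(docs, topic)
```

Under these assumptions, three of the four sample documents match the query and the 50 percent threshold keeps the two highest-ranked ones. A score cutoff (keep documents whose similarity is at least the threshold percentage) would be an equally simple alternative reading.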
