System and method for automatically classifying text

  • US 20020022956A1
  • Filed: 05/25/2001
  • Published: 02/21/2002
  • Est. Priority Date: 05/25/2000
  • Status: Active Grant
First Claim

1. In a system comprising a plurality of perspectives and a plurality of categories, wherein at least one category is associated with a perspective to reflect associations among related categories, a method for simultaneously classifying at least one document into a plurality of categories, said method comprising:

    associating a plurality of category features with each said category, wherein each of said category features represents one of a plurality of tokens;

    producing a category vector for each of said plurality of categories, wherein each category vector includes said plurality of category features with a weight corresponding to each category feature, said weight indicative of a degree of association between said category feature and said category;

    associating a plurality of document features with each said document, wherein each of said document features represents one of a plurality of tokens found in said document;

    producing a feature vector for each said document, wherein each feature vector includes said plurality of document features with a count corresponding to each document feature, said count indicative of the number of times said document feature appears in said document;

    multiplying said category vector by said document vector, in accordance with the mathematical convention of multiplication of a vector by a vector, to produce a plurality of category scores for each document; and

    for each perspective, classifying a document into a category provided said category score exceeds a predetermined threshold.
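
The steps recited in the claim can be illustrated with a minimal sketch. It is not the patented implementation: the perspective names, category names, weights, threshold value, and whitespace tokenizer below are all hypothetical, and the "multiplication of a vector by a vector" is assumed here to mean an ordinary dot product over shared tokens. The sketch also assumes that every category whose score exceeds the threshold is reported within its perspective.

```python
from collections import Counter

# Hypothetical perspectives, categories, and weights, invented for illustration.
PERSPECTIVES = {
    "topic": ["sports", "finance"],
    "tone": ["positive"],
}

# Category vectors: token -> weight (degree of association with the category).
CATEGORY_VECTORS = {
    "sports": {"game": 2.0, "team": 1.5, "score": 1.0},
    "finance": {"stock": 2.0, "market": 1.5, "score": 0.5},
    "positive": {"win": 1.0, "great": 2.0},
}


def document_vector(text):
    """Feature vector: token -> number of occurrences (naive whitespace split)."""
    return Counter(text.lower().split())


def category_score(category_vec, doc_vec):
    """Dot product of a category weight vector and a document count vector."""
    return sum(weight * doc_vec.get(token, 0)
               for token, weight in category_vec.items())


def classify(text, threshold=2.0):
    """For each perspective, list the categories whose score exceeds the threshold."""
    doc_vec = document_vector(text)
    result = {}
    for perspective, categories in PERSPECTIVES.items():
        result[perspective] = [
            category for category in categories
            if category_score(CATEGORY_VECTORS[category], doc_vec) > threshold
        ]
    return result


labels = classify("great game the team will win the game")
print(labels)
```

Because each perspective is evaluated independently against the same document vector, a single pass can place the document in categories under several perspectives at once, which is one reading of the "simultaneously classifying" language in the claim.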
