Textual data classification method and apparatus

  • US 6,507,829 B1
  • Filed: 01/17/2000
  • Issued: 01/14/2003
  • Est. Priority Date: 06/18/1999
  • Status: Expired due to Fees
First Claim

1. A system for assigning a natural language text to a class within a classification system, comprising:

  • inputting a natural language text to be classified;

    identifying chunks within said natural language text having at least a first rank, wherein said chunks comprise n-grams including at least one of a complete natural language word and an abbreviated natural language word;

    assigning a weight vector to identified n-grams for each of multiple classifications;

    determining a count vector for each of said identified n-grams;

    computing a scalar product of each of the count vectors and weight vectors assigned to identified n-grams for each of the multiple classifications;

    computing a sum of said scalar products for each of the multiple classifications;

    assigning the natural language text to the classification for which the highest sum of scalar products is computed;

    wherein weight vectors are represented as sparse vectors;

    wherein the weight vectors are determined by a process comprising initialization and iteration, wherein said classifications are related to a meaning of said chunks, and wherein the assigned classification is related to a meaning of the natural language text.
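
The scoring steps recited in the claim (count vectors, sparse weight vectors, per-class sums of scalar products, and selection of the highest sum) can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration: the claim does not define the components of a count vector, so a hypothetical two-component vector (occurrence count, presence flag) is used; the "first rank" test is reduced to "the n-gram has an entry in the weight table"; and the weight values are invented rather than produced by the initialization-and-iteration process the claim mentions. All function and variable names are hypothetical.

```python
from collections import Counter


def word_ngrams(text, max_n=2):
    """Return all word n-grams of length 1..max_n (assumed chunking scheme)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(words) - n + 1)]


def count_vector(count):
    """Hypothetical count vector: component 0 = occurrences, 1 = presence flag."""
    return {0: float(count), 1: 1.0}


def sparse_dot(a, b):
    """Scalar product of two sparse vectors stored as {index: value} dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    return sum(value * b[index] for index, value in a.items() if index in b)


def classify(text, weights, max_n=2):
    """Sum the scalar products per class and return the highest-scoring class."""
    counts = Counter(word_ngrams(text, max_n))
    scores = {}
    for label, table in weights.items():
        scores[label] = sum(sparse_dot(count_vector(c), table[gram])
                            for gram, c in counts.items() if gram in table)
    best = max(scores, key=scores.get)
    return best, scores


# Invented weights: class -> n-gram -> sparse weight vector.
# "mi" stands in for an abbreviated natural language word.
WEIGHTS = {
    "cardiac": {"chest pain": {0: 3.0}, "pain": {0: 0.5}, "mi": {1: 1.5}},
    "respiratory": {"cough": {0: 2.0, 1: 0.5}, "pain": {0: 0.25}},
}

label, scores = classify("chest pain and mild cough", WEIGHTS)
print(label, scores)
```

Storing each weight vector as a dictionary keeps only its nonzero components, which is one simple way to realize the sparse representation the claim requires. N-grams absent from a class's table contribute nothing to that class's sum.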
