Text representation and method

  • US 7,003,516 B2
  • Filed: 05/15/2003
  • Issued: 02/21/2006
  • Est. Priority Date: 07/03/2002
  • Status: Expired due to Fees
First Claim

1. A computer-executed method for representing a natural-language document in a vector form suitable for text manipulation operations, comprising
(a) for each of a plurality of terms composed of non-generic words and, optionally, proximately arranged word groups in the document, determining a selectivity value calculated as the frequency of occurrence of the term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively, and
(b) representing the document as a vector of terms, where a coefficient assigned to each term is a function of the selectivity value determined for the term.
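The two steps of the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name here (`GENERIC_WORDS`, `doc_frequency`, `selectivity`, `document_vector`) is hypothetical, and so are the toy libraries. The sketch assumes that "frequency of occurrence" means the fraction of texts in a library that contain the term, that a single other-field library is used for comparison, and that the coefficient is the selectivity value itself (the simplest possible "function of the selectivity value"). Word groups, which the claim treats as optional, are left out.

```python
# Hypothetical stop list standing in for the claim's "generic" words.
GENERIC_WORDS = {"a", "an", "the", "of", "in", "and", "is", "for"}


def doc_frequency(term, library):
    """Fraction of texts in the library that contain the term (assumed measure)."""
    return sum(term in text for text in library) / len(library)


def selectivity(term, field_library, other_library, floor=0.01):
    """Step (a): frequency in one field relative to frequency in another field.

    The floor is an assumption added only to avoid division by zero.
    """
    in_field = doc_frequency(term, field_library)
    elsewhere = max(doc_frequency(term, other_library), floor)
    return in_field / elsewhere


def document_vector(document, field_library, other_library):
    """Step (b): map each non-generic term to a coefficient.

    Here the coefficient is simply the selectivity value (an assumption).
    """
    terms = {word for word in document if word not in GENERIC_WORDS}
    return {t: selectivity(t, field_library, other_library) for t in sorted(terms)}


# Toy libraries: each text is a set of lowercase words.
field_library = [
    {"laser", "beam", "the"},
    {"laser", "optical"},
    {"beam", "method"},
    {"method", "laser"},
]
other_library = [
    {"method", "protein"},
    {"method", "cell", "laser"},
    {"protein", "the"},
    {"method", "beam"},
]

vector = document_vector(["the", "laser", "method"], field_library, other_library)
for term, coefficient in vector.items():
    print(term, round(coefficient, 3))
```

In this toy data, "laser" appears in three of four field texts but only one of four other-field texts, so it receives a larger coefficient than "method", which is common in both libraries. The generic word "the" is dropped before any selectivity value is computed.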
