Text-searching code, system and method

  • US 20040054520A1
  • Filed: 07/01/2003
  • Published: 03/18/2004
  • Est. Priority Date: 07/05/2002
  • Status: Abandoned Application
First Claim

1. A computer-executed method for matching a target document in the form of a digitally encoded natural-language text with a plurality of sample texts, comprising the steps of:

  • (a) for each of a plurality of terms selected from one of (i) non-generic words in the document, (ii) proximately arranged word groups in the document, and (iii) a combination of (i) and (ii), determining a selectivity value calculated as the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively, and
  • (b) representing the document as a vector of terms, where the coefficient assigned to each term is a function of the selectivity value determined for that term,
  • (c) determining for each of a plurality of sample texts, a match score related to the number of terms present in or derived from that text that match those in the target document, and
  • (d) selecting one or more of the sample texts having the highest match scores.
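
Steps (a) through (d) of the claim can be illustrated with a minimal Python sketch. It is not the application's implementation: the stop-word list `GENERIC`, the use of adjacent word pairs as "proximately arranged word groups", the `floor` value that avoids division by zero, taking the largest frequency across the other libraries, and the coefficient-weighted match score are all assumptions made for illustration, as are all function names.

```python
import re
from collections import Counter

# Assumed stop-word list standing in for "generic" words (hypothetical).
GENERIC = {"the", "a", "an", "of", "in", "and", "to", "is", "for", "with"}


def terms(text):
    """Non-generic words plus adjacent word pairs (assumed form of word groups)."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in GENERIC]
    pairs = [" ".join(p) for p in zip(words, words[1:])]
    return words + pairs


def term_frequency(library):
    """Relative frequency of each term across all texts in one library."""
    counts = Counter()
    for text in library:
        counts.update(terms(text))
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def selectivity(term, field_freq, other_freqs, floor=1e-6):
    """Step (a): in-field frequency relative to frequency in other fields.

    Assumption: compare against the largest frequency among the other
    libraries, and floor the denominator so unseen terms do not divide by zero.
    """
    in_field = field_freq.get(term, 0.0)
    elsewhere = max((f.get(term, 0.0) for f in other_freqs), default=0.0)
    return in_field / max(elsewhere, floor)


def document_vector(text, field_freq, other_freqs):
    """Step (b): map each term to a coefficient (here, the selectivity itself)."""
    return {t: selectivity(t, field_freq, other_freqs) for t in set(terms(text))}


def match_score(vector, sample_text):
    """Step (c): sum the coefficients of target terms found in the sample.

    Assumption: a weighted sum; the claim only requires a score related to
    the number of matching terms.
    """
    present = set(terms(sample_text))
    return sum(coef for t, coef in vector.items() if t in present)


def best_matches(target, samples, field_freq, other_freqs, top_n=1):
    """Step (d): return the top_n sample texts with the highest match scores."""
    vector = document_vector(target, field_freq, other_freqs)
    ranked = sorted(samples, key=lambda s: match_score(vector, s), reverse=True)
    return ranked[:top_n]
```

With this weighting, a term that is common in the target field but rare elsewhere dominates the score, while a term that is equally common in both fields contributes a coefficient near 1.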
