Algorithm for fast disk based text mining

  • US 7,246,117 B2
  • Filed: 03/31/2004
  • Issued: 07/17/2007
  • Est. Priority Date: 03/31/2004
  • Status: Active Grant
First Claim

1. A method of executing a query on a computer for at least one document similar to a specified document, the method comprising:

  • receiving the query;

    forming a reduced query document based on ranks of terms in the specified document, the forming comprising;

      calculating a rank of at least one term in the specified query document,

      calculating a square of each rank,

      calculating a normalized rank for each square,

      sorting a list of said normalized ranks,

      calculating a partial sum for each normalized rank in the list of normalized ranks, and

      including, in the reduced query document, terms corresponding to a partial sum above a threshold value;

    generating a modified query based on the query and the reduced query document;

    executing the modified query on a data repository to generate a set of results; and

    providing a result from said generated set of results to a user interface.
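
The "forming a reduced query document" steps of the claim can be illustrated with a short sketch. This is one possible reading, not the patented implementation: the claim does not define how a term's rank is computed, how the squares are normalized, or the sort direction. The sketch assumes that rank is the raw term frequency, that each square is divided by the sum of all squares, and that the list is sorted in ascending order so that the terms whose partial sum exceeds the threshold are the highest-ranked ones. The function names `reduce_query_document` and `build_modified_query`, and the default threshold, are hypothetical.

```python
from collections import Counter


def reduce_query_document(text, threshold=0.2):
    """Return the terms kept in the reduced query document (illustrative only)."""
    # Assumption: a term's "rank" is its raw frequency in the document.
    ranks = Counter(text.lower().split())
    if not ranks:
        return []

    # Square each rank.
    squares = {term: rank * rank for term, rank in ranks.items()}

    # Assumption: normalize each square by the sum of all squares,
    # so the normalized ranks add up to 1.
    total = sum(squares.values())
    normalized = {term: sq / total for term, sq in squares.items()}

    # Assumption: sort ascending (ties broken alphabetically), so the
    # low-weight terms are consumed first by the running sum.
    ordered = sorted(normalized.items(), key=lambda kv: (kv[1], kv[0]))

    # Keep every term whose partial (running) sum is above the threshold.
    kept = []
    partial = 0.0
    for term, weight in ordered:
        partial += weight
        if partial > threshold:
            kept.append(term)
    return kept


def build_modified_query(query, reduced_terms):
    # Assumption: the modified query simply appends the retained terms.
    return " ".join([query, *reduced_terms]).strip()


doc = "disk disk disk disk index index text mining"
terms = reduce_query_document(doc, threshold=0.2)
print(terms)
print(build_modified_query("text mining", terms))
```

With this reading, a higher threshold discards more of the low-weight terms, so the modified query sent to the data repository contains fewer terms than the full specified document.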
