Granular knowledge based search engine

  • US 20090119281A1
  • Filed: 11/29/2007
  • Published: 05/07/2009
  • Est. Priority Date: 11/03/2007
  • Status: Abandoned Application
First Claim

1. A system of indexing documents comprising the steps of:

    a. preprocessing documents to extract words;

    b. then extracting keywords by calculating a TFIDF for each word, wherein the step of calculating a TFIDF further comprises the substeps of;

    i. calculating a term frequency;

    ii. calculating a document frequency;

    iii. calculating a total number of documents in which a term appears at least once;

    c. then comparing the TFIDF for each word with a TFIDF predefined threshold;

    d. then finding keyword association by generating a plurality of keyword sets, wherein the step of generating a plurality of keyword sets further comprises the sub steps of;

    i. filtering keyword sets that do not meet a predefined within distance threshold; and

    ii. filtering keyword sets that do not meet a predefined support threshold, wherein the support threshold is compared to a support level which is proportional to the percentage of documents that contain the keyword set;

    e. then providing a clustering of keyword sets and building a document index having a clustering of keyword sets;

    f. then providing a search result in the form of a document cluster.
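
Steps a through d of the claim can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The function names, the tokenizer, the conventional weighting tf × ln(N / df) with N taken as the total number of documents, the restriction to keyword pairs rather than larger sets, and the reading of "within distance" as a gap measured in token positions are all assumptions made for illustration. Clustering and retrieval (steps e and f) are not covered.

```python
import math
import re
from collections import Counter
from itertools import combinations


def preprocess(text):
    """Step a (assumed tokenizer): lowercase and split into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_keywords(docs_tokens, tfidf_threshold):
    """Steps b-c: per document, keep words whose TF-IDF exceeds the threshold."""
    n_docs = len(docs_tokens)
    # Document frequency: number of documents in which each word appears at least once.
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))

    keywords = []
    for tokens in docs_tokens:
        counts = Counter(tokens)
        kept = set()
        for word, count in counts.items():
            tf = count / len(tokens)                  # term frequency
            idf = math.log(n_docs / doc_freq[word])   # assumed IDF form
            if tf * idf > tfidf_threshold:
                kept.add(word)
        keywords.append(kept)
    return keywords


def keyword_pairs(tokens, keywords, max_distance):
    """Step d.i: distinct keyword pairs occurring within max_distance token positions."""
    positions = [(i, w) for i, w in enumerate(tokens) if w in keywords]
    pairs = set()
    for (i, a), (j, b) in combinations(positions, 2):
        if a != b and j - i <= max_distance:
            pairs.add(frozenset((a, b)))
    return pairs


def frequent_pairs(docs_tokens, docs_keywords, max_distance, min_support):
    """Step d.ii: keep pairs whose support (fraction of documents) meets min_support."""
    n_docs = len(docs_tokens)
    pair_docs = Counter()
    for tokens, keywords in zip(docs_tokens, docs_keywords):
        pair_docs.update(keyword_pairs(tokens, keywords, max_distance))
    return {
        pair: n / n_docs
        for pair, n in pair_docs.items()
        if n / n_docs >= min_support
    }
```

To try it, tokenize each document with `preprocess`, pass the token lists to `extract_keywords`, then pass the token lists and keyword sets to `frequent_pairs`. The result maps each surviving keyword pair to its support level. Here support counts only documents in which the pair also satisfies the distance threshold, which is one possible reading of the claim.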

  • 0 Assignments