Method for domain identification of documents in a document database

  • US 7,814,105 B2
  • Filed: 05/05/2006
  • Issued: 10/12/2010
  • Est. Priority Date: 10/27/2004
  • Status: Expired due to Fees
First Claim

1. A method for processing a plurality of documents in a document database using a computer-implemented system comprising a processor and a display operatively coupled to the processor, the method comprising:

  • operating the processor to perform the following without requiring pre-computations

    a) determining vocabulary words for each document of the plurality thereof;

    b) determining a respective relevancy for each vocabulary word based upon occurrences thereof in the plurality of documents;

    c) determining similarities and differences between the plurality of documents based upon the vocabulary words and their respective relevancies;

    d) defining supersets of vocabulary words based on the determined similarities and differences; and

    e) determining domain identifications for the supersets of vocabulary words using results of a) through d) and only after a) through d); and

  • operating the display to display the defined supersets of vocabulary words and their domain identifications, including display of the vocabulary words as being relevant and irrelevant based on occurrences thereof in the plurality of documents.
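
The sequence of steps a) through e) can be illustrated with a small Python sketch. The claim does not specify how relevancy, similarity, or domain labels are computed, so everything concrete here is an assumption made for illustration: an IDF-style relevancy score, a relevancy-weighted Jaccard similarity, a greedy grouping threshold of 0.15, and labels taken from the most frequent relevant words. The function names (`vocabulary`, `relevancies`, `similarity`, `supersets`, `identify_domains`) are hypothetical and are not taken from the patent.

```python
import math
import re
from collections import Counter


def vocabulary(text):
    """Step a): lowercase word tokens for one document."""
    return re.findall(r"[a-z]+", text.lower())


def relevancies(docs_tokens):
    """Step b): score each word from its occurrences across all documents.

    Assumption: an IDF-style score, log(N / document_frequency). A word that
    appears in every document scores 0.0 and is treated as irrelevant.
    """
    n = len(docs_tokens)
    doc_freq = Counter(w for tokens in docs_tokens for w in set(tokens))
    return {w: math.log(n / df) for w, df in doc_freq.items()}


def similarity(tokens_a, tokens_b, relevancy):
    """Step c): relevancy-weighted Jaccard overlap (an assumed measure)."""
    a, b = set(tokens_a), set(tokens_b)
    shared = sum(relevancy[w] for w in a & b)
    total = sum(relevancy[w] for w in a | b)
    return shared / total if total else 0.0


def supersets(docs_tokens, relevancy, threshold=0.15):
    """Step d): greedily group similar documents and union their relevant words."""
    groups = []  # each entry: (document indices, superset of relevant words)
    for i, tokens in enumerate(docs_tokens):
        words = {w for w in tokens if relevancy[w] > 0.0}
        for indices, group_words in groups:
            if any(similarity(tokens, docs_tokens[j], relevancy) >= threshold
                   for j in indices):
                indices.append(i)
                group_words |= words  # in-place union keeps the tuple's set updated
                break
        else:
            groups.append(([i], words))
    return groups


def identify_domains(docs, top_k=2):
    """Step e): label each superset only after steps a) through d) have run."""
    docs_tokens = [vocabulary(d) for d in docs]          # a)
    relevancy = relevancies(docs_tokens)                 # b)
    groups = supersets(docs_tokens, relevancy)           # c) and d)
    results = []
    for indices, words in groups:
        counts = Counter(w for j in indices for w in docs_tokens[j] if w in words)
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        results.append({
            "domain": [w for w, _ in ranked[:top_k]],    # assumed labeling rule
            "documents": indices,
            "superset": sorted(words),
        })
    irrelevant = sorted(w for w, r in relevancy.items() if r == 0.0)
    return results, irrelevant


if __name__ == "__main__":
    sample = [
        "the car engine drives the piston",
        "the car engine needs the fuel",
        "the court judge hears the law",
        "the judge applies the law",
    ]
    found, irrelevant_words = identify_domains(sample)
    for entry in found:
        print(entry["domain"], entry["documents"], entry["superset"])
    print("irrelevant:", irrelevant_words)
```

The final `print` calls stand in for the claimed display step: each superset is shown with its domain label, and the words scored as irrelevant are listed separately. Nothing is computed ahead of time; all scores are derived from the document set passed in, which is one possible reading of "without requiring pre-computations".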
