
Method and system for mining a document containing dirty text

  • US 6,978,275 B2
  • Filed: 08/31/2001
  • Issued: 12/20/2005
  • Est. Priority Date: 08/31/2001
  • Status: Expired due to Fees
First Claim

1. A computer-implemented method for mining a document containing dirty text comprising:

  • removing an instance of dirty text within said document to produce a cleaned document having a content; and

  • performing a data mining operation on said cleaned document thereby deriving relevant information from said cleaned document and providing a summary of the content of said document, and scoring and ranking each sentence of said document, wherein said removing further comprises removing an instance of computer code from said document, and removing a table from said document.
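The two steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not say how computer code or tables are detected, or how sentences are scored, so the regular expressions and the word-frequency scoring below are assumptions chosen for illustration, and every function name (`remove_dirty_text`, `split_sentences`, `score_sentences`, `summarize`) is hypothetical.

```python
import re
from collections import Counter

# Assumed heuristics (not from the patent): a line is treated as code if it
# ends in '{', '}' or ';', or starts like an include, def, or import statement.
CODE_LINE = re.compile(r"[{};]\s*$|^\s*(?:#include|def\s+\w+\(|import\s+\w+)")
# Assumed heuristic: a line is treated as a table row if it contains
# at least two pipe characters or at least two tab characters.
TABLE_LINE = re.compile(r"\|.*\||\t.*\t")


def remove_dirty_text(document):
    """Step 1: drop lines that look like computer code or table rows."""
    kept = [
        line
        for line in document.splitlines()
        if not CODE_LINE.search(line) and not TABLE_LINE.search(line)
    ]
    return "\n".join(kept)


def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?' (a rough approximation)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def _words(sentence):
    return re.findall(r"[a-z]+", sentence.lower())


def score_sentences(sentences):
    """Step 2: score each sentence by the mean document frequency of its
    words, then rank from highest to lowest score."""
    freq = Counter(w for s in sentences for w in _words(s))
    scored = []
    for s in sentences:
        tokens = _words(s)
        score = sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0
        scored.append((score, s))
    # sorted() is stable, so equally scored sentences keep document order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def summarize(document, top_n=2):
    """Clean the document, rank its sentences, and return the top_n as a summary."""
    cleaned = remove_dirty_text(document)
    ranked = score_sentences(split_sentences(cleaned))
    return [sentence for _, sentence in ranked[:top_n]]
```

Calling `summarize(text, top_n=2)` on a message that mixes prose with a pasted code snippet and a pipe-delimited table would return the two highest-ranked prose sentences. Frequency-based scoring is only one possible choice; any per-sentence scoring function could be substituted in `score_sentences`.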
