Text mining apparatus and associated methods

  • US 7,461,056 B2
  • Filed: 02/09/2005
  • Issued: 12/02/2008
  • Est. Priority Date: 02/09/2005
  • Status: Expired due to Fees
First Claim

1. A method of performing text mining comprising:

  • identifying consecutive word strings in unstructured text documents;

    generating a list of term candidates based on context independency values calculated based on entropy of left context and right context word strings surrounding the consecutive word strings;

    generating a list of key terms from among the list of term candidates;

    receiving a query over a user interface;

    calculating Chi-square values, wherein Chi-square values are calculated between at least some terms of the query and at least some of the key terms to identify the associated terms from among the key terms using a Chi-square expression based on count information of at least some query terms and at least some key terms in the text documents, wherein the count information includes a number of documents where both query terms and key terms appear, a number of documents where query terms appear but key terms do not appear, a number of documents where query terms appear, a number of documents where at least some query terms do not appear and key terms appear, a number of documents where neither at least some query terms nor key terms appear, a number where at least some query terms do not appear, a number where key terms appear, and a number where key terms do not appear; and

    providing content in the unstructured text documents over the user interface based on the query.
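
The two computational steps of the claim, entropy-based context independency and the Chi-square association between a query term and a key term, can be illustrated with a minimal Python sketch. It is not the patent's implementation. The claim does not give the exact formulas, so the sketch assumes Shannon entropy over the immediately adjacent words, combines the left and right entropies by taking their minimum, and uses the standard 2×2 contingency-table Chi-square statistic over the document counts the claim enumerates. The function names (`context_entropy`, `context_independency`, `chi_square`) and the representation of documents as sets of terms are hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict


def context_entropy(contexts):
    """Shannon entropy (in bits) of the words observed in one context slot."""
    counts = Counter(contexts)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def context_independency(tokens, n):
    """Score every n-word string by the entropy of its neighbouring words.

    Assumption: the score is min(left entropy, right entropy), so a string
    counts as independent only if it varies freely on both sides.
    """
    left = defaultdict(list)
    right = defaultdict(list)
    grams = set()
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        grams.add(gram)
        if i > 0:
            left[gram].append(tokens[i - 1])
        if i + n < len(tokens):
            right[gram].append(tokens[i + n])
    return {
        g: min(context_entropy(left[g]), context_entropy(right[g]))
        for g in grams
    }


def chi_square(docs, query_term, key_term):
    """Standard 2x2 Chi-square statistic from document co-occurrence counts.

    docs is an iterable of sets of terms (an assumed representation).
    a: both appear, b: query only, c: key only, d: neither appears.
    """
    a = b = c = d = 0
    for doc in docs:
        has_query = query_term in doc
        has_key = key_term in doc
        if has_query and has_key:
            a += 1
        elif has_query:
            b += 1
        elif has_key:
            c += 1
        else:
            d += 1
    total = a + b + c + d
    # Marginals: query appears, query absent, key appears, key absent.
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # a marginal is empty, so no association can be measured
    return total * (a * d - b * c) ** 2 / denominator
```

Under these assumptions, a word string that is always followed by the same word scores zero context independency, which suggests it is a fragment of a longer term. A key term that co-occurs with the query term far more or far less often than chance would predict receives a high Chi-square value.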
