Text mining apparatus and associated methods
Abstract
A method for extracting key terms and associated key terms for use in text mining is provided. The method includes receiving unstructured text documents, such as emails over a customer service system. Term candidates are extracted based on identifying consecutive word strings satisfying a context independency threshold. Term candidates are weighted using mutual information to generate a list of weighted terms. The weighted terms are then recounted. Terms are associated based on Chi-square values. Associated terms can then be used for information retrieval. A user interface can be personalized with individual user profiles.
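The mutual-information weighting mentioned in the abstract can be illustrated with a small sketch. This section does not give the exact formula, so the code below assumes pointwise mutual information (PMI) over adjacent word pairs; the function name `pmi_weights` and the pre-tokenized input format are hypothetical choices made only for illustration.

```python
import math
from collections import Counter

def pmi_weights(sentences):
    """Weight adjacent word pairs by pointwise mutual information (PMI).

    `sentences` is a list of token lists. PMI is an assumed stand-in for
    the mutual-information weighting named in the abstract.
    """
    unigrams = Counter()
    bigrams = Counter()
    for words in sentences:
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    weights = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi          # probability of the pair
        p_a = unigrams[a] / n_uni    # probability of the first word
        p_b = unigrams[b] / n_uni    # probability of the second word
        weights[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return weights

sample = [["hard", "drive", "failed"], ["hard", "drive", "replaced"]]
weights = pmi_weights(sample)
```

Pairs whose words occur together more often than their individual frequencies would predict receive higher weights, which is the intuition behind treating them as term candidates.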
21 Claims
1. A method of extracting key terms from text comprising:
receiving unstructured text documents;
detecting boundaries between sentences in the unstructured text documents to generate a sentence list;
extracting key terms from the unstructured text documents to generate key terms with weightings;
re-counting the extracted key terms to generate key terms with more accurate weightings.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
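The sentence-boundary and re-counting steps of claim 1 might be sketched as follows. This section does not say how re-counting is performed; the code assumes one plausible reading, in which longer terms are matched first so that a shorter term nested inside them is not counted again. The names `split_sentences` and `recount_terms` are hypothetical, and the punctuation-based splitter is a deliberately naive placeholder.

```python
import re
from collections import Counter

def split_sentences(text):
    """Naive boundary detection: split after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def recount_terms(sentences, terms):
    """Count lowercase terms per sentence, longest terms first.

    Matched text is blanked out before shorter terms are tried, so a
    nested term is not counted inside a longer one (assumed rule).
    """
    counts = Counter()
    ordered = sorted(terms, key=lambda t: len(t.split()), reverse=True)
    for sentence in sentences:
        # Normalize to lowercase words separated by single spaces.
        remaining = " ".join(re.findall(r"\w+", sentence.lower()))
        for term in ordered:
            pattern = r"\b" + re.escape(term) + r"\b"
            hits = len(re.findall(pattern, remaining))
            if hits:
                counts[term] += hits
                remaining = re.sub(pattern, " ", remaining)
    return counts

text = "The hard drive failed. I replaced the hard drive! Now the drive works."
sentences = split_sentences(text)
counts = recount_terms(sentences, ["hard drive", "drive"])
```

In this example "drive" is counted only where it appears outside "hard drive", which is the kind of correction a re-counting pass could provide.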
11. A method of performing text mining comprising:
receiving unstructured text documents;
generating a list of key terms from the unstructured text documents with weighting and count information;
receiving a query comprising over a user interface; and
calculating Chi-square values between the query and at least some of the key terms to identify associated terms from among the key terms.
Dependent claims: 12, 13, 14
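The Chi-square step of claim 11 can be illustrated with the standard 2x2 contingency-table statistic. This is a minimal sketch under assumptions: co-occurrence is measured per text unit (a sentence or document represented as a set of terms), and the helper names `chi_square` and `associated_terms` are hypothetical rather than taken from the patent.

```python
def chi_square(units, query, term):
    """Chi-square statistic for co-occurrence of `query` and `term`.

    `units` is a list of sets of terms, one set per sentence or document.
    """
    a = b = c = d = 0
    for words in units:
        has_q = query in words
        has_t = term in words
        if has_q and has_t:
            a += 1      # both present
        elif has_q:
            b += 1      # query only
        elif has_t:
            c += 1      # term only
        else:
            d += 1      # neither
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0      # degenerate table: no evidence either way
    return n * (a * d - b * c) ** 2 / denom

def associated_terms(units, query, key_terms, top_k=3):
    """Rank key terms by their Chi-square value against the query."""
    scores = {t: chi_square(units, query, t) for t in key_terms if t != query}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

units = [{"printer", "driver"}, {"printer", "driver"}, {"monitor"}, {"monitor", "cable"}]
ranked = associated_terms(units, "printer", ["driver", "cable", "monitor"])
```

Note that the statistic is large for strong negative association as well as positive, so a practical system would likely also check that the observed co-occurrence exceeds the expected count.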
15. A computer readable medium including instructions which, when implemented, cause a computer to perform text mining, the instructions comprising:
a key term extraction module adapted to identify a list of key terms in documents of unstructured text; and
a text mining module adapted to receive a query and associate at least a portion of the query with some of the key terms based on Chi-square values to generate associated terms.
Dependent claims: 16, 17, 18, 19, 20, 21
Specification