SYSTEM AND METHOD FOR TEXT MINING
First Claim
1. A system for facilitating the text mining of a plurality of research documents by a user, the plurality of research documents carrying a non-uniform cost for access by the user, the system comprising:
(a) a content repository adapted to store the plurality of research documents, the content repository being adapted to receive a query from the user to select a primary collection of the plurality of research documents for text mining, the content repository providing content spread metrics relating to the research documents in the primary collection that enables the user to optionally modify the query to yield a final collection of the plurality of research documents that is optimized for the user; and
(b) a text mining processor for text mining the final collection of research documents to produce a derived text mining data set.
Abstract
A multi-user system for text mining a large population of research documents in an efficient and cost-effective fashion includes a content repository, a text mining processor, and a derived data repository that are linked via a user-accessible, central project manager. The content repository includes a data storage device for storing the research documents and a content selection facility for receiving a user-defined query that is able to support cost-related search parameters. The query is utilized by the content selection facility to select an initial collection of documents from the data storage device. Content spread metrics are then displayed through user-intuitive reports to allow for subsequent modification of the search query to yield an optimized document collection. The optimized document collection is then parsed, tagged and clustered by the text mining processor to produce search results that are stored as a data set in the derived data repository.
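The query-and-refine loop described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the Document fields, the select_documents and content_spread_metrics functions, the max_cost_per_doc parameter, and the sample repository are all hypothetical names introduced here. The abstract does not define "content spread metrics", so the sketch assumes they summarize how the selected documents are distributed across sources, years, and access cost.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    source: str   # hypothetical field: publisher or database of origin
    year: int
    cost: float   # hypothetical per-document access cost (non-uniform)
    text: str


def select_documents(docs, keyword, max_cost_per_doc=None):
    """Return documents containing keyword, optionally capped by access cost."""
    keyword = keyword.lower()
    selected = []
    for doc in docs:
        if keyword not in doc.text.lower():
            continue
        # Cost-related search parameter (assumed form: a per-document ceiling).
        if max_cost_per_doc is not None and doc.cost > max_cost_per_doc:
            continue
        selected.append(doc)
    return selected


def content_spread_metrics(collection):
    """Summarize how a collection is spread over sources, years, and cost."""
    return {
        "document_count": len(collection),
        "total_cost": sum(doc.cost for doc in collection),
        "by_source": dict(Counter(doc.source for doc in collection)),
        "by_year": dict(Counter(doc.year for doc in collection)),
    }


# Illustrative sample data only; not taken from the patent.
REPOSITORY = [
    Document("d1", "JournalA", 2001, 0.0, "Kinase inhibitors in oncology"),
    Document("d2", "JournalB", 2002, 30.0, "Kinase signalling pathways"),
    Document("d3", "JournalA", 2002, 5.0, "Kinase assays and screening"),
    Document("d4", "JournalB", 2003, 30.0, "Protein folding dynamics"),
]

# Initial query yields the primary collection and its spread metrics.
primary = select_documents(REPOSITORY, "kinase")
primary_metrics = content_spread_metrics(primary)

# After reviewing the metrics, the user tightens the query with a cost cap
# to obtain the final collection that would be passed to the text miner.
final = select_documents(REPOSITORY, "kinase", max_cost_per_doc=10.0)
final_metrics = content_spread_metrics(final)
```

In this sketch the user sees that one expensive source dominates the total cost of the primary collection and re-runs the query with a cost ceiling. A real system would presumably support richer query terms and reports than a single keyword and a dictionary of counts.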
17 Claims
Claims 2 through 17 depend from claim 1.
Specification