Document tabulation method and apparatus and medium for storing computer program therefor
Abstract
The system aids in creating tabulation axes from the bottom up using a large volume of document data and, during the process, helps the user discover an analytical point of view. The following processing is performed: (1) the system extracts search formula candidates for categories (referred to as category candidates) and the user selects from among the extracted category candidates; (2) the system creates axes from the category candidates selected by the user; and (3) the user determines a name for each axis (i.e., the name of the analytical point of view). Of these steps, the system aids in step (1).
18 Claims
1. In a text mining system having a database to store a plurality of documents, a processing unit, a display unit and a user input device;
- a document tabulation support method for generating a document tabulation axis containing a plurality of categories for document tabulation, wherein the document tabulation classifies the plurality of documents into the plurality of categories to create a table, the document tabulation support method comprising the steps of;
displaying on the display unit a plurality of terms extracted from the plurality of documents stored in the database;
accepting in the user input device a first user input to select at least a part of the displayed, extracted terms;
extracting co-occurrence words of the selected, extracted terms from the plurality of documents, setting the co-occurrence words as a plurality of category candidates and evaluating a co-occurrence strength between the plurality of category candidates and the extracted terms;
displaying on the display unit at least a part of the category candidates in the order of the co-occurrence strength;
accepting in the user input device a second user input to select at least a part of the displayed category candidates; and
in the processing unit, determining the category candidates selected based on the first user input as categories and generating a document tabulation axis by using the categories.
Dependent claims: 2, 3, 4, 5, 6
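The candidate-ranking steps of claim 1 can be sketched as follows. This is a minimal illustration, assuming the documents are already tokenized and assuming that co-occurrence strength is the number of documents in which a candidate word appears together with at least one selected term; the claim itself does not fix a particular measure. The function name `rank_category_candidates` and the sample documents are hypothetical.

```python
from collections import Counter


def rank_category_candidates(documents, selected_terms, top_n=10):
    """Rank words that co-occur with the user-selected terms.

    documents: list of token lists (tokenization is assumed done upstream).
    Strength is an assumed measure: the number of documents that contain
    both a selected term and the candidate word.
    """
    selected = set(selected_terms)
    strength = Counter()
    for tokens in documents:
        words = set(tokens)
        if words & selected:
            # Every other word in this document is a category candidate.
            strength.update(words - selected)
    # Strongest first; ties are broken alphabetically for a stable display.
    ranked = sorted(strength.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


# Hypothetical sample data standing in for the document database.
docs = [
    ["printer", "paper", "jam"],
    ["printer", "toner", "low"],
    ["printer", "paper", "tray"],
    ["monitor", "flicker"],
]
candidates = rank_category_candidates(docs, ["printer"])
for word, score in candidates:
    print(word, score)
```

The user's second input would then pick a subset of this ranked list, and the picked words become the categories of the tabulation axis.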
7. A text mining system for aiding a generation of a document tabulation axis containing a plurality of categories for document tabulation, wherein the document tabulation classifies a plurality of documents into the plurality of categories to create a table, the text mining system comprising:
a database to store a plurality of documents;
a processing unit to select a plurality of categories for the document tabulation axis by using the plurality of documents read from the database;
a display unit; and
a user input device to accept a user input;
wherein, for extracted terms selected by a first input from the user input device, the processing unit extracts co-occurrence words from the plurality of documents to determine a plurality of category candidates, evaluates a co-occurrence strength between the plurality of the category candidates and the extracted terms, determines as categories at least a part of the category candidates that is selected by a second input from the user input device, and generates a document tabulation axis by using the categories;
wherein the display unit displays the extracted terms and also displays the plurality of category candidates in the order of the evaluated co-occurrence strength.
Dependent claims: 8, 9, 10, 11, 12
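Once categories have been chosen, the tabulation that the axis supports (classifying the documents into the categories to create a table) might look like the sketch below. It assumes each category is represented by a set of words standing in for its search formula, and that a document is counted under every category it matches; both the representation and the name `tabulate` are hypothetical and not taken from the claims.

```python
def tabulate(documents, axis):
    """Count documents per category along one tabulation axis.

    axis: mapping of category name -> set of words (an assumed stand-in
    for the search formula behind each category).
    A document is counted under every category whose words it contains.
    """
    table = {category: 0 for category in axis}
    for tokens in documents:
        words = set(tokens)
        for category, query in axis.items():
            if words & query:
                table[category] += 1
    return table


# Hypothetical sample data standing in for the document database.
docs = [
    ["printer", "paper", "jam"],
    ["printer", "toner", "low"],
    ["printer", "paper", "tray"],
    ["monitor", "flicker"],
]
# Axis built from two category candidates the user is assumed to have selected.
axis = {"paper": {"paper"}, "toner": {"toner"}}
table = tabulate(docs, axis)
print(table)
```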
13. In a text mining system having a database to store a plurality of documents, a processing unit, a display unit and a user input device;
- a document tabulation support program for generating a document tabulation axis containing a plurality of categories for document tabulation, wherein the document tabulation classifies the plurality of documents into the plurality of categories to create a table, the document tabulation support program comprising;
a first step of displaying on the display unit a plurality of terms extracted from the plurality of documents stored in the database;
a second step of accepting in the user input device a first user input to select at least a part of the displayed, extracted terms;
a third step of causing the processing unit to extract co-occurrence words of the selected, extracted terms from the plurality of documents, to set the co-occurrence words as a plurality of category candidates and to evaluate a co-occurrence strength between the plurality of category candidates and the extracted terms;
a fourth step of displaying on the display unit at least a part of the category candidates in the order of the co-occurrence strength;
a fifth step of accepting in the user input device a second user input to select at least a part of the displayed category candidates;
a sixth step of causing the processing unit to determine the category candidates selected based on the first user input as categories; and
a seventh step of causing the processing unit to create a document tabulation axis by using the categories.
Dependent claims: 14, 15, 16, 17, 18
Specification