DICTIONARY CREATION DEVICE, WORD GATHERING METHOD AND RECORDING MEDIUM
Abstract
When words are gathered through a dictionary growth process, a dictionary growth unit (102) stores, in a gathering process memory unit (107), information indicating the input and output process through which each word was gathered. A clustering unit (103) then classifies the words gathered by the dictionary growth process into clusters on the basis of the information recorded in the gathering process memory unit (107). Next, a type determination unit (104) determines, for each cluster, whether the words comprising the cluster are of the same type as the seed word or of a different type, again on the basis of the information recorded in the gathering process memory unit (107). Finally, an output unit (105) associates each gathered word with the cluster to which it belongs and with information indicating whether that cluster is of the same type as the seed word or of a different type, and displays the result.
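The growth and recording step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. Several things are assumptions made only for illustration: "related words" is modelled as plain co-occurrence in the same document, the prescribed stop condition is modelled as a turn limit (or no new words being found), and only newly gathered words are fed back as inputs on the next turn. The names `Record`, `related_words`, `grow_dictionary` and `max_turns` are hypothetical.

```python
from collections import namedtuple

# One entry of the gathering-process record: which input word produced
# which output word, and on which turn. Field names are hypothetical.
Record = namedtuple("Record", ["turn", "input_word", "output_word"])


def related_words(word, documents):
    """Toy stand-in for 'words related to the input word'.

    Assumption: two words are related if they share a document.
    The source does not specify the actual extraction rule.
    """
    found = set()
    for doc in documents:
        tokens = doc.split()
        if word in tokens:
            found.update(t for t in tokens if t != word)
    return found


def grow_dictionary(seeds, documents, max_turns=3):
    """Gather words turn by turn and record how each one was reached."""
    gathered = set(seeds)
    frontier = sorted(set(seeds))   # words used as input on this turn
    records = []                    # plays the role of the memory unit
    for turn in range(1, max_turns + 1):
        new_words = set()
        for word in frontier:
            for out in sorted(related_words(word, documents)):
                records.append(Record(turn, word, out))
                if out not in gathered:
                    new_words.add(out)
        if not new_words:           # assumed stop condition: nothing new
            break
        gathered |= new_words       # outputs become inputs for the next turn
        frontier = sorted(new_words)
    return gathered, records
```

Calling `grow_dictionary(["tokyo"], ["tokyo osaka", "osaka kyoto"])` returns both the gathered words and the list of records, so that later stages can consult the record instead of re-running the growth process.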
11 Claims
1. A dictionary creation device, comprising:
an input/output process recording means for recording information indicating the process of inputting and outputting input words and the output words output for those input words, in a dictionary growth process for gathering words by repeatedly accepting input of words, outputting words related to the input words from document data, adding the output words to the input words until a prescribed condition is satisfied, and outputting words related to the input words from document data;
a cluster classifying means for classifying, among the words gathered by the dictionary growth process, words for which the input word or the output word is the same into the same cluster, based on the information recorded in the input/output process recording means;
a similarity determining means for determining, for each cluster classified by the cluster classifying means, whether or not the words in the cluster are of the same type as the input words for which input was initially received, based on the number of turns required to output each word in the cluster from the input word, by referencing the information recorded in the input/output process recording means; and
a gathered word output means for linking together and outputting the words gathered by the dictionary growth process, the clusters to which the words belong, and information indicating whether or not the words comprising each cluster are of the same type as the input words for which input was initially received.
10. A word gathering method, comprising:
an input/output process recording step for recording information indicating the process of inputting and outputting input words and the output words output for those input words, in a dictionary growth process for gathering words by repeatedly accepting input of words, outputting words related to the input words from document data, adding the output words to the input words until a prescribed condition is satisfied, and outputting words related to the input words from document data;
a cluster classifying step for classifying, among the words gathered by the dictionary growth process, words for which the input word or the output word is the same into the same cluster, based on the information recorded in the input/output process recording step;
a similarity determining step for determining, for each cluster classified by the cluster classifying step, whether or not the words in the cluster are of the same type as the input words for which input was initially received, based on the number of turns required to output each word in the cluster from the input word, by referencing the information recorded in the input/output process recording step; and
a gathered word output step for linking together and outputting the words gathered by the dictionary growth process, the clusters to which the words belong, and information indicating whether or not the words comprising each cluster are of the same type as the input words for which input was initially received.
11. A computer-readable recording medium on which is recorded a program that causes a computer to function as:
an input/output process recording means for recording information indicating the process of inputting and outputting input words and the output words output for those input words, in a dictionary growth process for gathering words by repeatedly accepting input of words, outputting words related to the input words from document data, adding the output words to the input words until a prescribed condition is satisfied, and outputting words related to the input words from document data;
a cluster classifying means for classifying, among the words gathered by the dictionary growth process, words for which the input word or the output word is the same into the same cluster, based on the information recorded in the input/output process recording means;
a similarity determining means for determining, for each cluster classified by the cluster classifying means, whether or not the words in the cluster are of the same type as the input words for which input was initially received, based on the number of turns required to output each word in the cluster from the input word, by referencing the information recorded in the input/output process recording means; and
a gathered word output means for linking together and outputting the words gathered by the dictionary growth process, the clusters to which the words belong, and information indicating whether or not the words comprising each cluster are of the same type as the input words for which input was initially received.
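The cluster classifying, similarity determining and gathered word output elements recited in the claims above can be sketched as follows. This is a non-authoritative illustration under stated assumptions. The record is taken to be a list of `(turn, input_word, output_word)` tuples. Words that share an input word or share an output word are merged with a union-find structure. The decision rule (a cluster counts as the same type as the seed when its mean turn count is at most `max_mean_turns`) is an assumption, since the claims only say the determination is based on the number of turns. All function and parameter names are hypothetical.

```python
from collections import defaultdict


def first_turns(records, seeds):
    """Turns needed to output each word; seeds count as turn 0."""
    turns = {s: 0 for s in seeds}
    for turn, _inp, out in records:
        turns[out] = min(turn, turns.get(out, turn))
    return turns


def classify_clusters(records):
    """Group words that share an input word or share an output word."""
    parent = {}

    def find(w):
        parent.setdefault(w, w)
        while parent[w] != w:
            parent[w] = parent[parent[w]]   # path halving
            w = parent[w]
        return w

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    by_input = defaultdict(list)    # input word -> words it output
    by_output = defaultdict(list)   # output word -> words that produced it
    for _turn, inp, out in records:
        by_input[inp].append(out)
        by_output[out].append(inp)
    for group in list(by_input.values()) + list(by_output.values()):
        for w in group:
            union(group[0], w)      # also registers single-word groups

    clusters = defaultdict(set)
    for w in list(parent):
        clusters[find(w)].add(w)
    return sorted(clusters.values(), key=lambda c: sorted(c))


def same_type_as_seed(cluster, turns, max_mean_turns=1.5):
    """Assumed rule: a low mean turn count means 'same type as the seed'."""
    mean = sum(turns[w] for w in cluster) / len(cluster)
    return mean <= max_mean_turns


def link_output(records, seeds, max_mean_turns=1.5):
    """Return (word, cluster id, same-type flag) rows for display."""
    turns = first_turns(records, seeds)
    rows = []
    for cid, cluster in enumerate(classify_clusters(records)):
        flag = same_type_as_seed(cluster, turns, max_mean_turns)
        rows.extend((w, cid, flag) for w in sorted(cluster))
    return rows
```

The threshold is deliberately exposed as a parameter: words reached only after many turns are more likely to have drifted away from the seed's type, but where to draw the line is a tuning choice that this sketch does not settle.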
Specification