Registration apparatus for compound-word dictionary
First Claim
1. An apparatus for registering a combination of character strings in a compound-word dictionary, comprising:
a word dictionary containing a plurality of words;
a compound-word dictionary containing entry words that are combinations of character strings;
segmenting means for segmenting an input combination of character strings into individual character strings by referring to said word dictionary;
occurrence frequency calculating means for calculating an occurrence frequency of each of said segmented individual character strings in said entry words in said compound-word dictionary;
evaluation value calculating means for calculating the evaluation value of said input combination of character strings on the basis of said calculated occurrence frequencies of each of said segmented individual character strings; and
means for determining whether or not to register said combination of character strings in said compound-word dictionary on the basis of said evaluation value, and registering said combination of character strings as an entry word in said compound-word dictionary when it is determined that said combination of character strings should be registered.
Abstract
A registration apparatus for a compound-word dictionary automatically and suitably determines whether a combination of character strings should be registered as one entry word in a compound-word dictionary. The registration apparatus comprises a character string segmenter, which segments a compound word into its individual words. An occurrence frequency calculator determines the frequency with which each individual word occurs in the entry words of the compound-word dictionary. An evaluation value calculator calculates an evaluation value based on these occurrence frequencies. Based on this evaluation value, a register determiner then determines whether the compound word is to be registered in the compound-word dictionary. The process for obtaining the evaluation value may depend on the number of component words, the contents of the existing dictionary, a physical size limit on the compound-word dictionary, and the purpose of use of the dictionary.
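The flow described in the abstract (segment, count occurrences, evaluate, decide) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the patent: the greedy longest-match segmentation, the storage of entry words as tuples of component words, the use of the mean frequency as the evaluation value, the `threshold` parameter and the direction of the comparison, and all function names.

```python
from collections import Counter


def segment(text, word_dict):
    """Split text into words by greedy longest match against word_dict (assumed strategy)."""
    parts, i = [], 0
    while i < len(text):
        # Try the longest candidate substring first.
        for j in range(len(text), i, -1):
            if text[i:j] in word_dict:
                parts.append(text[i:j])
                i = j
                break
        else:
            # No dictionary word starts here: emit the single character.
            parts.append(text[i])
            i += 1
    return parts


def occurrence_frequencies(parts, compound_dict):
    """For each part, count the entry words in compound_dict that contain it."""
    counts = Counter(w for entry in compound_dict for w in set(entry))
    return [counts[p] for p in parts]


def evaluation_value(freqs):
    """Hypothetical evaluation value: the mean of the occurrence frequencies."""
    return sum(freqs) / len(freqs) if freqs else 0.0


def register_if_suitable(text, word_dict, compound_dict, threshold=1.0):
    """Register text as an entry word when its evaluation value reaches threshold.

    The threshold and the ">=" comparison are illustrative assumptions.
    Returns (registered, evaluation_value).
    """
    parts = segment(text, word_dict)
    value = evaluation_value(occurrence_frequencies(parts, compound_dict))
    entry = tuple(parts)
    if len(parts) > 1 and entry not in compound_dict and value >= threshold:
        compound_dict.add(entry)
        return True, value
    return False, value


# Example data (hypothetical).
words = {"data", "base", "database", "system", "file", "operating", "server"}
compounds = {("database", "server"), ("file", "system"), ("operating", "system")}

print(register_if_suitable("databasesystem", words, compounds))  # prints (True, 1.5)
```

In this example "database" occurs in one existing entry word and "system" in two, so the assumed evaluation value is 1.5 and the input is registered. A different evaluation formula or a dictionary size limit would change only `evaluation_value` and the final comparison.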
64 Citations
12 Claims
1. An apparatus for registering a combination of character strings in a compound-word dictionary, comprising:
a word dictionary containing a plurality of words;
a compound-word dictionary containing entry words that are combinations of character strings;
segmenting means for segmenting an input combination of character strings into individual character strings by referring to said word dictionary;
occurrence frequency calculating means for calculating an occurrence frequency of each of said segmented individual character strings in said entry words in said compound-word dictionary;
evaluation value calculating means for calculating the evaluation value of said input combination of character strings on the basis of said calculated occurrence frequencies of each of said segmented individual character strings; and
means for determining whether or not to register said combination of character strings in said compound-word dictionary on the basis of said evaluation value, and registering said combination of character strings as an entry word in said compound-word dictionary when it is determined that said combination of character strings should be registered.
2. An apparatus for registering a compound word in a compound-word dictionary, comprising:
a word dictionary containing a plurality of words;
a compound-word dictionary containing entry words that are compound words;
segmenting means for segmenting an input compound word into individual words by referring to said word dictionary;
occurrence frequency calculating means for calculating an occurrence frequency of each of said segmented individual words in said entry words in said compound-word dictionary;
evaluation value calculating means for calculating the evaluation value of said input compound word on the basis of said calculated occurrence frequencies of each of said segmented individual words; and
means for determining whether to register said compound word in said compound-word dictionary on the basis of said evaluation value, and registering said compound word as an entry word in said compound-word dictionary when it is determined that said compound word should be registered.
Dependent Claims (3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
Specification