Morphological analyzer, morphological analysis method, and morphological analysis program
Abstract
An input text is analyzed into morphemes by using a prescribed morphological analysis procedure to generate hypotheses: word strings with part-of-speech tags, including form information for parts of speech having forms. The probability that each hypothesis occurs in a corpus of text is calculated using two or more part-of-speech n-gram models, at least one of which takes the forms of the parts of speech into consideration. Lexicalized models and class models may also be used. The models are weighted, and their probabilities are combined according to the weights to obtain a single probability for each hypothesis. The hypothesis with the highest probability is selected as the solution to the morphological analysis. By combining multiple models, this method can resolve ambiguity with a higher degree of accuracy than methods that use only a single model.
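The scoring and selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes the weighted combination is a per-position linear interpolation of two bigram models, one that sees form information and one that ignores it. The tag names, the `tag:form` notation, the probability tables, the weights, and all function names are hypothetical values chosen for illustration.

```python
# A hypothesis is a list of (word, tag) pairs. A tag may carry a form
# after a colon, e.g. "verb:base". All tags and numbers are invented.
Hypothesis = list[tuple[str, str]]


def strip_form(tag: str) -> str:
    """Drop the form part of a tag: 'verb:base' -> 'verb'."""
    return tag.split(":", 1)[0]


# Toy bigram tables P(tag_i | tag_{i-1}); "<s>" marks the sentence start.
FORM_BIGRAM = {  # model that takes forms into account
    ("<s>", "noun"): 0.5,
    ("noun", "verb:base"): 0.4,
    ("noun", "verb:past"): 0.1,
}
PLAIN_BIGRAM = {  # model that ignores forms
    ("<s>", "noun"): 0.6,
    ("noun", "verb"): 0.5,
}


def bigram_prob(table, prev, cur, floor=1e-6):
    """Look up P(cur | prev); unseen pairs get a small floor value."""
    return table.get((prev, cur), floor)


def hypothesis_probability(hyp, weights):
    """Product over positions of the weighted sum of the model probabilities."""
    w_form, w_plain = weights  # assumed to sum to 1
    tags = ["<s>"] + [tag for _, tag in hyp]
    p = 1.0
    for prev, cur in zip(tags, tags[1:]):
        p_form = bigram_prob(FORM_BIGRAM, prev, cur)
        p_plain = bigram_prob(PLAIN_BIGRAM, strip_form(prev), strip_form(cur))
        p *= w_form * p_form + w_plain * p_plain
    return p


def find_solution(hyps, weights=(0.7, 0.3)):
    """Return the hypothesis with the highest combined probability."""
    return max(hyps, key=lambda h: hypothesis_probability(h, weights))
```

Given two hypotheses that differ only in the form assigned to a verb, `find_solution` picks the one whose form-aware bigram is more probable, while the form-blind model contributes the same amount to both. How the weights are chosen is not stated in this part of the document, so they are fixed constants here.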
70 Citations
20 Claims
1. A morphological analyzer comprising:
- a hypothesis generator for applying a prescribed method of morphological analysis to a text and generating one or more hypotheses as candidate results of the morphological analysis, each hypothesis being a word string with part-of-speech tags, the part-of-speech tags including form information for parts of speech having forms;
- a model storage facility storing information for a plurality of part-of-speech n-gram models, at least one of the part-of-speech n-gram models including information about the forms of the parts of speech;
- a probability calculator for finding a probability that each said hypothesis will appear in a large corpus of text by using a weighted combination of the information for the part-of-speech n-gram models stored in the model storage facility; and
- a solution finder for finding a solution among said hypotheses, based on the probabilities generated by the probability calculator.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. A method of morphological analysis comprising:
- applying a prescribed method of morphological analysis to a text and generating one or more hypotheses as candidate results of the morphological analysis, each hypothesis being a word string with part-of-speech tags, the part-of-speech tags including form information for parts of speech having forms;
- calculating probabilities that each said hypothesis will appear in a large corpus of text by using a weighted combination of a plurality of part-of-speech n-gram models, at least one of the part-of-speech n-gram models including information about forms of parts of speech; and
- finding a solution among said hypotheses, based on said probabilities.
Dependent claims: 14, 15, 16, 17, 18, 19, 20
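One plausible reading of the "weighted combination" recited in claims 1 and 13 is a linear interpolation of the individual models. The claims above do not fix the formula, so the notation below is an assumption made for illustration: $h$ is a hypothesis of $n$ tagged words, $t_i$ is the part-of-speech tag (with form, where applicable) of the $i$-th word, $c_m(i)$ is the context that model $m$ conditions on, $P_m$ is the probability given by the $m$-th of $M$ n-gram models, and $\lambda_m$ is that model's weight.

```latex
P(h) \approx \prod_{i=1}^{n} \sum_{m=1}^{M} \lambda_m \, P_m\bigl(t_i \mid c_m(i)\bigr),
\qquad \lambda_m \ge 0, \quad \sum_{m=1}^{M} \lambda_m = 1,
\qquad \hat{h} = \operatorname*{arg\,max}_{h} P(h)
```

Under this reading, the solution $\hat{h}$ is the hypothesis that maximizes the combined probability, matching the final step of each claim.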
Specification