Method and apparatus for training transliteration model and parsing statistic model, method and apparatus for transliteration
First Claim
1. An apparatus for training a parsing statistic model, which is to be used in transliteration between a single-syllable language and a multi-syllable language and includes sub-syllable parsing probabilities of said multi-syllable language, comprising:
a corpus inputting unit configured to input a bilingual proper name list as corpus, said bilingual proper name list includes a plurality of proper names of said multi-syllable language and corresponding proper names of said single-syllable language respectively;
a rule parsing unit configured to parse said plurality of proper names of multi-syllable language in said bilingual proper name list into sub-syllable sequences using parsing rules;
a parsing determining unit configured to determine whether a parsing of said proper name of multi-syllable language is correct according to the corresponding proper name of said single-syllable language in said bilingual proper name list; and
a parsing statistic model training unit configured to train said parsing statistic model based on the result of parsing that is determined as correct.
Abstract
The present invention provides a method and apparatus for training a parsing statistic model, and a method and apparatus for transliteration. Said parsing statistic model is to be used in transliteration between a single-syllable language and a multi-syllable language and includes sub-syllable parsing probabilities of said multi-syllable language. Said method for training the parsing statistic model comprises: inputting a bilingual proper name list as a corpus, said bilingual proper name list including a plurality of proper names of said multi-syllable language and the corresponding proper names of said single-syllable language; parsing each of said plurality of proper names of the multi-syllable language in said bilingual proper name list into a sub-syllable sequence using parsing rules; determining whether each parsing is correct according to the corresponding proper name of said single-syllable language in said bilingual proper name list; and training said parsing statistic model based on the parsing results that are determined to be correct.
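The training steps named in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method of the specification: the parsing rule (a consonant run followed by a vowel run), the correctness test (the number of sub-syllables equals the number of characters in the single-syllable name), the bigram form of the parsing probabilities, and all function names are hypothetical choices made for the example.

```python
import re
from collections import Counter

# Hypothetical parsing rule: a sub-syllable is an optional consonant run
# followed by a vowel run; leftover consonants form their own unit.
_SUB_SYLLABLE = re.compile(r"[^aeiou]*[aeiou]+|[^aeiou]+")


def rule_parse(name):
    """Parse a multi-syllable-language name into a sub-syllable sequence."""
    return _SUB_SYLLABLE.findall(name.lower())


def parsing_is_correct(sub_syllables, single_syllable_name):
    """Assumed check: one sub-syllable per character of the paired name."""
    return len(sub_syllables) == len(single_syllable_name)


def train_parsing_model(corpus):
    """Estimate bigram probabilities P(cur | prev) from accepted parsings only."""
    bigrams = Counter()
    contexts = Counter()
    for multi_name, single_name in corpus:
        subs = rule_parse(multi_name)
        if not parsing_is_correct(subs, single_name):
            continue  # discard parsings judged incorrect
        prev = "<s>"  # sequence-start marker
        for cur in subs:
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
            prev = cur
    return {pair: n / contexts[pair[0]] for pair, n in bigrams.items()}


corpus = [("linda", "琳达"), ("lisa", "丽莎"), ("marx", "马克思")]
model = train_parsing_model(corpus)
```

With this toy corpus, "marx" parses into two units but its paired name has three characters, so that pair is discarded and contributes nothing to the model.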
298 Citations
14 Claims
1. An apparatus for training a parsing statistic model, which is to be used in transliteration between a single-syllable language and a multi-syllable language and includes sub-syllable parsing probabilities of said multi-syllable language, comprising:
a corpus inputting unit configured to input a bilingual proper name list as corpus, said bilingual proper name list includes a plurality of proper names of said multi-syllable language and corresponding proper names of said single-syllable language respectively;
a rule parsing unit configured to parse said plurality of proper names of multi-syllable language in said bilingual proper name list into sub-syllable sequences using parsing rules;
a parsing determining unit configured to determine whether a parsing of said proper name of multi-syllable language is correct according to the corresponding proper name of said single-syllable language in said bilingual proper name list; and
a parsing statistic model training unit configured to train said parsing statistic model based on the result of parsing that is determined as correct.
Dependent claims: 2, 3, 4
5. An apparatus for transliteration from a single-syllable language to a multi-syllable language, comprising:
a syllable sequence obtaining unit configured to obtain a syllable sequence corresponding to a word of said single-syllable language to be transliterated;
a transliteration model including translation relationships between syllables of said single-syllable language and sub-syllables of said multi-syllable language and their translation probabilities respectively;
a sub-syllable translating unit configured to obtain at least one sub-syllable of said multi-syllable language corresponding to each syllable in said syllable sequence obtained by said syllable sequence obtaining unit and its translation probability by using said transliteration model;
a parsing statistic model including sub-syllable parsing probabilities of said multi-syllable language;
a searching unit configured to search for a sub-syllable sequence having the highest probability corresponding to said syllable sequence as a transliteration result based on said parsing statistic model and said at least one sub-syllable of said multi-syllable language corresponding to each syllable in said syllable sequence and its translation probability.
Dependent claims: 6, 7, 8
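The search recited in claim 5 can be illustrated with a small dynamic-programming (Viterbi-style) sketch. The claim does not name a search algorithm, so the use of Viterbi here is an assumption, as are the bigram form of the parsing statistic model, the floor probability for unseen pairs, the function name, and the toy probabilities.

```python
import math


def transliterate_to_multi(syllables, translit_model, parsing_model, floor=1e-6):
    """Return (word, log_prob) for the most probable sub-syllable sequence.

    translit_model: {syllable: {sub_syllable: translation probability}}
    parsing_model:  {(prev_sub, sub): parsing probability}  (assumed bigram form)
    """
    # best[s] = (log-probability, path) of the best sequence ending in s
    best = {"<s>": (0.0, [])}
    for syl in syllables:
        nxt = {}
        for sub, p_trans in translit_model[syl].items():
            for prev, (score, path) in best.items():
                p_parse = parsing_model.get((prev, sub), floor)
                cand = score + math.log(p_trans) + math.log(p_parse)
                if sub not in nxt or cand > nxt[sub][0]:
                    nxt[sub] = (cand, path + [sub])
        best = nxt
    score, path = max(best.values(), key=lambda item: item[0])
    return "".join(path), score


# Toy models with made-up probabilities, for illustration only.
translit_model = {
    "lin": {"lin": 0.6, "ling": 0.4},
    "da": {"da": 0.7, "ta": 0.3},
}
parsing_model = {
    ("<s>", "lin"): 0.5, ("<s>", "ling"): 0.5,
    ("lin", "da"): 0.8, ("ling", "da"): 0.1, ("lin", "ta"): 0.2,
}
word, log_prob = transliterate_to_multi(["lin", "da"], translit_model, parsing_model)
```

Working in log-probabilities avoids numeric underflow on long names; each step keeps only the best path ending in each candidate sub-syllable.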
9. An apparatus for transliteration from a multi-syllable language to a single-syllable language, comprising:
a sub-syllable parsing unit configured to parse a word of said multi-syllable language that needs to be transliterated into a sub-syllable sequence;
a transliteration model including translation relationships between syllables of said single-syllable language and sub-syllables of said multi-syllable language and their translation probabilities respectively;
a syllable translating unit configured to obtain at least one syllable of said single-syllable language corresponding to each sub-syllable in said sub-syllable sequence and its translation probability according to said transliteration model;
a character translating unit configured to obtain a character corresponding to each said syllable of single-syllable language;
a language model including character adjacent probabilities of said single-syllable language;
a searching unit configured to search for a character sequence having the highest probability corresponding to said sub-syllable sequence as a transliteration result based on said language model and said at least one syllable of said single-syllable language corresponding to each sub-syllable in said sub-syllable sequence and its translation probability obtained by said syllable translating unit.
Dependent claims: 10, 11, 12, 13, 14
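For the reverse direction in claim 9, the sketch below adds the step that maps each candidate syllable to characters and scores adjacent characters with the language model. It assumes the word has already been parsed into sub-syllables, treats the character adjacent probabilities as a bigram table, and uses hypothetical names and made-up probabilities throughout; the specification may combine the models differently.

```python
import math


def transliterate_to_single(sub_syllables, translit_model, syllable_chars,
                            language_model, floor=1e-6):
    """Return (log_prob, text) for the most probable character sequence.

    translit_model: {sub_syllable: {syllable: translation probability}}
    syllable_chars: {syllable: [candidate characters]}
    language_model: {(prev_char, char): adjacency probability}  (assumed bigram)
    """
    # best[ch] = (log-probability, text) of the best sequence ending in ch
    best = {"<s>": (0.0, "")}
    for sub in sub_syllables:
        nxt = {}
        for syl, p_trans in translit_model[sub].items():
            for ch in syllable_chars[syl]:
                for prev, (score, text) in best.items():
                    p_adj = language_model.get((prev, ch), floor)
                    cand = score + math.log(p_trans) + math.log(p_adj)
                    if ch not in nxt or cand > nxt[ch][0]:
                        nxt[ch] = (cand, text + ch)
        best = nxt
    return max(best.values(), key=lambda item: item[0])


# Toy models with made-up probabilities, for illustration only.
translit = {"li": {"lin": 0.7, "li": 0.3}, "nda": {"da": 1.0}}
chars = {"lin": ["琳", "林"], "li": ["丽"], "da": ["达", "大"]}
lm = {
    ("<s>", "琳"): 0.4, ("<s>", "林"): 0.3, ("<s>", "丽"): 0.3,
    ("琳", "达"): 0.6, ("林", "达"): 0.2, ("丽", "达"): 0.3, ("琳", "大"): 0.05,
}
best_log_prob, best_text = transliterate_to_single(["li", "nda"], translit, chars, lm)
```

Because several syllables can share a character, the search keeps one best path per character, not per syllable.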
Specification