TRANSLITERATION DECODING USING A TREE STRUCTURE
First Claim
1. A method for transliteration decoding using a tree structure, comprising:
generating a tree structure for an input string in a first script system, the tree structure including nodes representing segments of the input string;
identifying segmentation candidates for the input string based on paths of the tree structure, the segmentation candidates segmenting the input string into character groups;
selecting a segmentation candidate based on probabilities of the segmentation candidates predicted by a probabilistic model;
segmenting the input string into character groups that correspond to characters in a second script system;
decoding the character groups in the first script system into the characters in the second script system, the characters forming a word or a word prefix in the second script system; and
outputting the word or the word prefix in the second script system.
Abstract
Embodiments are disclosed for transliteration decoding using a tree structure. A method according to some embodiments includes steps of: generating a tree structure for an input string in a first script system, the tree structure including nodes representing segments of the input string; identifying segmentation candidates for the input string based on paths of the tree structure, the segmentation candidates segmenting the input string into character groups; selecting a segmentation candidate based on probabilities of the segmentation candidates predicted by a probabilistic model; segmenting the input string into character groups that correspond to characters in a second script system; decoding the character groups in the first script system into the characters in the second script system, the characters forming a word or a word prefix in the second script system; and outputting the word or the word prefix in the second script system.
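The steps summarized above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the choice of Pinyin as the first script system and Chinese characters as the second, the toy syllable table, the probability values, the one-character-per-syllable map, and the unigram scoring (a candidate's probability is the product of its segment probabilities) are all assumptions made for illustration. The names `Node`, `build_tree`, `candidates`, `score`, and `transliterate` are hypothetical.

```python
from dataclasses import dataclass, field
from math import prod

# Hypothetical toy data (assumed for illustration, not taken from the source).
SYLLABLE_PROB = {"ni": 0.1, "hao": 0.1, "xi": 0.3, "an": 0.3, "xian": 0.2}
CHAR_MAP = {"ni": "你", "hao": "好", "xi": "西", "an": "安", "xian": "先"}
MAX_LEN = max(len(s) for s in SYLLABLE_PROB)


@dataclass
class Node:
    segment: str   # segment of the input string this node represents
    end: int       # offset in the input just past this segment
    children: list = field(default_factory=list)


def build_tree(text, start=0, segment=""):
    """Build a tree whose nodes are known segments; the root holds no segment."""
    node = Node(segment, start)
    for length in range(1, MAX_LEN + 1):
        piece = text[start:start + length]
        if len(piece) == length and piece in SYLLABLE_PROB:
            node.children.append(build_tree(text, start + length, piece))
    return node


def candidates(node, total_len, path=()):
    """Yield one segmentation per root-to-node path that covers the whole input."""
    if node.segment:
        path = path + (node.segment,)
    if node.end == total_len:
        yield list(path)
    for child in node.children:
        yield from candidates(child, total_len, path)


def score(segments):
    """Assumed unigram model: product of the per-segment probabilities."""
    return prod(SYLLABLE_PROB[s] for s in segments)


def transliterate(text):
    """Select the most probable segmentation and decode each character group."""
    tree = build_tree(text)
    options = list(candidates(tree, len(text)))
    if not options:
        return None  # no path of the tree covers the input
    best = max(options, key=score)
    return "".join(CHAR_MAP[s] for s in best)


if __name__ == "__main__":
    print(list(candidates(build_tree("xian"), 4)))
    print(transliterate("xian"))
```

In this sketch the input "xian" has two candidate segmentations, "xi" + "an" and "xian", and the selection step picks whichever the assumed model scores higher. A practical decoder would likely use a richer probabilistic model and would also handle incomplete trailing segments to produce word prefixes, which this sketch does not attempt.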
20 Claims
1. A method for transliteration decoding using a tree structure, comprising:
generating a tree structure for an input string in a first script system, the tree structure including nodes representing segments of the input string;
identifying segmentation candidates for the input string based on paths of the tree structure, the segmentation candidates segmenting the input string into character groups;
selecting a segmentation candidate based on probabilities of the segmentation candidates predicted by a probabilistic model;
segmenting the input string into character groups that correspond to characters in a second script system;
decoding the character groups in the first script system into the characters in the second script system, the characters forming a word or a word prefix in the second script system; and
outputting the word or the word prefix in the second script system.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
17. A computing device, comprising:
a processor; and
a memory storing instructions which, when executed by the processor, cause the computing device to perform a process including:
generating a tree structure for an input string in a first script system, the tree structure including nodes representing segments of the input string;
identifying segmentation candidates for the input string based on paths of the tree structure, the segmentation candidates segmenting the input string into character groups;
selecting a segmentation candidate based on probabilities of the segmentation candidates predicted by a probabilistic model;
segmenting the input string into character groups that correspond to characters in a second script system;
decoding the character groups in the first script system into the characters in the second script system, the characters forming a word or a word prefix in the second script system; and
outputting the word or the word prefix in the second script system.
Dependent claims: 18, 19
20. A non-transitory machine-readable storage medium comprising a program containing a set of instructions for causing a machine to execute procedures for transliteration decoding, the procedures comprising:
generating a tree structure for an input string in a first script system, the tree structure including nodes representing segments of the input string;
identifying segmentation candidates for the input string based on paths of the tree structure, the segmentation candidates segmenting the input string into character groups;
selecting a segmentation candidate based on probabilities of the segmentation candidates predicted by a probabilistic model;
segmenting the input string into character groups that correspond to characters in a second script system;
decoding the character groups in the first script system into the characters in the second script system, the characters forming a word or a word prefix in the second script system; and
outputting the word or the word prefix in the second script system.
Specification