System and method for processing morphological and syntactical analyses of inputted Chinese language phrases
First Claim
1. A method where a word string is processed by a morphology process comprising the steps of:
- removing one or more affixes from the word string to create a root, the removed affix being one of the affixes on an affix list;
- comparing the root to one or more of the words in a vocabulary to find a match, the vocabulary having a plurality of words, each with one or more Hanzi translations, the word in the vocabulary matching the root being the root match; and
- storing in the computer memory the Hanzi translation of the root match.
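The three steps of this claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the affix list, the vocabulary entries, the tone-digit spelling (`peng2you3`), and the function name `morphology_process` are all assumptions made for illustration. The sketch handles suffixes only and also accepts a word string that already matches the vocabulary with no affix removed.

```python
# Hypothetical data: the claim does not enumerate the affix list or vocabulary.
AFFIX_LIST = ("men", "de", "le")          # assumed unaccented suffixes
VOCABULARY = {                            # phonetic word -> Hanzi translations
    "peng2you3": ["朋友"],
    "lao3shi1": ["老师"],
}


def morphology_process(word_string, affix_list=AFFIX_LIST, vocabulary=VOCABULARY):
    """Strip listed suffixes until the remainder matches a vocabulary word.

    Returns (root_match, hanzi_translations, removed_affixes), or None when
    no root match is found.
    """
    root = word_string
    removed = []
    while True:
        # Compare the current root with the vocabulary.
        if root in vocabulary:
            return root, vocabulary[root], removed
        # Remove one more affix, provided something is left to act as a root.
        for affix in affix_list:
            if root.endswith(affix) and len(root) > len(affix):
                root = root[: -len(affix)]
                removed.insert(0, affix)
                break
        else:
            return None  # no listed affix applies and no match was found


# Store the Hanzi translation of the root match, as in the final claim step.
memory = {}
match = morphology_process("peng2you3men")
if match is not None:
    root_match, hanzi, removed_affixes = match
    memory[root_match] = hanzi
```

In this sketch the caller decides where the translation is stored; a plain dictionary stands in for the computer memory named in the claim.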
Abstract
Phonetic Chinese (Pinyin and BPMF) is entered into a computer system and accurately converted into the Hanzi form. The system has a novel keyboard with diacritic keys (and corresponding ASCII coding) that permit the user to annotate each entered phonetic text syllable with a diacritic that indicates the tone of the syllable. A process executing on the system determines that a syllable has been entered when a diacritic (or delimiter) key is struck. An entered phonetic syllable is then compared to a list of acceptable phonetic syllables and abbreviations. If the entered syllable is on the list, the correctly spelled and accented syllable is stored in memory and displayed on a phonetic portion of a graphical display. The process continues for succeeding syllables until a delimiter is entered. Upon encountering a delimiter, the word string (defined as the string of characters between two delimiters) is analyzed using morphological and syntactical processes and/or a statistical language model to unambiguously determine the proper Hanzi characters that represent the word(s) in the word string. The unique Hanzi translation is stored in memory and displayed on a Hanzi portion of the graphical interface.
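The entry process described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented process: the digits 1 to 4 stand in for the ASCII codes of the diacritic keys, space, comma and period stand in for delimiter keys, and `VALID_SYLLABLES` is a tiny placeholder for the list of acceptable phonetic syllables. The function name `segment_keystrokes` is hypothetical.

```python
TONE_KEYS = set("1234")     # assumed ASCII coding of the diacritic keys
DELIMITERS = set(" ,.")     # assumed delimiter keys
VALID_SYLLABLES = {"ni", "hao", "peng", "you", "men"}  # placeholder list


def segment_keystrokes(keys):
    """Split a keystroke stream into completed word strings.

    Each word string is a list of (syllable, tone) pairs. The tone is None
    when the syllable was closed by a delimiter instead of a diacritic key.
    Input after the last delimiter is treated as still pending and ignored.
    """
    word_strings = []
    current_word = []
    buffer = ""
    for key in keys:
        if key in TONE_KEYS or key in DELIMITERS:
            # A diacritic or delimiter key marks the end of a syllable.
            if buffer:
                if buffer not in VALID_SYLLABLES:
                    raise ValueError(f"unrecognised syllable: {buffer!r}")
                tone = int(key) if key in TONE_KEYS else None
                current_word.append((buffer, tone))
                buffer = ""
            # A delimiter also closes the word string for later analysis.
            if key in DELIMITERS and current_word:
                word_strings.append(current_word)
                current_word = []
        else:
            buffer += key
    return word_strings


words = segment_keystrokes("ni3hao3 peng2you3men ")
```

Each completed word string would then be handed to the morphological and syntactical analysis; the sketch stops at segmentation.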
11 Claims
1. A method where a word string is processed by a morphology process comprising the steps of:
- removing one or more affixes from the word string to create a root, the removed affix being one of the affixes on an affix list;
- comparing the root to one or more of the words in a vocabulary to find a match, the vocabulary having a plurality of words, each with one or more Hanzi translations, the word in the vocabulary matching the root being the root match; and
- storing in the computer memory the Hanzi translation of the root match.
Dependent claims: 2
3. A computer system for processing Chinese language text comprising:
- an input apparatus for entering a phonetic Chinese language phrase, the phrase having one or more words, each word having one or more syllables, each syllable having one or more characters, the phrase being a string of the characters between a first and second phrase delimiter;
- an affix list having a plurality of entries being phonetic Chinese affixes;
- a vocabulary of Chinese words, the vocabulary being a list of a plurality of phonetic Chinese words with a Hanzi translation; and
- a morphology unit that removes one or more affixes from the phrase to create a root, the removed affix being one of the affixes on the affix list, the morphology unit comparing the root to one or more of the words in the vocabulary to find a match, and storing in a computer memory the Hanzi translation of the word in the vocabulary that matches the root.
Dependent claims: 4, 5, 6, 7, 8
9. A method of syntactically analyzing a Chinese phrase of phonetic syllables comprising the steps of:
- parsing the Chinese phrase into accented words with one or more syllables marked with a diacritic indicating a tone of the syllable and unaccented words having no syllables marked with a diacritic;
- matching the unaccented words with one or more of the entries, the entries being zero or more affixes, function words, and particles on an affix list, each entry having a Hanzi translation;
- using the respective Hanzi translation to translate the unaccented word into Hanzi.
Dependent claims: 10, 11
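The syntactic analysis of claim 9 can be illustrated with a short Python sketch. It rests on assumptions the claim does not state: words are separated by whitespace, a tone diacritic is written as a digit 1 to 4, and `AFFIX_LIST_ENTRIES` is a small hypothetical list of affixes, function words and particles. Accented words are left untranslated here because the claim translates only the unaccented ones.

```python
# Hypothetical entries: unaccented affixes, function words and particles,
# each with a Hanzi translation.
AFFIX_LIST_ENTRIES = {"de": "的", "le": "了", "ma": "吗", "men": "们"}
TONE_MARKS = set("1234")  # assumed ASCII coding of the tone diacritics


def translate_unaccented(phrase, entries=AFFIX_LIST_ENTRIES):
    """Return (word, hanzi) pairs for a whitespace-separated phrase.

    hanzi is None for accented words and for unaccented words that are
    not on the list.
    """
    result = []
    for word in phrase.split():
        accented = any(ch in TONE_MARKS for ch in word)
        if accented:
            result.append((word, None))               # not handled by this step
        else:
            result.append((word, entries.get(word)))  # matched against the list
    return result


pairs = translate_unaccented("ni3 hao3 ma")
```

A dictionary lookup is used for the matching step purely for brevity; the claim does not specify how the list is searched.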
Specification