Method and system of creating and using Chinese language data and user-corrected data
Abstract
Unique identifiers for each of a plurality of Chinese Pinyin syllables are generated and stored in an array of identifiers. A plurality of Hanzi character candidate lists is also generated, each list including Hanzi character candidates associated with a Pinyin syllable. Each identifier in the array has an array index, and each Hanzi character candidate in each list has a candidate index in the list. For each of a plurality of words having multiple Pinyin syllables, a data record including a key and a value is then generated. In a data record for a word, the key is an array index of the identifier in the array of identifiers and tone information for each of the multiple Pinyin syllables of the word, and the value is a candidate index, in the list of candidates associated with each of the Pinyin syllables, of the candidate that represents each of the Pinyin syllables.
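The data creation described in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: the function name `build_language_data`, the use of the syllable string itself as its identifier, the sorted ordering of the identifier array, and the tiny sample vocabulary are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Key = Tuple[Tuple[int, int], ...]   # ((array_index, tone), ...) per syllable
Value = Tuple[int, ...]             # (candidate_index, ...) per syllable


def build_language_data(
    candidates_by_syllable: Dict[str, List[str]],
    words: Sequence[Tuple[List[Tuple[str, int]], str]],
):
    """Build the identifier array, candidate lists, and key/value records."""
    # Assumption: the identifier of a syllable is its spelling, and the
    # array is sorted so that array indexes are stable and reproducible.
    identifiers = sorted(candidates_by_syllable)
    index_of = {syllable: i for i, syllable in enumerate(identifiers)}

    # One candidate list per identifier, stored in the same order as the array.
    candidate_lists = [list(candidates_by_syllable[s]) for s in identifiers]

    records: Dict[Key, Value] = {}
    for syllables, hanzi in words:
        # Key: array index plus tone for each syllable of the word.
        key = tuple((index_of[s], tone) for s, tone in syllables)
        # Value: position of each Hanzi character in its syllable's list.
        value = tuple(
            candidate_lists[index_of[s]].index(ch)
            for (s, _tone), ch in zip(syllables, hanzi)
        )
        records[key] = value
    return identifiers, candidate_lists, records


# Hypothetical sample data; a real table would cover every Pinyin syllable.
sample_candidates = {
    "ni": ["你", "尼", "泥"],
    "hao": ["号", "好", "豪"],
    "zhong": ["中", "种", "重"],
    "guo": ["过", "果", "国"],
}
sample_words = [
    ([("ni", 3), ("hao", 3)], "你好"),
    ([("zhong", 1), ("guo", 2)], "中国"),
]

identifiers, candidate_lists, records = build_language_data(
    sample_candidates, sample_words
)
```

In this sketch a record stores only small integers, so the Hanzi characters of a word are recovered by indexing into the candidate lists rather than being stored with each word.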
21 Claims
1. A language data structure for use in converting Chinese Pinyin syllables into Chinese Hanzi characters, the data structure comprising:
a plurality of Hanzi character candidate lists, each list comprising Hanzi character candidates associated with a Pinyin syllable, and each Hanzi character candidate in each list having an index in the list; and
a plurality of language data records, each language data record corresponding to a word having a plurality of Pinyin syllables and comprising a key and a value, wherein the key in each language data record comprises a sequence of indexes and tone information for the Pinyin syllables of the word to which the language data record corresponds, and wherein the value in each language data record comprises a sequence of indexes of Hanzi character candidates, in the lists of candidates respectively associated with the Pinyin syllables of the word, that represent the Pinyin syllables of the word.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A method of creating Chinese language data, comprising the steps of:
generating identifiers for each of a plurality of Chinese Pinyin syllables;
storing the generated identifiers in an array of identifiers, each identifier in the array of identifiers having an array index;
generating a plurality of Hanzi character candidate lists, each list comprising Hanzi character candidates associated with a Pinyin syllable, and each Hanzi character candidate in each list having a candidate index in the list; and
for each of a plurality of words having multiple Pinyin syllables, generating a data record comprising a key and a value, wherein the key comprises an array index of the identifier in the array of identifiers for each of the multiple Pinyin syllables and tone information for each of the multiple Pinyin syllables, and wherein the value comprises a candidate index, in the list of candidates associated with each of the multiple Pinyin syllables, of the candidate that represents each of the multiple Pinyin syllables.
Dependent claims: 13, 14, 15, 16, 17
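Claim 12 requires each key to carry an array index and tone information per syllable, but the text shown here does not say how the two are laid out. One possible encoding, offered only as an assumption, packs both into a single integer per syllable, with the tone in the low bits. The names `TONE_BITS`, `pack_entry`, `unpack_entry`, and `make_key` are hypothetical.

```python
from typing import Iterable, Tuple

# Assumption: tones 1-4, plus 0 for neutral or unspecified, fit in 3 bits.
TONE_BITS = 3
TONE_MASK = (1 << TONE_BITS) - 1


def pack_entry(array_index: int, tone: int) -> int:
    """Combine a syllable's array index and tone into one integer."""
    if array_index < 0:
        raise ValueError("array index must be non-negative")
    if not 0 <= tone <= TONE_MASK:
        raise ValueError("tone does not fit in the tone field")
    return (array_index << TONE_BITS) | tone


def unpack_entry(packed: int) -> Tuple[int, int]:
    """Recover (array_index, tone) from a packed entry."""
    return packed >> TONE_BITS, packed & TONE_MASK


def make_key(entries: Iterable[Tuple[int, int]]) -> Tuple[int, ...]:
    """Build a record key from (array_index, tone) pairs, one per syllable."""
    return tuple(pack_entry(index, tone) for index, tone in entries)


# Two syllables at array indexes 2 and 1, both third tone.
example_key = make_key([(2, 3), (1, 3)])
```

A packed layout like this keeps keys compact and directly comparable, at the cost of fixing the width of the tone field in advance.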
18. A system of using Chinese language data for converting Chinese Pinyin syllables into Chinese Hanzi characters, the language data including a plurality of Hanzi character candidate lists, each list comprising Hanzi character candidates associated with a Pinyin syllable, and each Hanzi character candidate in each list having an index in the list, and a plurality of language data records, each language data record corresponding to a word having a plurality of Pinyin syllables and comprising a key and a value, the key in each language data record comprising a sequence of indexes and tone information for the Pinyin syllables of the word to which the language data record corresponds and the value in each language data record comprising a sequence of indexes of Hanzi character candidates, in the lists of candidates respectively associated with the Pinyin syllables of the word, that represent the Pinyin syllables of the word, the system comprising:
a keyboard having keys representing a plurality of characters for composing Pinyin syllables;
an input queue configured to receive input Pinyin syllables from the keyboard;
a memory configured to store the plurality of Hanzi character candidate lists and the plurality of data records;
an input processor operatively coupled to the memory and the input queue and configured to segment the input Pinyin syllables into input words, to search the language data records for language data records respectively corresponding to each input word including the input Pinyin syllables, and to convert each input word into the Chinese Hanzi character candidates using the Hanzi character candidate indexes in the corresponding data record;
a display; and
a user interface coupled between the display and the input processor to display the input Pinyin syllables on the display and to replace the input Pinyin syllables with the Chinese Hanzi character candidates when the input Pinyin syllables are converted by the input processor.
Dependent claims: 19, 20, 21
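The input processor in claim 18 segments queued syllables into words, searches the records, and converts each word through the candidate indexes. The sketch below shows one way those three steps could fit together. It rests on several assumptions that the claim does not state: each queued syllable already carries its tone, segmentation is greedy longest-match, and a syllable with no matching multi-syllable record falls back to the first candidate in its list. The constants and the `convert` function are hypothetical.

```python
from collections import deque
from typing import Deque, Tuple

# Hypothetical language data: identifier array, candidate lists, records.
IDENTIFIERS = ["guo", "hao", "ni", "zhong"]
CANDIDATE_LISTS = [
    ["过", "果", "国"],   # guo
    ["号", "好", "豪"],   # hao
    ["你", "尼", "泥"],   # ni
    ["中", "种", "重"],   # zhong
]
RECORDS = {
    ((2, 3), (1, 3)): (0, 1),   # ni3 hao3    -> 你好
    ((3, 1), (0, 2)): (0, 2),   # zhong1 guo2 -> 中国
}
MAX_WORD_LEN = max(len(key) for key in RECORDS)
INDEX_OF = {syllable: i for i, syllable in enumerate(IDENTIFIERS)}


def convert(input_queue: Deque[Tuple[str, int]]) -> str:
    """Convert queued (syllable, tone) pairs into Hanzi characters."""
    entries = [(INDEX_OF[s], tone) for s, tone in input_queue]
    output = []
    pos = 0
    while pos < len(entries):
        longest = min(MAX_WORD_LEN, len(entries) - pos)
        # Segment: try the longest multi-syllable word first (assumption).
        for length in range(longest, 1, -1):
            key = tuple(entries[pos:pos + length])
            value = RECORDS.get(key)          # search the data records
            if value is not None:
                # Convert: map each candidate index back to its character.
                output.append("".join(
                    CANDIDATE_LISTS[idx][cand]
                    for (idx, _tone), cand in zip(key, value)
                ))
                pos += length
                break
        else:
            # No record matched: use the first candidate (assumption).
            idx, _tone = entries[pos]
            output.append(CANDIDATE_LISTS[idx][0])
            pos += 1
    return "".join(output)


queue = deque([("ni", 3), ("hao", 3), ("zhong", 1), ("guo", 2)])
converted = convert(queue)
```

A user interface along the lines of the claim would show the queued Pinyin first and then replace it with the string returned by `convert`.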
Specification