Method and apparatus for data processing and word processing in Chinese using a phonetic Chinese language
First Claim
1. A method of digitally encoding and storing the ideographic Chinese language in a computer, comprising the steps of:
a) selecting a set of Chinese ideograms to be encoded and stored, each of said Chinese ideograms being pronounced as a monosyllable having a predetermined consonant sound, vowel sound, and vowel tone;
b) selecting one and only one digital representation for each selected ideogram which is usable in said computer for outputting said ideograms;
c) selecting a set of letters for a phonetic Chinese alphabet (PCA) which can be formed into phonetic Chinese words (PCWs) each comprising at least one such PCA letter, which fully identify the sound and tone pronunciation of such selected ideograms and distinguish between all homotone ideograms having identical sound and tone pronunciation in said selected set of Chinese ideograms;
d) selecting one and only one digital representation for each PCA letter which is usable in said computer for outputting said PCA letter; and
e) storing a monosyllabic dictionary in a computer memory in said computer which associates the digital representations of said ideograms and PCA letters so as to identify a one-to-one relationship between the respective digital representations of each selected ideogram and its corresponding PCW including distinguishing between all homotone ideograms having identical sound and tone pronunciation in said selected set of Chinese ideograms.
Abstract
A method and apparatus for data processing and word processing in the Chinese language. A Phonetic Chinese Language (PCL) is defined in which any ideogram can be unambiguously represented by a Phonetic Chinese Word (PCW) no more than four characters in length, each word being composed of letters selected from a defined set of letters that can each be uniquely represented by a 7-bit digital code. Each PCW represents one and only one ideogram and provides the full sound and tone information required to pronounce it. Ambiguities caused by homonyms and homotones are avoided. PCL words are translated into their corresponding ideograms and vice versa by means of a stored monosyllabic dictionary. A method for unambiguously separating a polysyllabic PCL character string into separate words is also provided, which makes it unnecessary to employ a polysyllabic dictionary. Also disclosed is a method of forming an alphagrammic listing from PCL character strings by separating the strings into separate characters and listing them in alphabetical order, provided that homotones and identical ideograms are grouped together even if strict alphabetical ordering of the string would have separated them. The disclosure also includes a keyboard adapted for efficiently entering PCL characters for processing.
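The stored monosyllabic dictionary described in the abstract can be pictured as a two-way lookup table. The following Python sketch is illustrative only: the class name, the method names, and the sample PCW spellings (`ma3`, `ma3k`) are hypothetical placeholders, not letters or entries taken from the disclosed alphabet. It enforces the three properties the abstract states: at most four characters per PCW, 7-bit letter codes, and a one-to-one mapping that keeps homotones apart.

```python
class MonosyllabicDictionary:
    """Two-way table between phonetic Chinese words (PCWs) and ideograms."""

    MAX_PCW_LENGTH = 4  # the abstract limits a PCW to four characters

    def __init__(self):
        self._to_ideogram = {}
        self._to_pcw = {}

    def add(self, pcw, ideogram):
        """Register one PCW/ideogram pair, refusing anything not one-to-one."""
        if not 1 <= len(pcw) <= self.MAX_PCW_LENGTH:
            raise ValueError(f"PCW {pcw!r} must be 1 to 4 letters long")
        if any(ord(letter) > 0x7F for letter in pcw):
            raise ValueError(f"PCW {pcw!r} has a letter outside 7-bit range")
        if pcw in self._to_ideogram or ideogram in self._to_pcw:
            raise ValueError("each PCW must map to exactly one ideogram")
        self._to_ideogram[pcw] = ideogram
        self._to_pcw[ideogram] = pcw

    def ideogram_for(self, pcw):
        return self._to_ideogram[pcw]

    def pcw_for(self, ideogram):
        return self._to_pcw[ideogram]


# Hypothetical spellings: both ideograms are pronounced "ma" in the third
# tone, so an invented trailing modifier letter keeps the two PCWs distinct.
dictionary = MonosyllabicDictionary()
dictionary.add("ma3", "马")
dictionary.add("ma3k", "码")
```

Keeping two plain dictionaries in step is one simple way to make translation work in both directions; the disclosure itself does not prescribe a storage layout in this section.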
232 Citations
33 Claims
1. A method of digitally encoding and storing the ideographic Chinese language in a computer, comprising the steps of:
a) selecting a set of Chinese ideograms to be encoded and stored, each of said Chinese ideograms being pronounced as a monosyllable having a predetermined consonant sound, vowel sound, and vowel tone;
b) selecting one and only one digital representation for each selected ideogram which is usable in said computer for outputting said ideograms;
c) selecting a set of letters for a phonetic Chinese alphabet (PCA) which can be formed into phonetic Chinese words (PCWs) each comprising at least one such PCA letter, which fully identify the sound and tone pronunciation of such selected ideograms and distinguish between all homotone ideograms having identical sound and tone pronunciation in said selected set of Chinese ideograms;
d) selecting one and only one digital representation for each PCA letter which is usable in said computer for outputting said PCA letter; and
e) storing a monosyllabic dictionary in a computer memory in said computer which associates the digital representations of said ideograms and PCA letters so as to identify a one-to-one relationship between the respective digital representations of each selected ideogram and its corresponding PCW including distinguishing between all homotone ideograms having identical sound and tone pronunciation in said selected set of Chinese ideograms.
View Dependent Claims (2, 20, 21)
3. A method of digitally encoding and storing the ideographic Chinese language in a computer, comprising the steps of:
1) a) selecting a set of Chinese ideograms to be encoded and stored, each of said Chinese ideograms being pronounced as a monosyllable having a predetermined consonant sound, vowel sound, and vowel tone;
b) selecting one and only one digital representation for each selected ideogram which is usable in said computer for outputting said ideogram;
c) selecting a set of letters for a phonetic Chinese alphabet (PCA) which can be formed into phonetic Chinese words (PCWs) each comprising at least one such PCA letter, which fully identify the sound and tone pronunciation of such selected ideograms;
d) selecting one and only one digital representation for each PCA letter which is usable in said computer for outputting said PCA letter; and
e) storing a monosyllabic dictionary in a computer memory in said computer which associates the digital representations of said ideograms and PCA letters so as to identify a one-to-one relationship between the respective digital representations of each selected ideogram and its corresponding PCW;
2) wherein said PCA letters represent the following language elements:
a) a plurality of vowels;
b) a plurality of tones with which said vowels are pronounced; and
c) a plurality of consonants; and
3) wherein said vowels include
a) a plurality of voweltones, each of which represents a given vowel sound pronounced with a given tone, and
b) a plurality of semi-consonants, each of which represents a given vowel sound irrespective of tone.
View Dependent Claims (4, 5, 10, 11, 12)
6. A method of digitally encoding and storing the ideographic Chinese language in a computer, comprising the steps of:
1) a) selecting a set of Chinese ideograms to be encoded and stored, each of said Chinese ideograms being pronounced as a monosyllable having a predetermined consonant sound, vowel sound, and vowel tone;
b) selecting one and only one digital representation for each selected ideogram which is usable in said computer for outputting said ideogram;
c) selecting a set of letters for a phonetic Chinese alphabet (PCA) which can be formed into phonetic Chinese words (PCWs) each comprising at least one PCA letter, which fully identify the sound and tone pronunciation of such selected ideograms;
d) selecting one and only one digital representation for each PCA letter which is usable in said computer for outputting said PCA letter; and
e) storing a monosyllabic dictionary in a computer memory in said computer which associates the digital representations of said ideograms and PCA letters so as to identify a one-to-one relationship between the respective digital representations of each selected ideogram and its corresponding PCW;
2) wherein said PCA letters represent the following language elements:
a) a plurality of vowels;
b) a plurality of tones with which said vowels are pronounced; and
c) a plurality of consonants; and
3) wherein said consonants include
a) a plurality of short consonants, each of which represents a respective consonant sound;
b) a plurality of long consonants, each of which represents a respective consonant sound pronounced with a respective vowel sound; and
c) a silent zero consonant.
View Dependent Claims (13, 14, 15)
7. A method of digitally encoding and storing the ideographic Chinese language in a computer, comprising the steps of:
1) a) selecting a set of Chinese ideograms to be encoded and stored, each of said Chinese ideograms being pronounced as a monosyllable having a predetermined consonant sound, vowel sound, and vowel tone;
b) selecting one and only one digital representation for each selected ideogram which is usable in said computer for outputting said ideogram;
c) selecting a set of letters for a phonetic Chinese alphabet (PCA) which can be formed into phonetic Chinese words (PCWs) each comprising at least one PCA letter, which fully identify the sound and tone pronunciation of such selected ideograms;
d) selecting one and only one digital representation for each PCA letter which is usable in said computer for outputting said PCA letter; and
e) storing a monosyllabic dictionary in a computer memory in said computer which associates the digital representations of said ideograms and PCA letters so as to identify a one-to-one relationship between the respective digital representations of each selected ideogram and its corresponding PCW;
2) wherein said PCA letters represent the following language elements:
a) a plurality of vowels;
b) a plurality of tones with which said vowels are pronounced; and
c) a plurality of consonants;
3) wherein each such PCW has the form TS+Q, wherein
a) TS is a tone-syllable having one of the forms CV, CSV, SV, and V; C being a consonant, S being a semi-consonant, and V being a voweltone; and
b) Q is a generalized tone-syllable modifier which indicates meaning for distinguishing between homotones.
View Dependent Claims (8, 9, 16, 17, 18, 19)
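The TS+Q word form recited in claim 7 can be checked mechanically once each letter is assigned to a class. The sketch below is a minimal illustration under stated assumptions: the letter sets are invented placeholders (the claim does not list the actual PCA letters), the function name `pcw_form` is hypothetical, and Q is taken to be at most one trailing letter of any class.

```python
import re

# Hypothetical letter classes; the real PCA letters are not given in the claim.
LETTER_CLASS = {}
LETTER_CLASS.update({letter: "C" for letter in "bdmt"})   # consonants
LETTER_CLASS.update({letter: "S" for letter in "yw"})     # semi-consonants
LETTER_CLASS.update({letter: "V" for letter in "aAeE"})   # voweltones

# Tone-syllable (CV, CSV, SV or V) followed by an optional one-letter Q.
TS_PLUS_Q = re.compile(r"(CSV|CV|SV|V)([CSV]?)")


def pcw_form(word):
    """Return (tone-syllable form, has_modifier) or None if not TS+Q."""
    try:
        shape = "".join(LETTER_CLASS[letter] for letter in word)
    except KeyError:
        return None  # a letter outside the assumed alphabet
    match = TS_PLUS_Q.fullmatch(shape)
    if match is None:
        return None
    return match.group(1), bool(match.group(2))
```

For example, `pcw_form("mya")` reports the CSV form with no modifier, while `pcw_form("yat")` reports the SV form with a modifier.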
22. A text processing method which includes digitally encoding and storing the ideographic Chinese language in a computer, comprising the steps of:
a) selecting a set of Chinese ideograms to be encoded and stored, each of said Chinese ideograms being pronounced as a monosyllable having a predetermined consonant sound, vowel sound, and vowel tone;
b) selecting one and only one digital representation for each selected ideogram which is usable in said computer for outputting said ideogram;
c) selecting a set of letters for a phonetic Chinese alphabet (PCA) which can be formed into phonetic Chinese words (PCWs) each comprising at least one such PCA letter, which fully identify the sound and tone pronunciation of such selected ideograms and distinguish between all homotone ideograms having identical sound and tone pronunciation in said selected set of Chinese ideograms;
d) selecting one and only one digital representation for each PCA letter which is usable in said computer for outputting said PCA letter;
e) storing a monosyllabic dictionary in a computer memory in said computer which associates the digital representations of said ideograms and PCA letters so as to identify a one-to-one relationship between the respective digital representations of each selected ideogram and its corresponding PCW including distinguishing between all homotone ideograms having identical sound and tone pronunciation in said selected set of Chinese ideograms;
entering a continuous string of phonetic Chinese language characters into said computer memory, said string of characters including at least two groups of characters, each group of characters defining a phonetic Chinese word of variable character length; and
processing said continuous string in said computer memory so as to accurately determine the beginning and end of each phonetic Chinese word in said string.
View Dependent Claims (23)
24. A method of creating an alphagrammic listing of a set of word strings, which includes digitally encoding and storing the ideographic Chinese language in a computer, the method comprising the steps of:
a) selecting a set of Chinese ideograms to be encoded and stored, each of said Chinese ideograms being pronounced as a monosyllable having a predetermined consonant sound, vowel sound, and vowel tone;
b) selecting one and only one digital representation for each selected ideogram which is usable in said computer for outputting said ideogram;
c) selecting a set of letters for a phonetic Chinese alphabet (PCA) which can be formed into phonetic Chinese words (PCWs) each comprising at least one such PCA letter, which fully identify the sound and tone pronunciation of such selected ideograms and distinguish between all homotone ideograms having identical sound and tone pronunciation in said selected set of Chinese ideograms;
d) selecting one and only one digital representation for each PCA letter which is usable in said computer for outputting said PCA letter;
e) storing a monosyllabic dictionary in a computer memory in said computer which associates the digital representations of said ideograms and PCA letters so as to identify a one-to-one relationship between the respective digital representations of each selected ideogram and its corresponding PCW including distinguishing between all homotone ideograms having identical sound and tone pronunciation in said selected set of Chinese ideograms;
each word string including a plurality of phonetic Chinese words, each phonetic Chinese word (PCW) representing one and only one Chinese ideogram and providing the sound and tone information required to pronounce that ideogram, and distinguishing between all homotone ideograms having identical sound and tone pronunciation in said selected set of Chinese ideograms, said PCA letters having a predetermined alphabetical order, said method of creating an alphagrammic listing comprising the steps of:
1) storing said set of word strings in the computer memory; and
2) sorting said set of word strings in alphagrammic order, wherein
3) said word strings are listed in the alphabetical order of the characters in that word string;
4) said alphabetical order being overridden to the extent that:
(a) all strings whose corresponding first Chinese ideograms are identical are listed together for purposes of ordering said strings; and
(b) all words in said word strings pronounced with the same sound and tone are listed together for purposes of ordering said strings;
5) all strings listed together in said steps (a) and (b) being listed in alphabetical order with respect to one another.
View Dependent Claims (32)
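One way to read the alphagrammic ordering of claim 24 is as a word-by-word sort rather than a character-by-character one. The Python sketch below shows that reading under several assumptions: each word string is already split into words, each word is supplied as a (tone-syllable, modifier) pair, the spellings are invented placeholders, and Python's default string comparison stands in for the predetermined alphabetical order of the PCA letters. The function names are hypothetical.

```python
def alphagrammic_key(word_string):
    """Sort key for one word string given as a sequence of (TS, Q) pairs.

    Comparing the tone-syllable first keeps homotones together, and
    comparing the full word next keeps identical ideograms together.
    """
    return tuple((ts, ts + q) for ts, q in word_string)


def alphagrammic_listing(word_strings):
    return sorted(word_strings, key=alphagrammic_key)


def strict_listing(word_strings):
    """Plain alphabetical order of the joined characters, for comparison."""
    return sorted(word_strings,
                  key=lambda ws: "".join(ts + q for ts, q in ws))


# Hypothetical word strings; "mad" is the tone-syllable "ma" plus modifier "d".
MA_BA = (("ma", ""), ("ba", ""))
MAD_DA = (("ma", "d"), ("da", ""))
MA_TA = (("ma", ""), ("ta", ""))
ME_DA = (("me", ""), ("da", ""))
strings = [MAD_DA, MA_TA, ME_DA, MA_BA]
```

With these inputs, strict character ordering places `MAD_DA` between `MA_BA` and `MA_TA`, splitting two strings that share the same first ideogram, whereas `alphagrammic_listing(strings)` keeps `MA_BA` and `MA_TA` adjacent and lists `MAD_DA` after them.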
25. A method of processing character strings, comprising
a) entering a string of letters of a phonetic Chinese alphabet (PCA) in a computer memory; wherein
1) said PCA includes respective pluralities of voweltones (V), semi-consonants (S), and consonants (C), and including a zero consonant (Z);
2) said string of letters includes at least two separate phonetic Chinese words (PCWs), each said PCW having the form TS+Q, wherein TS is a tone-syllable having one of the forms CV, CSV, SV and V, and Q is a generalized meaning-indicating modifier having one of two forms, namely a PCA letter and the omission of any PCA letter; provided that Q cannot take the form of one voweltone (RV) which is employed to indicate the retroflex ideogram when it occurs at the end of a character string;
3) each of said PCWs represents one and only one Chinese ideogram and provides the sound and tone information required to pronounce that ideogram; and
4) each non-initial PCW that has the form V+Q is preceded in such string by the zero consonant, and each non-initial PCW that has the form SV+Q is preceded in such string by the zero consonant whenever such last-mentioned PCW follows a PCW having one of the forms CVC and CSVC; and
b) separating said string in said computer memory unambiguously into said separate phonetic Chinese words included therein.
View Dependent Claims (26, 27, 33)
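The zero-consonant rule in claim 25 is what lets a run of letters be cut into words without a polysyllabic dictionary. The following Python sketch illustrates the idea for a simplified case and is not the disclosed procedure: the letter sets and the zero-consonant symbol `0` are invented, Q is assumed to be a consonant when present (the claim allows any PCA letter), the retroflex voweltone (RV) is not handled, and the zero consonant is treated as a separator that is dropped from the output.

```python
import re

# Hypothetical alphabet; the claim does not list the actual PCA letters.
CONSONANTS = set("bdmt")
SEMI_CONSONANTS = set("yw")
VOWELTONES = set("aAeE")
ZERO_CONSONANT = "0"


def _shape(string):
    """Map each letter to its class: C, S, V or Z."""
    classes = []
    for letter in string:
        if letter in CONSONANTS:
            classes.append("C")
        elif letter in SEMI_CONSONANTS:
            classes.append("S")
        elif letter in VOWELTONES:
            classes.append("V")
        elif letter == ZERO_CONSONANT:
            classes.append("Z")
        else:
            raise ValueError(f"unknown letter {letter!r}")
    return "".join(classes)


# Optional Z, then a tone-syllable, then an optional consonant Q. A consonant
# is read as Q only when it is NOT followed by S or V; otherwise it must be
# the onset of the next word, because a V or SV word after Q needs Z.
_WORD = re.compile(r"(Z?)((?:CS|C|S)?V(?:C(?![SV]))?)")


def split_pcws(string):
    """Split a letter string into words under the assumptions above."""
    shape = _shape(string)
    words = []
    pos = 0
    while pos < len(shape):
        match = _WORD.match(shape, pos)
        if match is None:
            raise ValueError(f"cannot parse a word at position {pos}")
        has_zero = bool(match.group(1))
        start, end = match.span(2)
        onset = shape[start]
        if has_zero and onset == "C":
            raise ValueError("zero consonant may only precede V or SV words")
        if not has_zero and pos > 0 and onset == "V":
            raise ValueError("non-initial V word needs the zero consonant")
        words.append(string[start:end])
        pos = end
    return words
```

As a usage example, `split_pcws("mat0amyada")` yields `["mat", "a", "mya", "da"]`: the `t` is taken as a modifier because the zero consonant follows it, while the `m` before `y` starts the next word.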
28. A method of encoding and storing Chinese ideograms in a computer, comprising the steps of:
a) selecting a set of Chinese ideograms to be encoded and stored, each of said Chinese ideograms being pronounced as a monosyllable having a predetermined consonant sound, vowel sound, and vowel tone;
b) selecting a set of letters for a phonetic Chinese alphabet (PCA) which can be formed into phonetic Chinese words (PCWs) each comprising at least one such PCA letter, which fully identify the sound and tone pronunciation of such selected ideograms;
c) selecting one and only one 7-bit digital representation for each selected PCA letter and each selected ideogram which are usable in said computer for outputting said ideograms and said PCA letters;
d) selecting one and only one phonetic Chinese word (PCW) composed of PCA letters for uniquely identifying each selected ideogram; and
e) storing a monosyllabic dictionary in a computer memory in said computer which associates the digital representations of said ideograms and PCA letters so as to identify a one-to-one relationship between the respective digital representations of each selected ideogram and its corresponding PCW, including distinguishing between all homotone ideograms having identical sound and tone pronunciation in said selected set of Chinese ideograms.
View Dependent Claims (29, 30, 31)
Specification