Data compressing apparatus, data decompressing apparatus, data compressing method, data decompressing method, and program recording medium
Abstract
A data compressing/decompressing apparatus is suitable for compressing data that contains plural-byte characters, for instance Japanese-language text. The apparatus holds a homonym dictionary in which KANJI-character idioms, their readings, and homonym discrimination information are stored in correspondence with each other. It converts each KANJI-character idiom contained in declarative sentence data into phonetic data, then compresses this phonetic data and outputs the compressed result. In the phonetic data, each KANJI-character idiom is replaced by information consisting of character number discrimination information indicating the number of characters in the idiom's reading, the reading itself, and the homonym discrimination information for the idiom.
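The conversion described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the dictionary entries, the romanized ASCII readings standing in for 1-byte phonograms, the byte layout (a length byte with the high bit set, the reading, then a homonym number), and the use of `zlib` as the back-end compressor are all assumptions made for illustration. Text outside the dictionary is assumed to be plain ASCII.

```python
import zlib

# Hypothetical homonym dictionary: idiom -> (reading, homonym number).
# Romanized ASCII readings stand in for the 1-byte phonograms (an assumption).
HOMONYMS = {
    "記者": ("kisha", 0),
    "汽車": ("kisha", 1),
    "帰社": ("kisha", 2),
    "東京": ("toukyou", 0),
}
# Reverse lookup used when restoring: (reading, homonym number) -> idiom.
IDIOMS = {entry: idiom for idiom, entry in HOMONYMS.items()}


def to_phonetic(text):
    """Replace each dictionary idiom by: length byte, reading, homonym number."""
    out = bytearray()
    i = 0
    while i < len(text):
        for idiom, (reading, number) in HOMONYMS.items():
            if text.startswith(idiom, i):
                out.append(0x80 | len(reading))   # character number information
                out += reading.encode("ascii")    # the reading itself
                out.append(number)                # homonym discrimination
                i += len(idiom)
                break
        else:
            # Sketch limitation: anything outside the dictionary must be ASCII,
            # otherwise encode() raises UnicodeEncodeError.
            out += text[i].encode("ascii")
            i += 1
    return bytes(out)


def from_phonetic(data):
    """Invert to_phonetic by looking up (reading, homonym number) pairs."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b & 0x80:                              # start of a replaced idiom
            n = b & 0x7F
            reading = data[i + 1:i + 1 + n].decode("ascii")
            number = data[i + 1 + n]
            out.append(IDIOMS[(reading, number)])
            i += n + 2
        else:                                     # literal ASCII character
            out.append(chr(b))
            i += 1
    return "".join(out)


def compress(text):
    # zlib is a stand-in; the source does not name a specific compressor.
    return zlib.compress(to_phonetic(text))


def decompress(blob):
    return from_phonetic(zlib.decompress(blob))
```

The homonym number is what makes the round trip lossless: the three idioms read "kisha" share one reading and differ only in that final byte.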
36 Claims
1. A data compressing apparatus for compressing original text data containing character information in which a single character is expressed by plural-byte information, comprising:
- phonetic text data producing means for producing phonetic text data equal to the original text data such that the character information contained in the plural-byte coded original text data is replaced by phonetic character information, expressed as a phonogram in 1-byte code characters in which the number of kinds of characters is less than the number of kinds of characters in the original text data; and
- compressing means for compressing the phonetic text data produced by said phonetic text data producing means.
2. A data compressing apparatus for compressing original text data containing character information in which a single character is expressed by plural-byte information, comprising:
- phonetic character information storing means for storing phonetic character information expressed with phonograms in 1-byte code characters in which the number of kinds of characters is less than the number of kinds of characters in the original text data and equivalent to said character information;
- retrieving/reading means for retrieving the word information to be converted stored in said phonetic character information storing means from said original text data, and also for reading phonetic character information corresponding to the retrieved word information to be converted from said phonetic character information storing means;
- phonetic text data producing means for producing phonetic text data by replacing the word information to be converted within said plural-byte coded original text data retrieved by said retrieving/reading means with word replacement information to be converted, which contains the phonetic character information read by said retrieving/reading means;
- intermediate code table forming means for forming an intermediate code table used to adapt an intermediate code to an information element utilized in the phonetic text data produced by said phonetic text data producing means;
- intermediate code text data producing means for producing intermediate code text data by converting the respective information elements for constructing said phonetic text data into the corresponding intermediate codes by using the intermediate code table produced by said intermediate code producing means; and
- compressing means for compressing the intermediate code text data produced by said intermediate code text data producing means.
Dependent claims: 3, 4, 5, 6, 7, 8, 9
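The intermediate code steps recited in claim 2 can be sketched roughly as follows. This is only an illustration under stated assumptions: each byte of the phonetic text is treated as one information element, codes are assigned in order of first appearance, and `zlib` stands in for the unspecified compressing means. The function names are hypothetical.

```python
import zlib


def build_intermediate_table(phonetic):
    """List the distinct information elements (here: bytes) in first-seen order.

    The position of an element in the returned list is its intermediate code.
    """
    table = []
    seen = set()
    for element in phonetic:
        if element not in seen:
            seen.add(element)
            table.append(element)
    return table


def encode_with_table(phonetic):
    """Convert phonetic text to intermediate codes, then compress them."""
    table = build_intermediate_table(phonetic)
    code_of = {element: code for code, element in enumerate(table)}
    intermediate = bytes(code_of[element] for element in phonetic)
    # zlib is an assumed stand-in for the compressing means.
    return table, zlib.compress(intermediate)


def decode_with_table(table, blob):
    """Decompress, then map each intermediate code back to its element."""
    intermediate = zlib.decompress(blob)
    return bytes(table[code] for code in intermediate)
```

Because the phonetic text uses few distinct elements, the intermediate codes occupy a small dense range starting at zero. The table has to accompany the compressed data so that the decompressing side can reverse the mapping.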
10. A data compressing apparatus for compressing original text data containing character information in which a single character is expressed by plural-byte information, comprising:
- phonetic character information storing means for storing phonetic character information expressed with phonograms in 1-byte code characters in which the number of kinds of characters is less than the number of kinds of characters in the original text data and equivalent to said character information;
- retrieving/reading means for retrieving the word information to be converted stored in said phonetic character information storing means from said original text data, and also for reading phonetic character information corresponding to the retrieved word information to be converted from said phonetic character information storing means;
- phonetic text data producing means for producing phonetic text data by replacing the word information to be converted within said plural-byte coded original text data retrieved by said retrieving/reading means by word replacement information to be converted, which contains the phonetic character information read by said retrieving/reading means; and
- compressing means for compressing the phonetic text data produced by said phonetic text data producing means.
11. A data decompressing apparatus comprising:
- decompressing means for decompressing compression text data; and
- original text data producing means for producing original text data equal to data for constituting an original of said compression text data by converting phonetic character information, restored by said decompressing means, expressed with phonograms composed of 1-byte code characters in which the number of kinds of characters is less than the number of kinds of characters in decompressed text data, into character information corresponding thereto.
12. A data decompressing apparatus comprising:
- phonetic character information storing means for storing phonetic character information equal to information expressed with phonograms in 1-byte code characters in which the number of kinds of characters is less than the number of kinds of characters in decompressed text data and equivalent to word information to be converted being constructed of one or plural character information;
- decompressing means for decompressing compression text data to output intermediate code text data;
- phonetic text data producing means for producing phonetic text data by replacing each of intermediate codes contained in the intermediate code text data outputted by said decompressing means by information adapted to the intermediate code in an intermediate code table related to said compression text data; and
- original text data producing means for producing original text data equal to an original of said compression text data such that word replacement information to be converted is retrieved which is contained in the phonetic text data produced by said phonetic text data producing means, and said retrieved word replacement information to be converted is replaced by word information to be converted which is stored in said phonetic character information storing means in correspondence with said phonetic character information contained in said word replacement information to be converted.
Dependent claims: 13, 14, 15, 16, 17, 18
19. A data decompressing apparatus comprising:
- phonetic character information storing means for storing phonetic character information equal to information expressed with phonograms in 1-byte code characters in which the number of kinds of characters is less than the number of kinds of characters in decompressed text data and equivalent to word information to be converted being constructed of one or plural character information;
- decompressing means for decompressing compression text data to output phonetic text data; and
- original text data producing means for producing original text data equal to an original of said compression text data such that word replacement information to be converted is retrieved which is contained in the phonetic text data produced by said decompressing means, and said retrieved word replacement information to be converted is replaced by word information to be converted which is stored in said phonetic character information storing means in correspondence with said phonetic character information contained in said word replacement information to be converted.
20. A data compressing method for compressing original text data containing character information in which a single character is expressed by plural-byte information, comprising:
- a phonetic text data producing step for producing phonetic text data equal to such data that each of the character information contained in the plural-byte coded original text data to be compressed is replaced by phonetic character information expressed with phonograms in 1-byte code characters in which the number of kinds of characters is less than the number of kinds of characters in the original text data; and
- a compressing step for compressing the phonetic text data produced by said phonetic text data producing step.
21. A data compressing method for compressing original text data containing character information in which a single character is expressed by plural-byte information, comprising:
- a retrieving/reading step for retrieving word information to be converted from the plural-byte coded original text data to be compressed by using a dictionary into which phonetic character information is stored, and said phonetic character information being equal to information expressed with phonograms in 1-byte code characters in which the number of kinds of characters is less than the number of kinds of characters in the original text data and equivalent to word information to be converted which is constituted by one or a plurality of character information, and also for reading phonetic character information corresponding to the retrieved word information to be converted from said dictionary;
- a phonetic text data producing step for producing phonetic text data by replacing the word information to be converted within said plural-byte coded original text data retrieved by said retrieving/reading step by word replacement information to be converted, which contains the phonetic character information read by said retrieving/reading step;
- an intermediate code table forming step for forming an intermediate code table used to adapt an intermediate code to each of information elements utilized in the phonetic text data produced by said phonetic text data producing step;
- an intermediate code text data producing step for producing intermediate code text data by converting the respective information elements for constructing said phonetic text data into the corresponding intermediate codes by using the intermediate code table produced by said intermediate code producing step; and
- a compressing step for compressing the intermediate code text data produced by said intermediate code text data producing step.
Dependent claims: 22, 23, 24, 25
26. A data compressing method for compressing original text data containing character information in which a single character is expressed in plural-byte information, comprising:
- a retrieving/reading step for retrieving word information to be converted from the plural-byte coded original text data to be compressed by using a dictionary into which phonetic character information is stored, and said phonetic character information being equal to information expressed with phonograms in 1-byte code characters in which the number of kinds of characters is less than the number of kinds of characters in the original text data and equivalent to word information to be converted which is constituted by one or a plurality of character information; said word information being stored in said dictionary; and also for reading phonetic character information corresponding to the retrieved word information to be converted from said dictionary;
- a phonetic text data producing step for producing phonetic text data by replacing the word information to be converted within said plural-byte coded original text data retrieved by said retrieving/reading step by word replacement information to be converted, which contains the phonetic character information read by said retrieving/reading step; and
- a compressing step for compressing the phonetic text data produced by said phonetic text data producing step.
27. A data decompressing method comprising:
- a decompressing step for decompressing compression text data; and
- an original text data producing step for producing original text data equal to data for constituting an original of said compression text data by converting phonetic character information, restored by said decompressing step, expressed with phonograms in 1-byte code characters in which the number of kinds of characters is less than the number of kinds of characters in decompressed text data, into character information corresponding thereto.
28. A data decompressing method comprising:
- a decompressing step for decompressing compression text data to output intermediate code text data;
- a phonetic text data producing step for producing phonetic text data by replacing each of intermediate codes contained in the intermediate code text data outputted by said decompressing step by information adapted to an intermediate code table related to said compression text data; and
- an original text data producing step for producing original text data equal to an original of said compression text data such that word replacement information to be converted is retrieved which is contained in the phonetic text data produced by said phonetic text data producing step, while using a dictionary for storing phonetic character information equal to information expressed with phonograms in 1-byte code characters in which the number of kinds of characters is less than the number of kinds of characters in decompressed text data and equivalent to word information to be converted which is constituted by one or a plurality of information, and said retrieved word replacement information to be converted is replaced by word information to be converted which corresponds to said phonetic character information to be converted.
Dependent claims: 29, 30, 31, 32
33. A data decompressing method comprising:
- a decompressing step for decompressing compression text data to output phonetic text data; and
- an original text data producing step for producing original text data equal to an original of said compression text data such that word replacement information to be converted is retrieved which is contained in the phonetic text data produced by said decompressing step, while using a dictionary for storing phonetic character information equal to information expressed with phonograms in 1-byte code characters in which the number of kinds of characters is less than the number of kinds of characters in decompressed text data and equivalent to word information to be converted which is constituted by one or a plurality of character information, and said retrieved word replacement information to be converted is replaced by word information to be converted which corresponds to said phonetic character information to be converted.
34. A program recording medium for recording thereinto a program capable of causing a computer to function as:
- phonetic text data producing means for producing phonetic text data equal to such data as each of the character information contained in the plural-byte coded original text data to be compressed is replaced by phonetic character information expressed with phonograms in 1-byte code characters in which the number of kinds of characters is less than the number of kinds of characters in the original text data; and
- compressing means for compressing the phonetic text data produced by said phonetic text data producing means.
35. A program recording medium for recording thereinto a program capable of causing a computer to function as:
- decompressing means for decompressing compression text data; and
- original text data producing means for producing original text data equal to data for constituting an original of said compression text data by converting phonetic character information as expressed with phonograms in 1-byte code characters in which the number of kinds of characters is less than the number of kinds of characters in decompressed text data, as restored by said decompressing means, into character information corresponding thereto in which a single character is expressed by plural-byte code character.
36. A data compressing apparatus for compressing original document data containing character information in which a single character is expressed by plural-byte code character, comprising:
- data converting means for converting the original document data expressed in plural-byte code characters into document data expressed in 1-byte code characters which designate the pronunciation of each word in the original document data; and
- compressing means for compressing the document data expressed in 1-byte code characters.
Specification