Adaptive data compression method and apparatus
First Claim
1. A dynamic data compression method, comprising the steps(1) providing an encoding table having a plurality of sets of digital codes of varying lengths associated with characters of an alphabet, some of the digital codes being shorter than others, the digital codes being of the same length in a given set but of a different length between sets, said encoding table comprising an initial ordered array of the characters in the alphabet and their corresponding digital codes, with a set of shorter codes toward the beginning of the array and a set of longer codes toward the end of the array;
- (2) presenting for encoding an item of data represented by one of the characters of the alphabet;
(3) in response to the presented item of data, selecting a code from one of the sets in the encoding table which corresponds to the particular character of the alphabet represented by the presented item of data;
(4) providing the selected code as an output; and
(5) periodically adjusting the corresponding relationship of the codes, the characters of the alphabet, and the sizes of the sets of digital codes in the encoding table as a function of the frequency of occurrence of characters of the alphabet over a plurality of characters presented for encoding,whereby as the frequency of occurrence of characters over a plurality of characters presented for encoding changes, the more frequently occurring characters become associated with shorter codes in the encoding table.
6 Assignments
0 Petitions
Accused Products
Abstract
A method and apparatus for compressing data, particularly useful in a modem. The preferred method is implemented in a microprocessor within a modem, and dynamically adapts to changing data statistics. Parallel encoding and decoding tables are provided at the encoder and the decoder, and are updated for each character processed. Each table has a plurality of digital compression codes associated with characters of an alphabet. In response to an item of data presented for encoding, a compression code which corresponds to the character presented for encoding is selected using the encoding table. The selected compression code is provided as an output. Periodically, the association between the codes and the characters of the alphabet in the table is adjusted as a function of the frequency of occurrence of characters of the alphabet, over a plurality of characters. As the frequency of occurrence of characters presented for encoding changes, the more frequently occurring characters become associated with the shorter codes in the encoding table. By performing the same steps in the decoding table, the compression codes are used at the decoder to select a character of the alphabet, which is provided as a decoded output.
182 Citations
64 Claims
-
1. A dynamic data compression method, comprising the steps
(1) providing an encoding table having a plurality of sets of digital codes of varying lengths associated with characters of an alphabet, some of the digital codes being shorter than others, the digital codes being of the same length in a given set but of a different length between sets, said encoding table comprising an initial ordered array of the characters in the alphabet and their corresponding digital codes, with a set of shorter codes toward the beginning of the array and a set of longer codes toward the end of the array; -
(2) presenting for encoding an item of data represented by one of the characters of the alphabet; (3) in response to the presented item of data, selecting a code from one of the sets in the encoding table which corresponds to the particular character of the alphabet represented by the presented item of data; (4) providing the selected code as an output; and (5) periodically adjusting the corresponding relationship of the codes, the characters of the alphabet, and the sizes of the sets of digital codes in the encoding table as a function of the frequency of occurrence of characters of the alphabet over a plurality of characters presented for encoding, whereby as the frequency of occurrence of characters over a plurality of characters presented for encoding changes, the more frequently occurring characters become associated with shorter codes in the encoding table. - View Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17)
-
-
18. A data decompression method, comprising the steps of:
-
(1) providing a decoding table having a plurality of sets of digital codes associated with the characters of an alphabet, the digital codes being of the same length in a given set but of a different length between sets, some of the digital codes being shorter than others, said decoding table comprising an initial ordered array of the characters in the alphabet and their corresponding digital codes, with a set of shorter codes toward the beginning of the array and a set of longer codes toward the end of the array; (2) presenting for decoding an item of coded data represented by one of the digital codes; (3) in response to the presented item of coded data, selecting a character from one of the sets in the decoding table which corresponds to the particular digital code represented by the presented item of coded data; (4) providing the selected character as a decoded output; and (5) periodically adjusting the corresponding relationship of the codes, the characters of the alphabet, and the sizes of the sets of digital codes in the decoding table as a function of the frequency of occurrence of characters of the alphabet over a plurality of characters resultant from decoding, whereby as the frequency of occurrence of characters over a plurality of characters resultant from decoding changes, the more frequently occurring characters become associated with shorter codes in the decoding table.
-
-
19. A dynamic data compression apparatus, comprising:
-
memory means for storing an encoding table comprising a plurality of sets of digital codes associated with characters of an alphabet, some of the digital codes being shorter than others, the digital codes being of the same length in a given set but of a different length between sets, said encoding table comprising an initial ordered array of the characters in the alphabet and their corresponding digital codes, with a set of shorter codes toward a beginning of the array and a set of longer codes toward the end of the array; means for receiving for encoding an item of data represented by one of the characters of the alphabet; means responsive to the presented item of data for selecting a code from one of the sets in the encoding table in said memory means which corresponds to the particular character of the alphabet represented by the presented item of data; output means for providing the selected code as an output; and code/character relationship adjusting means for periodically adjusting the corresponding relationship of the codes, the characters of the alphabet, and the sizes of the sets of digital codes in the encoding table in said memory means as a function of the frequency of occurrence of characters of the alphabet over a plurality of characters presented for encoding, whereby as the frequency of occurrence of characters over a plurality of characters presented for encoding changes, the more frequently occurring characters become associated with shorter codes in the encoding table. - View Dependent Claims (20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37)
-
-
38. A data decompression apparatus, comprising:
-
memory means for storing a decoding table having a plurality of sets of digital codes associated with the characters of an alphabet, some of the digital codes being shorter than others, the digital codes being of the same length in a given set but of a different length between sets, said decoding table comprising an initial ordered array of the characters in the alphabet and their corresponding digital codes, with a set of shorter codes toward the beginning of the array and a set of longer codes toward the end of the array; means for receiving for decoding an item of coded data represented by one of the digital codes; means responsive to the presented item of coded data for selecting a character from one of the sets in the decoding table in said memory means which corresponds to the particular digital code represented by the presented item of coded data; means for providing the selected character as a decoded output; and means for periodically adjusting the corresponding relationship of the codes, the characters of the alphabet, and the sizes of the sets of digital codes in the decoding table in said memory means as a function of the frequency of occurrence of characters of the alphabet over a plurality of characters resultant from decoding, whereby as the frequency of occurrence of characters over a plurality of characters resultant from decoding changes, the more frequencyl occurring characters become associated with shorter codes in the decoding table.
-
-
39. A dynamically variable data compression method, particularly useful in a series/sequential data transmission application such as a modem, comprising the steps of:
-
(1) providing an alphabet of n characters to be employed in representing information, ranging from 0 to n; (2) arranging the n characters in an initial ordered array; (3) providing a set of z bit codes to be used for encoding said n characters, some of said codes having greater bit lengths than others, with shorter codes being associated with positions in the array toward the beginning and longer codes being associated with positions in the array toward the end; (4) providing a sequential input information stream represented by the characters of said alphabet to be encoded with said codes; (5) for each item x of said sequential input information stream taking the following steps; (i) providing a z bit code as an output chosen from said set of codes as a function of the relative position in the array of the character corresponding to the item x; (ii) incrementing a character count CC associated with the character of said alphabet which represents said particular item x of said sequential input information stream; (iii) comparing the character count CC with the character count associated with each character within a predetermined range toward the beginning of the array; (iv) when the character count CC is found upon comparison to be greater than or equal to the character count for a character toward the beginning of the array within said predetermined range, exchanging the relative positions in the array of the character corresponding to the item x and the character toward the beginning of the array within said predetermined range having the equal or lesser character count, whereby as the frequency of occurrence of characters in the input information stream changes, the more frequently appearing characters tend to migrate toward the beginning of the array and thereby become associated with shorter ones of said codes. - View Dependent Claims (40, 41, 42)
-
-
43. A dynamic data decompression method, particularly useful in a series/sequential data receiving application such as a modem, comprising the steps of:
-
(1) providing an alphabet of nd characters to be employed in representing information, ranging from 0 to nd ; (2) arranging the nd characters in an initial ordered array, each of the characters being associated with a z bit code used for decoding the characters, some of the codes having greater bit lengths than others, with shorter codes being associated with positions in the array toward the beginning and longer codes being associated with positions in the array toward the end; (3) receiving a sequential data stream encoded with the codes; (4) looking up a character in the alphabet corresponding to the code; (5) providing as an output a character of the alphabet corresponding to the code;
-
-
44. (6) maintaining the order of the array for the alphabet by taking the following steps for each item x of the sequential data stream:
-
(i) incrementing a character count CC associated with the character of the alphabet which represents said item x of the output character, (ii) comparing the character count CC with the character count associated with each character within a predetermined range toward the beginning the array; (iii) when the character count CC is found upon comparison to be greater than or equal to the character count for a character toward the begnning of the array within said predetermined range, exchanging the relative positions in the array of the character corresponding to the item x and the character toward the beginning of the array within said predetermined range having the equal or lesser character count, whereby as the frequency of occurrence of characters being output changes, the more frequently appearing corresponding characters in the alphabet tend to migrate toward the beginning of the decoding array. - View Dependent Claims (46, 47, 48, 49, 50, 51, 52)
-
-
45. A dynamically variable data compression method, particularly useful in a series/sequential data transmission application such as a modem, comprising the steps of:
-
(1) providing an alphabet of n characters to be employed in representing information, ranging from 0 to n; (2) arranging the n characters in an initial ordered array; (3) providing a first set of p-bit codes to be used for encoding a first predetermined number a of said n characters, a second set of q-bit codes to be used for encoding a second predetermined number b of said n characters, and a third set of r-bit codes to be used for encoding a third predetermined number of c of said n characters, where a+b+c=n; (4) providing a sequential input information stream represented by the characters of said alphabet to be encoded with said p-, said q-, or said r-bit codes; (5) for each item x of said sequential input information stream, taking the following steps; (i) providing a z bit code as an output chosen from said first or said second or said third set of codes as a function of the relative position in the array of the character corresponding to the item x, where z is p or q or r; (ii) incrementing a character count CC associated with the character of said alphabet which represents said particular item x of said sequential input information stream; (iii) comparing the character count CC with the character count associated with each character within a predetermined range lower in the array, up to a total of d places lower in the array, where d is a predetermined number less than n; (iv) when the character count CC is found upon comparison to be greater than or equal to the character count for a character lower in the array within said predetermined range, exchanging the relative positions in the array of the character corresponding to the item x and the character lower in the array within said predetermined range having the equal or lesser character count; (6) after every k characters of said sequential input information stream, changing the "encoding level" represented by the relative values of the numbers a, b, and c by taking the following steps; (i) summing the character counts for consecutive groups of m characters to obtain a plurality of n/m group counts; (ii) comparing the group count for group (n/m)-i-1 to the character count for character j, where i is an integer index of group counts beginning at zero and j is an integer index into the array beginning at zero at the top of the array; (iii) if the group count for group (n/m)--1 is less than the character count for character j, then, incrementing i and j and going back to the step 6(ii); (iv) if the group count for group (n/m)-i-1 is greater than or equal to the character count for character j, then establishing an encoding level of a=j+1 by assigning j+1 character codes having p bits, c=(n/m)×
(i+1) character codes having r bits, and b=n-(j+1)-((n/m)×
(i+1)) character codes having q bits.
-
-
53. A dynamically variable data decompression method, particularly useful in a series/sequential data communications device such as a modem, comprising the steps of:
-
(1) providing an alphabet of nd characters; (2) arranging the nd characters of the alphabet in an initial ordered array; (3) providing a first set of pd -bit codes to be used for decoding a first predetermined number ad of said nd characters, a second set of qd -bit codes to be used for decoding a second predetermined number bd of said nd characters, and a third set of rd -bit codes to be used for decoding a third predetermined number cd of said nd characters, where pd <
qd <
rd, and where ad +bd +cd =nd ;(4) receiving a sequential data stream over a communications link encoded as a z bit code of p-, q-, and r-bit codes, where p, q, and r correspond to pd, qd, and rd ; (5) for each item x of said sequential data stream, taking the following steps; (i) finding a match between the z bit code representing the item x and one of said codes associated with said alphabet of nd characters; (ii) providing as a decompressed output a character of said alphabet of nd characters which corresponds to the code representing the item x; (iii) incrementing a character count CCd associated with the character of said alphabet which represents the output character corresponding to the particular item x of the sequential data stream; (iv) comparing the character count CCd with the character count associated with each character within a predetermined range toward the beginning of the array of said alphabet, up to a total of dd places toward the beginning of the array; and (v) when the character count CCd is found upon comparison to be greater than or equal to the character count for a character toward the beginning of the array of said alphabet within said predetermined range, exchanging the relative positions in the array of the character corresponding to the item x and the character toward the beginning of the array within said predetermined range having the equal or lesser character count; and (6) after every kd characters of said sequential data stream, changing the "decoding level" represented by the relative values of the numbers ad, bd, and cd by taking the following steps; (i) summing the character counts for consecutive groups of md characters to obtain a plurality of nd /md group counts; (ii) comparing the group count for group (nd /md)-i-1 to the character count for character j, where i is an integer index of group counts beginning at zero and j is an integer index into the array beginning at zero at the beginning of the array; (iii) if the group count for group (nd /md)-i-1 is less than the character count for character j, then, incrementing i and j and going back to the step (6)(ii); and (iv) if the group count for group (nd /md)-i-1 is greater than or equal to the character count for character j, then establishing a decoding level of ad =j+1 by assigning j+1 character codes having pd bits, cd =(nd /md)×
(i+1) character codes having rd bits, and bd =nd -(j+1)-((nd /md)×
(i+1)) character codes having qd bits.
-
-
54. A dynamically adjusting data compression method, comprising the steps of:
-
(1) providing an encoding table having a plurality of sets of digital codes associated with characters of an alphabet, the digital codes being of the same length in a given set but of a different length between sets, the table being arranged in an initial ordered array of increasing code lengths; (2) presenting for encoding an item of data represented by one of the characters of the alphabet; (3) in response to the presented item of data, selecting a code from one of the sets in the the encoding table which corresponds to the particular character of the alphabet represented by the presented item of data; (4) providing the selected code as an output; (5) in further response to the presented item of data, maintaining a statistical record of the probability of occurrence of the character representing the presented item of data; (6) in further response to the presented item of data, exchanging the position in the array of the character representing the presented item of data with a character having a lower probability of occurrence such that a character having a higher probability of occurrence will move to a position in the array more likely to be associated with a shorter code; and (7) periodically adjusting the sizes of the sets of digital codes a function of the probability of occurrence of the characters associated with the sets.
-
-
55. A dynamically adjusting data decompression method, comprising the steps of:
-
(1) providing a decoding table having a plurality of sets of digital codes associated with characters of the alphabet, the digital codes being of the same length in a given set but of a different length between sets, the table being arranged in an initial ordered array of increasing code lengths; (2) presenting for decoding an item of coded data represented by one of the digital codes; (3) in response to the presented item of coded data, selecting a character from one of the sets in the decoding table which corresponds to the particular digital code represented by the item of coded data; (4) providing the selected character as a decoded output; (5) in further response to the presented item of coded data, maintaining a statistical record of the probability of occurrence of the character resultant from decoding; (6) in further response to the presented item of coded data, exchanging the position in the decoding array of the decoded character with a character having a lower probability of occurrence such that a character having a higher probability of occurrence will move to a position in the array more likely to be associated with a shorter code; and (7) periodically adjusting the sizes of the sets as a function of the probability of occurrence of the characters associated with the sets. - View Dependent Claims (57, 59, 61)
-
-
56. A dynamic data compression method, comprising the steps of:
-
(1) providing an encoding table having a plurality of sets of digital codes associated with characters of an alphabet, the digital codes being of different lengths between sets; (2) presenting for encoding an item of data represented by one of the characters of the alphabet; (3) in response to the presented item of data, selecting a code from one of the sets in the encoding table which corresponds to the particular character of the alphabet represented by the presented item of data; (4) providing the selected code as an output; and (5) periodically adjusting the sizes of the sets of digital codes in the encoding table as a function of the frequency of occurrence of characters of the alphabet over a plurality of characters presented for encoding, whereby the numbers of characters represented by different digital codes varies as a function of the frequency of occurrence of characters in the data presented for encoding. - View Dependent Claims (58, 60, 62, 64)
-
-
63. A dynamic data compression method, comprising the steps of:
-
(1) providing an encoding table having a plurality of digital codes of varying lengths associated with characters of an alphabet, said encoding table comprising an initial ordered array of the characters in the alphabet and their corresponding digital codes, with shorter codes toward the beginning of the array and longer codes toward the end of the array; (2) presenting for encoding an item of data represented by one of the characters of the alphabet; (3) in resoponse to the presented item of data, selecting a code from one of the sets in the encoding table which corresponds to the particular character of the alphabet represented by the presented item of data; (4) providing the selected code as an output; (5) in further response to the presented item of data, incrementing a character count corresponding to the frequency of occurrence of the particular character of the alphabet represented by the presented item of data; (6) in further response to the presented item of data, after incrementing the particular character'"'"'s character count, comparing the particular character'"'"'s character count to a character count for a preselected character in the array within a predetermined number of positions toward the beginning of the array; and (7) if the particular character'"'"'s character count is greater than the character count for the preselected character, exchanging the relative positions in the array of the character just encoded with the preselected character so as to associate the character having the larger character count with a position in the array more likely to be associated with a shorter code, whereby as the frequency of occurrence of characters over a plurality of characters presented for encoding changes, the more frequently occurring characters become associated with shorter codes in the encoding table.
-
Specification