Textual data storage system and method

US 6,898,605 B2
Filed: 09/12/2001
Issued: 05/24/2005
Est. Priority Date: 09/11/2000
Status: Expired due to Fees

Abstract

A hand-held electronic device for use in accessing and displaying data that includes a series of words. The hand-held electronic device includes a display, processor, and memory. Stored in the memory are tokenized data, and word and phrase dictionaries. The tokenized data comprises word and phrase tokens. Each word token represents a unique word in the data. Each phrase token represents a unique sequence of the word tokens and is associated to the unique sequence in response to locating repeated unique sequences in the tokenized data. The word dictionary comprises the word tokens and their corresponding unique words, and the phrase dictionary comprises the phrase tokens and their corresponding word tokens. A data access routine stored in the memory and executable by the processor is operable to display a portion of the data by decompressing the tokenized data using the word and phrase dictionaries.

83 Citations

42 Claims

1. A hand-held electronic device for use in accessing and displaying data that is stored in compressed form, wherein when uncompressed, the data includes a series of words, and wherein each of the words is sized according to a multiple of a common unit of memory storage, the hand-held electronic device comprising:
- a display for displaying information;
  
  a processor;
  
  a memory;
  
  tokenized data stored in the memory, wherein the tokenized data comprises word and phrase tokens, wherein each of the word tokens represents a unique word in the data, wherein each of the word tokens is sized according to the common unit of memory storage regardless of the size of the unique word, wherein each of the phrase tokens represents a unique sequence of the word tokens in the tokenized data, wherein the phrase tokens are associated to the unique sequence in response to locating at least one repeated unique sequence of word tokens in the tokenized data, and wherein each of the phrase tokens is sized according to a given multiple of the common unit of memory storage;
  
  a word dictionary stored in the memory, wherein the word dictionary comprises the word tokens and their corresponding unique words; and
  
  a phrase dictionary stored in the memory, wherein the phrase dictionary comprises the phrase tokens and their corresponding word tokens;
  
  wherein a data access routine stored in the memory and executable by the processor is operable to receive an input, and responsive to the input, display a portion of the data by decompressing the tokenized data using the word and phrase dictionaries.
- - 2. The hand-held electronic device of claim 1, wherein the memory comprises at least one removable data module.
  - 3. The hand-held electronic device of claim 1, wherein each sequence of word tokens occurs at least twice within the data.
  - 4. The hand-held electronic device of claim 1, wherein each phrase comprises at least 3 word tokens contained within the tokenized data.
  - 5. The hand-held electronic device of claim 1, wherein the portion of the data is displayed in uncompressed form.
  - 6. The hand-held electronic device of claim 1, wherein the data comprises automotive repair and service information.
  - 7. The hand-held electronic device of claim 1, wherein the display is a touchscreen display capable of accepting the input.
  - 8. The hand-held electronic device of claim 1, further comprising a keypad for accepting the input.
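
The data access routine of claim 1 can be pictured as a two-stage lookup: phrase tokens expand to word tokens through the phrase dictionary, and word tokens expand to words through the word dictionary. The following is a minimal sketch under stated assumptions, not the patented implementation. The `PHRASE_BASE` threshold used to tell the two token kinds apart, the function name `decompress`, and the sample dictionaries are all hypothetical; the claims do not say how word and phrase tokens are distinguished.

```python
# Assumption: token values below PHRASE_BASE are word tokens and values at
# or above it are phrase tokens. The claims do not specify this split.
PHRASE_BASE = 0x8000


def decompress(tokens, word_dict, phrase_dict):
    """Expand phrase tokens into word tokens, then word tokens into words."""
    words = []
    for tok in tokens:
        if tok >= PHRASE_BASE:
            # A phrase token stands for a stored sequence of word tokens.
            words.extend(word_dict[w] for w in phrase_dict[tok])
        else:
            words.append(word_dict[tok])
    return " ".join(words)


# Hypothetical sample data in the spirit of claim 6 (automotive service text).
word_dict = {0: "CHECK", 1: "THE", 2: "BRAKE", 3: "FLUID", 4: "LEVEL", 5: "REPLACE"}
phrase_dict = {PHRASE_BASE: (1, 2, 3)}  # stands for "THE BRAKE FLUID"
tokens = [0, PHRASE_BASE, 4, 5, PHRASE_BASE]

text = decompress(tokens, word_dict, phrase_dict)
print(text)
```

Only the portion of the token stream being displayed needs to pass through `decompress`, which matches the claim's wording that a portion of the data is decompressed in response to an input.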

9. A machine-readable storage medium containing a data structure for housing data in a compressed form, wherein when uncompressed, the data includes a series of words, and wherein each of the words is sized according to a multiple of a common unit of memory storage, the data structure comprising:
- tokenized data stored in the memory, wherein the tokenized data comprises word and phrase tokens, wherein each of the word tokens represents a unique word in the data, wherein each of the word tokens is sized according to the common unit of memory storage regardless of the size of the unique word, wherein each of the phrase tokens represents a unique sequence of the word tokens in the tokenized data, wherein the phrase tokens are associated to the unique sequence in response to locating at least one repeated unique sequence of word tokens in the tokenized data, and wherein each of the phrase tokens is sized according to a given multiple of the common unit of memory storage;
  
  a word dictionary, wherein the word dictionary comprises the word tokens and their corresponding unique words; and
  
  a phrase dictionary, wherein the phrase dictionary comprises the phrase tokens and their corresponding word tokens.
- - 10. The machine-readable storage medium of claim 9, wherein each phrase occurs at least twice within the data.
  - 11. The machine-readable storage medium of claim 9, wherein each phrase comprises at least 3 word tokens.
  - 12. The machine-readable storage medium of claim 9, wherein the data comprises automotive repair and service information.
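
One way to picture the data structure of claim 9 is a token stream in which every token occupies the same fixed-width unit, alongside the two dictionaries. The sketch below assumes a two-byte unit (the size named in claim 24 for the method claims) and uses Python's standard `struct` module for the fixed-width encoding. The names `pack_tokens` and `unpack_tokens`, the little-endian byte order, and the `0x8000` phrase-token value are illustrative assumptions, not details taken from the patent.

```python
import struct

# Assumption: the common unit of memory storage is one 16-bit value,
# stored little-endian.
UNIT_SIZE = struct.calcsize("<H")  # 2 bytes


def pack_tokens(tokens):
    """Serialize a token stream so each token occupies exactly one unit."""
    return struct.pack(f"<{len(tokens)}H", *tokens)


def unpack_tokens(blob):
    """Recover the token stream from its fixed-width serialization."""
    count = len(blob) // UNIT_SIZE
    return list(struct.unpack(f"<{count}H", blob))


# Hypothetical contents of the three parts of the data structure.
word_dictionary = ["ENGINE", "OIL", "FILTER"]  # word token = list index
phrase_dictionary = {0x8000: (0, 1, 2)}        # phrase token -> word tokens
tokenized = [0x8000, 1, 0x8000]

blob = pack_tokens(tokenized)
print(len(blob), unpack_tokens(blob))
```

Because every token has the same width regardless of the length of the word or phrase it stands for, the n-th token can be located by offset arithmetic alone.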

13. A system for accessing and displaying data that is stored in compressed form, wherein when uncompressed, the data includes a series of words, and wherein each of the words is sized according to a multiple of a common unit of memory storage, the system comprising:
- a display for displaying information;
  
  a processor;
  
  a memory;
  
  tokenized data stored in the memory, wherein the tokenized data comprises word and phrase tokens, wherein each of the word tokens represents a unique word in the data, wherein each of the word tokens is sized according to the common unit of memory storage regardless of the size of the unique word, wherein each of the phrase tokens represents a unique sequence of the word tokens in the tokenized data, wherein the phrase tokens are associated to the unique sequence in response to locating at least one repeated unique sequence of word tokens in the tokenized data, and wherein each of the phrase tokens is sized according to a given multiple of the common unit of memory storage;
  
  a word dictionary stored in the memory, wherein the word dictionary comprises the word tokens and their corresponding unique words; and
  
  a phrase dictionary stored in the memory, wherein the phrase dictionary comprises the phrase tokens and their corresponding word tokens;
  
  wherein a data access routine stored in the memory and executable by the processor is operable to receive an input, and responsive to the input, display a portion of the data by decompressing the tokenized data using the word and phrase dictionaries.
- - 14. The system of claim 13, wherein the memory comprises at least one removable data module.
  - 15. The system of claim 13, wherein each sequence of word tokens is repeated at least twice within the tokenized data.
  - 16. The system of claim 13, wherein each sequence of word tokens comprises at least 3 word tokens.
  - 17. The system of claim 13, wherein the portion of the data is displayed in uncompressed form.
  - 18. The system of claim 13, wherein the data comprises automotive technical information.
  - 19. The system of claim 13, further comprising a keypad for accepting the input.

20. A method for storing in memory data in compressed form, wherein the data includes a series of words, and wherein each word of the series of words is sized according to a multiple of a common unit of memory storage, the method comprising:
- (a) associating a word token to each unique word of the data, wherein each word token is sized according to the common unit of memory storage regardless of the size of the unique word;
  
  (b) storing in the memory a word dictionary, wherein the word dictionary comprises each unique word and its associated word token;
  
  (c) converting each of the series of words in the data into a series of word tokens so as to produce tokenized data, wherein each of the series of word tokens corresponds to one of the word tokens in the word dictionary;
  
  (d) associating a phrase token to each repeated phrase in the tokenized data;
  
  wherein each of the repeated phrases comprises a sequence of the word tokens in the tokenized data, and wherein each phrase token is sized according to a given multiple of the common unit of memory storage;
  
  (e) storing in the memory a phrase dictionary, wherein the phrase dictionary comprises each repeated phrase and its associated phrase token;
  
  (f) converting each repeated phrase of the tokenized data into its associated phrase token; and
  
  (g) storing in memory the tokenized data, whereby the tokenized data comprises less common units of memory storage than the series of words when (i) at least one of the words is sized larger than its associated word token, and (ii) when the tokenized data comprises at least one repeated phrase.
- - 21. The method of claim 20, wherein the memory comprises a removable memory.
  - 22. The method of claim 20, wherein the data is textual information.
  - 23. The method of claim 22, wherein the textual information comprises automotive repair and servicing information.
  - 24. The method of claim 20, wherein the common unit of memory storage comprises two (2) bytes.
  - 25. The method of claim 20, wherein the textual information comprises a plurality of words formatted in accordance with the American Standard Code for Information Interchange (ASCII), wherein each character of the ASCII formatted words is one (1) byte, wherein the common unit of memory storage comprises two (2) bytes, and wherein when any word is greater than three characters, then the tokenized data comprises less common units of memory storage than the series of words.
  - 26. The method of claim 25, wherein the plurality of words are formatted in accordance with only one case of the American Standard Code for Information Interchange.
  - 27. The method of claim 20, further comprising searching the data for unique words, wherein when a unique word is found, carrying out the steps of associating the word token to the unique word, and storing in the word dictionary the unique word and the associated word token.
  - 28. The method of claim 20, further comprising searching the tokenized data for the repeated phrases, wherein when a repeated phrase is found in the tokenized data, carrying out the steps of associating the phrase token to the repeated phrase, and storing in the phrase dictionary the repeated phrase and the associated word tokens.
  - 29. The method of claim 20, wherein the memory has a given number of available common units of memory storage, wherein the data is larger than the given number of available common units of memory storage, and wherein after carrying out steps (a)-(g) the tokenized data, the word dictionary and phrase dictionary fit within the given number of available common units of memory storage.
  - 30. The method of claim 20, wherein the given multiple of the common unit of memory storage for the phrase token comprises a number selected from the group of numbers consisting of a fractional number, a non-fractional number, and a combination of fractional and non-fractional numbers.
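
Steps (a) through (g) of claim 20 can be sketched as a two-pass procedure: first map words to word tokens, then replace repeated runs of word tokens with phrase tokens. The code below is an illustration under stated assumptions rather than the patented method. It fixes the phrase length at three word tokens, splits words on whitespace, reserves values from `0x8000` upward for phrase tokens, and replaces phrases greedily from left to right; none of these choices comes from the claims, and the function name `compress` is hypothetical.

```python
from collections import Counter

PHRASE_BASE = 0x8000  # assumed split between word and phrase token values


def compress(text, phrase_len=3):
    words = text.split()

    # (a)-(b): give each unique word a token and record it in the word dictionary.
    word_dict = {}  # word -> word token
    for w in words:
        word_dict.setdefault(w, len(word_dict))

    # (c): convert the series of words into a series of word tokens.
    tokens = [word_dict[w] for w in words]

    # (d)-(e): give each run of phrase_len word tokens that occurs at least
    # twice a phrase token. Overlapping occurrences are counted too, so an
    # entry may end up used fewer times than its count suggests.
    counts = Counter(tuple(tokens[i:i + phrase_len])
                     for i in range(len(tokens) - phrase_len + 1))
    phrase_dict = {}  # word-token sequence -> phrase token
    for seq, n in counts.items():
        if n >= 2:
            phrase_dict[seq] = PHRASE_BASE + len(phrase_dict)

    # (f): replace each repeated phrase with its phrase token, left to right.
    out, i = [], 0
    while i < len(tokens):
        seq = tuple(tokens[i:i + phrase_len])
        if seq in phrase_dict:
            out.append(phrase_dict[seq])
            i += phrase_len
        else:
            out.append(tokens[i])
            i += 1

    # (g): the caller stores `out` together with both dictionaries.
    return out, word_dict, phrase_dict


text = "CHECK THE BRAKE FLUID LEVEL THEN REPLACE THE BRAKE FLUID"
out, word_dict, phrase_dict = compress(text)
print(out)
print(len(text), "ASCII bytes vs", 2 * len(out), "bytes of two-byte tokens")
```

The final line compares only the token stream with the raw text. The two dictionaries also occupy memory, which is why claim 29 states its size condition over the tokenized data and both dictionaries together.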

31. A method for storing in memory data in compressed form, wherein the data includes a series of words, and wherein each word of the series of words is sized according to a multiple of a common unit of memory storage, the method comprising:
- (a) associating a word token to each unique word of the data, wherein each word token is sized according to the common unit of memory storage regardless of the size of the unique word;
  
  (b) storing in the memory a word dictionary, wherein the word dictionary comprises each unique word and its associated word token;
  
  (c) converting each of the series of words in the data into a series of word tokens so as to produce tokenized data, wherein each of the series of word tokens corresponds to one of the word tokens in the word dictionary;
  
  (d) determining a compression-efficient-phrase length for repeated phrases in the tokenized data, wherein the compression-efficient-phrase length allows for efficient compression of the tokenized data, and wherein each of the repeated phrases comprises a sequence of the word tokens in the tokenized data;
  
  (e) associating a phrase token to each repeated phrase having the compression-efficient-phrase length;
  
  wherein each phrase token is sized according to a given multiple of the common unit of memory storage;
  
  (f) storing in the memory a phrase dictionary, wherein the phrase dictionary comprises (i) each repeated phrase having the compression-efficient-phrase length and (ii) the phrase token associated with each repeated phrase having the compression-efficient-phrase length;
  
  (g) converting each repeated phrase of the tokenized data having the compression-efficient-phrase length into its associated phrase token; and
  
  (h) storing in memory the tokenized data, whereby the tokenized data comprises less common units of memory storage than the series of words when (i) at least one of the words is sized larger than its associated word token, and (ii) when the tokenized data comprises at least one repeated phrase.
- - 32. The method of claim 31, wherein the data is textual information.
  - 33. The method of claim 32, wherein the textual information comprises automotive repair and servicing information.
  - 34. The method of claim 31, further comprising searching the data for unique words, wherein when a unique word is found, carrying out the steps of associating the word token to the unique word, and storing in the word dictionary the unique word and its associated word token.
  - 35. The method of claim 31, further comprising searching the tokenized data for the repeated phrases having the compression-efficient-phrase length, wherein when a repeated phrase having the compression-efficient-phrase length is found in the tokenized data, carrying out the steps of associating the phrase token to the repeated phrase, and storing in the phrase dictionary the repeated phrase and its associated word tokens.
  - 36. The method of claim 31, wherein the memory has a given number of available common units of memory storage, wherein the data is larger than the given number of available common units of memory storage, and wherein after carrying out steps (a)-(h) the tokenized data, the word dictionary and phrase dictionary fit within the given number of available common units of memory storage.
  - 37. The method of claim 31, wherein the step of determining a compression-efficient-phrase length comprises searching the tokenized data for repeated phrases having a phrase length that provides maximum compression of the tokenized data.
  - 38. The method of claim 31, wherein the step of determining a compression-efficient-phrase length comprises:
    - determining an acceptable access time for decompressing the tokenized data; and
      
      determining a phrase length that provides a maximum compression of the tokenized data for the desired access time.
  - 39. The method of claim 38, wherein the acceptable access time is based on memory-access and display times of processing architecture that carries out decompression of the tokenized data.
  - 40. The method of claim 38, further comprising searching the tokenized data for the repeated phrases having the compression-efficient-phrase length, wherein when a repeated phrase having the compression-efficient-phrase length is found in the tokenized data, carrying out the steps of associating the phrase token to the repeated phrase, and storing in the phrase dictionary the repeated phrase and the associated word tokens.
  - 41. The method of claim 31, wherein the step of determining a compression-efficient-phrase length comprises:
    - searching the tokenized data for repeated phrases;
      
      determining phrase lengths for each of the discovered repeated phrases; and
      
      determining the phrase lengths that allow for maximum compression of the tokenized data.
  - 42. The method of claim 31, wherein the given multiple of the common unit of memory storage for the phrase token comprises a number selected from the group of numbers consisting of a fractional number, a non-fractional number, and a combination of fractional and non-fractional numbers.
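
The "compression-efficient-phrase length" of claims 31, 37 and 41 can be illustrated by trying several candidate lengths and keeping the one that gives the smallest total size. The sketch below is one possible reading, not the patent's procedure. Its cost model is an assumption: each token in the stream costs one unit, and each phrase dictionary entry that is actually used costs `phrase_len + 1` units (the phrase token plus its word tokens). The names `compressed_units` and `best_phrase_length` and the candidate range are hypothetical, and the access-time constraint of claims 38 and 39 is not modeled.

```python
from collections import Counter


def compressed_units(tokens, phrase_len):
    """Estimate stream size plus phrase dictionary size, in storage units."""
    counts = Counter(tuple(tokens[i:i + phrase_len])
                     for i in range(len(tokens) - phrase_len + 1))
    repeated = {seq for seq, n in counts.items() if n >= 2}

    stream_units, used, i = 0, set(), 0
    while i < len(tokens):
        seq = tuple(tokens[i:i + phrase_len])
        if seq in repeated:
            used.add(seq)      # this phrase needs a dictionary entry
            i += phrase_len    # the whole run collapses into one token
        else:
            i += 1
        stream_units += 1

    # Assumed dictionary cost: one phrase token plus its word tokens.
    dictionary_units = len(used) * (phrase_len + 1)
    return stream_units + dictionary_units


def best_phrase_length(tokens, candidates=range(2, 9)):
    """Return the candidate length with the smallest estimated total size."""
    return min(candidates, key=lambda k: compressed_units(tokens, k))


# Hypothetical word-token stream in which the run 1, 2, 3, 4 appears three times.
tokens = [1, 2, 3, 4, 9, 1, 2, 3, 4, 8, 1, 2, 3, 4]
print(best_phrase_length(tokens))
```

A single best length is the simplest case. Claim 41 speaks of "phrase lengths" in the plural, so an implementation could equally keep several lengths; this sketch does not attempt that.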

Current Assignee: Snap-On Incorporated
Original Assignee: Snap-On Incorporated
Inventors: Constantino, David
Primary Examiner(s): Metjahic, Safet
Assistant Examiner(s): Nguyen, Merilyn P
Application Number: US09/951,101
Publication Number: US 20020174112A1
Time in Patent Office: 1,350 Days
Field of Search: 707/1, 707/102, 707/3, 707/6, 707/101, 704/2, 704/257, 704/8, 704/3, 704/7, 704/6, 704/9
US Class Current: 726/9
CPC Class Codes:
G06F 40/242   Dictionaries
G11B 20/00007   Time or data compression or...
H03M 7/3084   using adaptive string match...
H03M 7/3088   employing the use of a dict...
Y10S 707/99933   Query processing, i.e. sear...
Y10S 707/99942   Manipulating data structure...
Y10S 707/99943   Generating database or data...
