Compressed lexicon and method and apparatus for creating and accessing the lexicon
Abstract
A compressed lexicon is built by receiving a word list, which includes word-dependent data associated with each word in the word list. A word is selected from the word list. A hash value is generated based on the selected word, and the hash value identifies an address in a hash table which, in turn, is written with a location in lexicon memory that is to hold the compressed form of the selected word, and the compressed word-dependent data associated with the selected word. The word is then encoded, or compressed, as is its associated word-dependent data. This information is written at the identified location in the lexicon memory.
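The build flow in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the table size, the use of `zlib.crc32` as the hash, linear probing on collisions, and the length-prefixed `encode` placeholder are all assumptions made for the example, since the abstract does not specify them.

```python
import zlib

TABLE_SIZE = 8   # hypothetical; a real table would be sized to the word list
EMPTY = -1       # marks an unused hash table slot


def slot_for(word: str) -> int:
    # Assumption: crc32 stands in for the unspecified hash function.
    return zlib.crc32(word.encode("utf-8")) % TABLE_SIZE


def encode(word: str, data: str) -> bytes:
    # Placeholder for real compression: two length-prefixed UTF-8 fields.
    # Each field must be shorter than 256 bytes in this sketch.
    w = word.encode("utf-8")
    d = data.encode("utf-8")
    return bytes([len(w)]) + w + bytes([len(d)]) + d


def build_lexicon(word_list):
    hash_table = [EMPTY] * TABLE_SIZE
    lexicon_memory = bytearray()
    for word, data in word_list:
        slot = slot_for(word)
        # Assumption: collisions are resolved by linear probing.
        while hash_table[slot] != EMPTY:
            slot = (slot + 1) % TABLE_SIZE
        # The slot records where the entry will live in lexicon memory.
        hash_table[slot] = len(lexicon_memory)
        lexicon_memory += encode(word, data)
    return hash_table, lexicon_memory


words = [("cat", "noun"), ("run", "verb"), ("blue", "adjective")]
hash_table, lexicon_memory = build_lexicon(words)
```

Here the "next available location" is simply the current end of the byte array, so entries are packed back to back and the hash table holds only offsets.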
31 Claims
1. A method of building a compressed lexicon, comprising:
receiving a word list and word-dependent data associated with each word in the word list;
selecting a word from the word list;
generating an index entry identifying a location in a lexicon memory for holding the selected word;
encoding the selected word and its associated word-dependent data to obtain encoded words and associated encoded word-dependent data; and
writing the encoded word and its associated word-dependent data at the identified location in the lexicon memory.
12. A method of accessing word information related to a word stored in a compressed lexicon, comprising:
receiving the word;
accessing an index to obtain a word location in the compressed lexicon that contains information associated with the received word;
reading encoded word information from the word location; and
decoding the word information.
19. A compressed lexicon builder for building a compressed lexicon based on a word list containing a plurality of domains, the domains including words and word-dependent data associated with the words, the compressed lexicon builder comprising:
a plurality of domain encoders, one domain encoder being associated with each domain in the word list, the domain encoders being configured to compress the words and word-dependent data to obtain compressed words and compressed word-dependent data;
a hashing component configured to generate a hash value for each word in the word list;
a hash table generator, coupled to the hashing component, configured to determine a next available location in a lexicon memory and write, at an address in a hash table identified by the hash value, the next available location in the lexicon memory; and
a lexicon memory generator, coupled to the domain encoders and the hash table generator, configured to store in the lexicon memory the compressed words and compressed word-dependent data, each compressed word and its associated compressed word-dependent data being stored at the next available location in the lexicon memory written in the hash table at the hash table address associated with the compressed word.
23. A compressed lexicon accesser for accessing word-dependent data in a compressed lexicon based on a received word, the compressed lexicon accesser comprising:
a plurality of domain decoders, one domain decoder being associated with each domain in the compressed lexicon, the domain decoders being configured to decompress the words and word-dependent data;
a hashing component configured to generate a hash value for the received word;
a hash table accesser, coupled to the hashing component, configured to read from an address in a hash table identified by the hash value, a word location in a lexicon memory corresponding to a lexicon entry for the received word; and
a lexicon memory accesser, coupled to the domain decoders and the hash table accesser, configured to read from the word location in the lexicon memory compressed words and compressed word-dependent data and provide the compressed words and compressed word-dependent data to corresponding domain decoders.
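The access path described in claims 12 and 23 might look something like the sketch below: hash the received word, read a location from the hash table, read the encoded entry from lexicon memory, and hand each field to its domain decoder. The two domains (word text and a one-byte part-of-speech code), the `crc32` hash, linear probing, and the `store` helper used to set up sample data are all hypothetical choices for illustration; the claims do not fix any of them.

```python
import zlib

TABLE_SIZE = 8   # hypothetical table size
EMPTY = -1
POS_NAMES = ["noun", "verb", "adjective"]  # hypothetical domain values


def decode_word(raw: bytes) -> str:
    return raw.decode("utf-8")


def decode_pos(raw: bytes) -> str:
    # The part of speech is stored as a one-byte index into POS_NAMES.
    return POS_NAMES[raw[0]]


# One decoder per domain, in the order the fields are stored.
DOMAIN_DECODERS = [decode_word, decode_pos]


def slot_for(word: str) -> int:
    # Assumption: crc32 stands in for the unspecified hash function.
    return zlib.crc32(word.encode("utf-8")) % TABLE_SIZE


def read_entry(memory, offset):
    # Read one length-prefixed field per domain and decode it.
    fields = []
    for decoder in DOMAIN_DECODERS:
        length = memory[offset]
        raw = bytes(memory[offset + 1 : offset + 1 + length])
        fields.append(decoder(raw))
        offset += 1 + length
    return fields


def lookup(word, hash_table, memory):
    slot = slot_for(word)
    for _ in range(TABLE_SIZE):
        offset = hash_table[slot]
        if offset == EMPTY:
            return None                    # word is not in the lexicon
        stored_word, pos = read_entry(memory, offset)
        if stored_word == word:
            return pos
        slot = (slot + 1) % TABLE_SIZE     # assumed linear probing
    return None


def store(word, pos, hash_table, memory):
    # Helper that builds sample data for the lookup demonstration.
    slot = slot_for(word)
    while hash_table[slot] != EMPTY:
        slot = (slot + 1) % TABLE_SIZE
    hash_table[slot] = len(memory)
    w = word.encode("utf-8")
    memory.extend(bytes([len(w)]) + w + bytes([1, POS_NAMES.index(pos)]))


hash_table = [EMPTY] * TABLE_SIZE
lexicon_memory = bytearray()
for w, p in [("cat", "noun"), ("run", "verb"), ("blue", "adjective")]:
    store(w, p, hash_table, lexicon_memory)

print(lookup("run", hash_table, lexicon_memory))
print(lookup("dog", hash_table, lexicon_memory))
```

Comparing the decoded word against the received word is what lets the lookup tell a true hit from a hash collision.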
27. A compressed lexicon having a data structure, comprising:
a word portion storing a compressed word;
a first word-dependent data portion storing a first type of compressed word-dependent data; and
a first header portion associated with the first word-dependent data portion storing a type indicator indicating the type of word-dependent data stored in the first word-dependent data portion, and a last field indicator indicating whether the first word-dependent data portion is a last word-dependent data portion associated with the compressed word.
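One way to picture the entry layout in claim 27 is a word portion followed by one or more data portions, each preceded by a header carrying a type indicator and a last-field indicator. The sketch below assumes a one-byte header with the high bit as the last-field flag and the low seven bits as the type, plus a one-byte length for each portion. The bit layout, the length bytes, and the type names `TYPE_POS` and `TYPE_PRON` are illustrative assumptions, and the payloads are left uncompressed for readability.

```python
LAST_FLAG = 0x80   # assumed: high bit marks the last data portion
TYPE_MASK = 0x7F   # assumed: low seven bits hold the type indicator

TYPE_POS = 1       # hypothetical type: part of speech
TYPE_PRON = 2      # hypothetical type: pronunciation


def pack_entry(word: bytes, portions) -> bytes:
    """Pack a word and a non-empty list of (type_id, payload) portions."""
    out = bytearray([len(word)]) + word
    for i, (type_id, payload) in enumerate(portions):
        header = type_id & TYPE_MASK
        if i == len(portions) - 1:
            header |= LAST_FLAG            # flag the final portion
        out += bytes([header, len(payload)]) + payload
    return bytes(out)


def unpack_entry(buf: bytes):
    """Walk the portions until the last-field flag is seen."""
    n = buf[0]
    word = buf[1 : 1 + n]
    pos = 1 + n
    portions = []
    while True:
        header = buf[pos]
        length = buf[pos + 1]
        payload = buf[pos + 2 : pos + 2 + length]
        portions.append((header & TYPE_MASK, payload))
        pos += 2 + length
        if header & LAST_FLAG:
            break
    return word, portions


entry = pack_entry(b"cat", [(TYPE_POS, b"\x00"), (TYPE_PRON, b"k ae t")])
word, portions = unpack_entry(entry)
```

Because each header says whether more portions follow, a reader can walk an entry without knowing in advance how many kinds of word-dependent data it holds.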
Specification