Spelling assistance method for compound words
Abstract
Correctly spelled compound words are provided as candidates to replace a misspelled compound word in many natural languages, such as Dutch, Danish, German, Icelandic, Norwegian, Swedish, and Swiss German. The basic technique looks up words in a dictionary by associating component flags with each possible constituent word within the misspelled compound word, as well as with the possible replacement candidates for each letter string between these constituent words. Tree-scanning techniques then isolate the possible components of a compound word and determine their correctness both in isolation and in association with each other.
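The flag-driven tree scan described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the `Pos` flags, the toy `DICTIONARY`, and the `segmentations` function. The sketch tries every prefix of the word, keeps only prefixes whose flags permit their position, and recurses on the remainder.

```python
from enum import Flag, auto


class Pos(Flag):
    """Hypothetical component flags: where a term may appear in a compound."""
    FRONT = auto()
    MIDDLE = auto()
    END = auto()


# Hypothetical toy dictionary; a real one would be far larger.
DICTIONARY = {
    "haus": Pos.FRONT | Pos.END,
    "boot": Pos.FRONT | Pos.MIDDLE | Pos.END,
    "bau": Pos.FRONT | Pos.MIDDLE | Pos.END,
    "er": Pos.END,
}


def segmentations(word, at_start=True):
    """Return every split of `word` into dictionary terms whose flags fit their position."""
    results = []
    for cut in range(1, len(word) + 1):
        head, tail = word[:cut], word[cut:]
        flags = DICTIONARY.get(head)
        if flags is None:
            continue  # this prefix is not a dictionary term
        if not tail:
            # Last component: a whole word on its own, or a term allowed at the end.
            if at_start or Pos.END in flags:
                results.append([head])
            continue
        needed = Pos.FRONT if at_start else Pos.MIDDLE
        if needed in flags:
            # Descend one level in the tree: segment the rest of the word.
            for rest in segmentations(tail, at_start=False):
                results.append([head] + rest)
    return results
```

A word that cannot be fully segmented this way yields an empty list, which is the situation the claims address by isolating the unknown part.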
49 Citations
4 Claims
1. In a computer system including input and output devices and a dictionary of terms each of which has a code indicating the way in which the term may be associated to form compound words, a process for isolating the unknown components of an input compound word which is misspelled or for which the dictionary of terms does not contain all of its components, comprising:
- identifying all dictionary terms which have a code indicating that they can be initial substrings of compound words and which are initial substrings of said input compound word;
- for each said initial substring, identifying in turn the remaining substring of the input compound word by retaining the portion of the compound word succeeding the initial substring;
- identifying the unknown components of the input compound word to be the set of said remaining substrings truncated at a point where any terminal substring starts.
Dependent claims: 2, 3
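The three steps of claim 1 can be read as the following minimal sketch. It rests on stated assumptions: the function name `unknown_components` is hypothetical, the dictionary codes are reduced to two plain sets (terms allowed to start a compound and terms allowed to end one), and a "terminal substring" is interpreted as an end-flagged term that matches the end of the remaining substring. The claim language may cover more than this reading.

```python
def unknown_components(word, front_terms, end_terms):
    """Isolate the unknown parts of `word` (hypothetical reading of claim 1).

    front_terms: dictionary terms coded as possible initial substrings.
    end_terms:   dictionary terms coded as possible terminal substrings.
    """
    unknowns = set()
    for front in front_terms:
        # Step 1: keep only initial-coded terms that really begin the word.
        if not front or front == word or not word.startswith(front):
            continue
        # Step 2: the remaining substring follows the initial substring.
        remaining = word[len(front):]
        # Step 3: truncate the remainder where a terminal substring starts.
        found_terminal = False
        for end in end_terms:
            if end and remaining.endswith(end):
                found_terminal = True
                middle = remaining[:len(remaining) - len(end)]
                if middle:
                    unknowns.add(middle)
        if not found_terminal:
            # No known ending: the whole remainder is unknown.
            unknowns.add(remaining)
    return unknowns
```

For example, with `front_terms={"haus"}` and `end_terms={"bauer"}`, the input `"hausbotbauer"` leaves `"bot"` as the isolated unknown component.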
4. A computer method for identifying an unknown component in a compound word and providing spelling candidates within the context of the compound word, comprising the steps of:
- inputting to a computer a compound word from a stream of text;
- matching front components of the input compound word to entries in a dictionary data base;
- terminating the analysis if the complete compound word is matched with an entry in said dictionary;
- identifying a leading component of said compound word as an unknown component if said leading component thereof was not matched with entries in said dictionary;
- matching successive groups of contiguous characters in said compound word with entries in said dictionary until an ending component is successfully matched with the entries in said dictionary;
- generating a plurality of alternate compound word candidates by combining front component entries from said dictionary which are similar to said front component of said input compound word, with end component entries from said dictionary which are similar to said end component of said input compound word;
- checking each of said plurality of compound word candidates for appropriate compound flags, elision codes, and binding characters stored in said dictionary in conjunction with said component word entries therein;
- displaying a subplurality of said plurality of compound word candidates as appropriate compound words.
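The candidate-generation and flag-checking steps of claim 4 might look roughly like the sketch below. It makes several assumptions: the `ENTRIES` table and the `compound_candidates` function are hypothetical, the similarity measure is Python's `difflib.get_close_matches` (a stand-in, since the claim does not specify how similarity is computed), and elision codes and binding characters are left out entirely.

```python
import difflib

# Hypothetical dictionary: each entry carries simple compound flags.
ENTRIES = {
    "haus": {"front", "end"},
    "boot": {"front", "end"},
    "bau": {"front", "end"},
    "er": {"end"},
}


def compound_candidates(front, end, entries=ENTRIES, limit=3):
    """Combine entries similar to `front` with entries similar to `end`.

    Only entries flagged for the matching position are considered, which
    stands in for the compound-flag check; elision codes and binding
    characters are not modelled in this sketch.
    """
    fronts = [w for w, flags in entries.items() if "front" in flags]
    ends = [w for w, flags in entries.items() if "end" in flags]
    similar_fronts = difflib.get_close_matches(front, fronts, n=limit)
    similar_ends = difflib.get_close_matches(end, ends, n=limit)
    # Every pairing of a plausible front with a plausible end is a candidate.
    return [f + e for f in similar_fronts for e in similar_ends]


if __name__ == "__main__":
    # Display the candidates for a misspelled compound split as "hauz" + "bot".
    for candidate in compound_candidates("hauz", "bot"):
        print(candidate)
```

The `limit` parameter caps how many similar entries are taken per position, which is one simple way to show only a subset of the generated candidates.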
Specification