Method for automatically determining solid compound words
Abstract
A method for automatically determining whether a word type is a solid compound word is disclosed. In the method, a word type is looked up in an electronically stored list of known word types. The list comprises an indication for each known word type of whether or not it is a known solid compound word. If the word type is found in the electronically stored list of known word types, it is determined whether the word type is a known solid compound word or not in accordance with the indication in the electronically stored list of known word types. If the word type is not found in the electronically stored list of known word types, the word type is divided into a prefix and a suffix. The prefix is looked up in an electronically stored list of known prefixes of solid compound words of a word class, and the suffix is looked up in an electronically stored list of known suffixes of solid compound words of said word class. These look-ups are done for all possible divisions of the word type. If a prefix, associated with a division, is found in the electronically stored list of known prefixes of solid compound words of the word class and a suffix, associated with the division, is found in the electronically stored list of known suffixes of solid compound words of the word class, it is determined that the word type is a solid compound word of the word class. If the word type has not been determined to be a solid compound word, the look-up and determination are repeated for each of a number of different word classes.
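The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `KNOWN_WORD_TYPES`, `WORD_CLASSES` and `find_compound`, the sample entries, and the use of `None` to mark a known non-compound are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical list of known word types. The value is the main division
# point for a known solid compound, or None for a known non-compound.
KNOWN_WORD_TYPES: Dict[str, Optional[int]] = {
    "football": 4,
    "ball": None,
}

# Hypothetical per-class lists: (class name, known prefixes, known suffixes).
WORD_CLASSES: List[Tuple[str, Set[str], Set[str]]] = [
    ("name", {"new"}, {"castle"}),
    ("verb", {"over"}, {"take"}),
    ("other", {"sun", "rain"}, {"flower", "coat"}),
]


def find_compound(word: str) -> Optional[Tuple[str, int]]:
    """Return (label, division point), or None if no compound is found."""
    # Known word types: trust the stored indication.
    if word in KNOWN_WORD_TYPES:
        point = KNOWN_WORD_TYPES[word]
        return None if point is None else ("known", point)
    # Unknown word types: try every division for each word class in turn.
    for class_name, prefixes, suffixes in WORD_CLASSES:
        for i in range(1, len(word)):
            if word[:i] in prefixes and word[i:] in suffixes:
                return (class_name, i)
    return None
```

In this sketch, `find_compound("sunflower")` returns `("other", 3)` because "sun" and "flower" appear in the assumed prefix and suffix lists of the last class, while `find_compound("ball")` returns `None` because the word is listed as a known non-compound.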
11 Claims
-
1. A method for automatic determination whether a word type is a solid compound word, comprising the steps of:
-
a) looking up a word type in an electronically stored list of known word types that comprises an indication for each known word type of whether or not it is a known solid compound word and an indication for each known solid compound word of its main division point;
b1) determining, when said word type is found in said electronically stored list of known word types, whether said word type is a known solid compound word or not in accordance with the indication in said electronically stored list of known word types;
b2) determining, when said word type is a known solid compound word, that said word type has the main division point according to said list of known solid compound words;
c) looking up, when said word type is not found in said electronically stored list of known word types, for each possible division of said word type into a prefix and a suffix, said prefix in an electronically stored list of known prefixes of solid compound words of a word class and said suffix in an electronically stored list of known suffixes of solid compound words of said word class;
d1) determining, when a prefix, associated with a division, is found in said electronically stored list of known prefixes of solid compound words of said word class and a suffix, associated with said division, is found in said electronically stored list of known suffixes of solid compound words of said word class, that said word type is a solid compound word of said word class;
d2) determining, when said word type is a solid compound word of said word class, that it has its main division point between said prefix and said suffix; and
e) repeating, when said word type has not been determined to be a solid compound word, steps c), d1) and d2) for each of a number of different word classes.
-
2. The method according to claim 1, further comprising the steps of:
f) repeating, when said word type has been determined to be a solid compound word, steps a)-e) for said prefix of said word type;
g) repeating, when said word type has been determined to be a solid compound word, steps a)-e) for said suffix of said word type;
h) recursively repeating steps a)-e) for the prefix of a prefix that is a compound word until said prefix of a prefix is determined not to be a compound word;
i) recursively repeating steps a)-e) for the suffix of a prefix that is a compound word until said suffix of a prefix is determined not to be a compound word;
j) recursively repeating steps a)-e) for the prefix of a suffix that is a compound word until said prefix of a suffix is determined not to be a compound word; and
k) recursively repeating steps a)-e) for the suffix of a suffix that is a compound word until said suffix of a suffix is determined not to be a compound word.
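Steps f) to k) amount to a recursive decomposition: each half of a compound is examined in turn until no further division is found. The sketch below illustrates that idea under stated assumptions. The function `decompose`, the `split` callback (standing in for steps a) to e)) and the toy table `TOY_DIVISIONS` are hypothetical names, not part of the claims.

```python
from typing import Callable, Dict, List, Optional

# A splitter returns the main division point of a word, or None when the
# word is not determined to be a solid compound (stand-in for steps a-e).
Splitter = Callable[[str], Optional[int]]


def decompose(word: str, split: Splitter) -> List[str]:
    """Recursively break a word into its non-compound constituents."""
    point = split(word)
    # Stop when no division is found; the bounds check guards against a
    # faulty splitter and guarantees that the recursion terminates.
    if point is None or not 0 < point < len(word):
        return [word]
    return decompose(word[:point], split) + decompose(word[point:], split)


# Hypothetical toy data used only for this example.
TOY_DIVISIONS: Dict[str, int] = {"footballfield": 8, "football": 4}


def toy_split(word: str) -> Optional[int]:
    return TOY_DIVISIONS.get(word)
```

With this toy data, `decompose("footballfield", toy_split)` first divides the word into "football" and "field", then divides "football" again, and yields `["foot", "ball", "field"]`.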
-
-
3. The method according to claim 1, further comprising the step of:
l) updating, when said word type is determined to be a solid compound word of a given word class, said electronically stored list of known word types with said word type and with an indication that said word type is a known solid compound word.
-
4. The method according to claim 1, further comprising the step of:
m) updating, when said word type is a solid compound word of a given word class, said electronically stored list of known word types with said word type, with an indication that said word type is a known solid compound word, and with an indication that said word type has its main division point between said prefix and said suffix.
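The updating steps of claims 3 and 4 can be read as recording a newly found compound so that later look-ups are answered directly from the list of known word types. A minimal sketch, assuming a plain dictionary as the stored list and a single word class, is given below. The function name `classify_and_remember` and its parameters are illustrative assumptions.

```python
from typing import Dict, Optional, Set


def classify_and_remember(
    word: str,
    known: Dict[str, Optional[int]],
    prefixes: Set[str],
    suffixes: Set[str],
) -> Optional[int]:
    """Return the main division point of word, or None if none is found.

    `known` maps word types to their division point (None marks a known
    non-compound) and is updated in place when a new compound is found.
    """
    if word in known:
        return known[word]
    for i in range(1, len(word)):
        if word[:i] in prefixes and word[i:] in suffixes:
            # Record the word type together with its main division point.
            known[word] = i
            return i
    return None
```

After a first call such as `classify_and_remember("raincoat", known, {"rain"}, {"coat"})`, the dictionary holds `{"raincoat": 4}`, and a second call for the same word is answered from `known` without consulting the prefix and suffix lists. Words that are not determined to be compounds are deliberately not stored here, since the claims only describe updating for compounds.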
-
5. The method according to claim 1, wherein the look-up in step d) is done sequentially starting with a division of said word type between the first character and the second character of said word type and ending with a division of said word type between the penultimate character and the ultimate character of said word type.
-
6. The method according to claim 1, wherein steps c) and d) are performed for word classes with more restricted combinatorial properties before they are performed for word classes with less restrictive combinatorial properties.
-
7. The method according to claim 1, wherein the steps c) and d) are performed first for solid compound names using an electronically stored list of known prefixes of solid compound names and an electronically stored list of known suffixes of solid compound names, then for solid compound verbs using an electronically stored list of known prefixes of solid compound verbs and an electronically stored list of known suffixes of solid compound verbs, and finally for other solid compound words using an electronically stored list of known prefixes of other solid compound words and an electronically stored list of known suffixes of other solid compound words.
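The ordering in claims 6 and 7 matters when a word could be divided under more than one word class: the class tried first decides the result. The following sketch shows this with invented data. `ORDERED_CLASSES`, its entries and `first_matching_class` are assumptions for illustration only.

```python
from typing import List, Optional, Set, Tuple

WordClass = Tuple[str, Set[str], Set[str]]

# Hypothetical lists, ordered as in claim 7: names, then verbs, then others.
ORDERED_CLASSES: List[WordClass] = [
    ("name", {"new", "king"}, {"ton", "castle"}),
    ("verb", {"under", "over"}, {"take", "go"}),
    ("other", {"king", "under", "sun"}, {"ton", "go", "fish", "light"}),
]


def first_matching_class(
    word: str, classes: List[WordClass] = ORDERED_CLASSES
) -> Optional[Tuple[str, int]]:
    """Return (class name, division point) for the first class that matches."""
    for class_name, prefixes, suffixes in classes:
        # Divisions run from after the first character to before the last.
        for i in range(1, len(word)):
            if word[:i] in prefixes and word[i:] in suffixes:
                return (class_name, i)
    return None
```

Here "kington" matches both the assumed name lists and the assumed other lists, and `first_matching_class("kington")` reports it as a name because that class is tried first. Passing `list(reversed(ORDERED_CLASSES))` instead would classify the same word as an other solid compound word.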
-
9. A computer readable medium having computer-executable instructions for a general-purpose computer to perform the steps recited in claim 1.
-
10. A computer program comprising computer-executable instructions for performing the steps recited in claim 1.
-
11. An apparatus comprising means for performing the steps recited in claim 1.
-
8. A method for automatic determination whether a word type is a solid compound word, comprising the steps of:
-
looking up a word type in an electronically stored list of known word types that comprises an indication for each known word type of whether it is a compound word or not and, if that is the case, a specification of its main division point;
determining, when said word type is found in said electronically stored list of known word types, whether said word type is a known solid compound word, and if that is the case, that its main division point is the one according to said electronically stored list of known word types;
looking up, when said word type is not found in said electronically stored list of known word types, for each possible division of said word type into a prefix and a suffix, said prefix in an electronically stored list of known prefixes of solid compound names and said suffix in an electronically stored list of known suffixes of solid compound names;
determining, when a prefix, associated with a division, is found in said electronically stored list of known prefixes of solid compound names and a suffix, associated with said division, is found in said electronically stored list of known suffixes of solid compound names, that said word type is a solid compound name and that its main division point is between said prefix and said suffix;
looking up, when said word type is not found in said electronically stored list of known word types and is not a solid compound name, for each possible division of said word type into a prefix and a suffix, said prefix in an electronically stored list of known prefixes of solid compound verbs and said suffix in an electronically stored list of known suffixes of solid compound verbs;
determining, when a prefix, associated with a division, is found in said electronically stored list of known prefixes of solid compound verbs and a suffix, associated with said division, is found in said electronically stored list of known suffixes of solid compound verbs, that said word type is a solid compound verb and that its main division point is between said prefix and said suffix;
looking up, when said word type is not found in said electronically stored list of known word types, is not a solid compound name, and is not a solid compound verb, for each possible division of said word type into a prefix and a suffix, said prefix in an electronically stored list of known prefixes of other solid compound words and said suffix in an electronically stored list of known suffixes of other solid compound words; and
determining, when a prefix, associated with a division, is found in said electronically stored list of known prefixes of other solid compound words and a suffix, associated with said division, is found in said electronically stored list of known suffixes of other solid compound words, that said word type is an other solid compound word and that its main division point is between said prefix and said suffix.
-
Specification