Method and system for theme-based word sense ambiguity reduction
Abstract
Word sense ambiguity reduction, for “thematic” words in a sentence, is achieved based on thematic prediction. The senses of “thematic” words in a sentence are disambiguated by determining and weighting possible themes for that sentence. The possible themes are determined from the thematic information associated with the different senses of each word in the sentence. A highly deterministic theme-based word sense disambiguation method preprocesses the sentence before further syntactic and semantic analysis, which enhances accuracy and lowers the demand for computational resources (memory and CPU) by reducing input ambiguities.
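The flow described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the toy lexicon, the theme labels, the count-based weighting, and the function names (`sense_tag`, `score_themes`, `reduce_senses`) are all assumptions introduced here for illustration.

```python
from collections import Counter

# Hypothetical lexicon: (word, part of speech) -> list of (sense, theme) pairs.
# A theme of None marks a sense that carries no thematic information.
LEXICON = {
    ("bank", "NOUN"): [("financial institution", "finance"),
                       ("river edge", "geography")],
    ("loan", "NOUN"): [("money lent", "finance")],
    ("interest", "NOUN"): [("charge for borrowing", "finance"),
                           ("curiosity", None)],
}

def sense_tag(tagged_words):
    """Attach every candidate (sense, theme) pair to each POS-tagged word."""
    return [(word, LEXICON.get((word, pos), [])) for word, pos in tagged_words]

def score_themes(sense_tagged):
    """Weight each theme by the number of words with at least one sense in it.

    Counting words is an assumed weighting scheme chosen for simplicity.
    """
    weights = Counter()
    for _, senses in sense_tagged:
        for theme in {t for _, t in senses if t is not None}:
            weights[theme] += 1
    return weights

def reduce_senses(sense_tagged, weights):
    """Keep only senses in the top-weighted theme for words that have one."""
    if not weights:
        return sense_tagged
    top_theme, _ = weights.most_common(1)[0]
    reduced = []
    for word, senses in sense_tagged:
        matching = [s for s in senses if s[1] == top_theme]
        # Words without a sense in the top theme are left unchanged.
        reduced.append((word, matching or senses))
    return reduced

sentence = [("the", "DET"), ("bank", "NOUN"), ("raised", "VERB"),
            ("loan", "NOUN"), ("interest", "NOUN")]
tagged = sense_tag(sentence)
weights = score_themes(tagged)
result = dict(reduce_senses(tagged, weights))
```

In this toy sentence, "finance" outweighs "geography", so the sketch keeps only the finance senses of "bank" and "interest" and leaves non-thematic words untouched.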
26 Claims
1. A method for reducing word sense ambiguities in a sentence, based on thematic prediction, said method comprising the steps of:
a) receiving an input sentence consisting of a sequence of part-of-speech tagged words;
b) creating a sequence of sense tagged words from said received sequence of part-of-speech tagged words, each of said sense tagged words having one or more senses, said senses further being theme tagged;
c) predicting a set of one or more probable themes associated with said created sequence of sense-tagged words;
d) weighting each of said one or more probable themes from said predicted set; and
e) reducing sense ambiguities by eliminating remotely probable senses or selecting highly probable senses of said sense tagged words based on said weighted set of one or more probable themes.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A system for reducing word sense ambiguities in a sentence, based on thematic prediction, said system comprising:
a thematic predictor receiving an input sentence comprising a sequence of part-of-speech tagged words and outputting a sequence of sense tagged words and a set of one or more predicted themes associated with said sequence of sense tagged words, each of said sense tagged words having one or more senses;
a thematic scorer weighting each of said set of one or more predicted themes; and
a thematic word sense disambiguator reducing sense ambiguities by eliminating remotely probable senses or selecting highly probable senses of said sense tagged words based on said weighted set of one or more probable themes.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
21. An article of manufacture comprising a computer usable medium having computer readable code embodied therein which reduces word sense ambiguities in a sentence, based on thematic prediction, said medium comprising:
computer readable program code receiving an input sentence consisting of a sequence of part-of-speech tagged words;
computer readable program code creating a sequence of sense tagged words from said received sequence of part-of-speech tagged words, each of said sense tagged words having one or more senses, said senses further being theme tagged;
computer readable program code predicting a set of one or more probable themes associated with said created sequence of sense-tagged words;
computer readable program code weighting each of said predicted set of one or more probable themes; and
computer readable program code reducing sense ambiguities by eliminating remotely probable senses or selecting highly probable senses of said sense tagged words based on said weighted set of one or more probable themes.
Dependent claims: 22, 23
24. A method for processing text of a sentence, based on thematic prediction, said method comprising the steps of:
a) receiving an input sentence consisting of a sequence of part-of-speech tagged words;
b) creating a sequence of sense tagged words from said received sequence of part-of-speech tagged words, each of said senses further being theme tagged;
c) predicting a set of one or more probable themes associated with said created sequence of sense-tagged words;
d) weighting each of said one or more probable themes from said predicted set; and
e) refraining from reducing sense ambiguity if more than one of said predicted set of probable themes have the same weighting and if said weighting is the highest one among the set of predicted themes, otherwise reducing sense ambiguities by eliminating remotely probable senses or selecting highly probable senses of said sense tagged words based on said weighted set of one or more probable themes.
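The tie condition in step e) of claim 24 can be illustrated with a small check. This is a minimal sketch under stated assumptions: the function name `should_reduce`, the dictionary representation of theme weights, and the choice to skip reduction when no themes were predicted are all hypothetical and not taken from the patent.

```python
def should_reduce(theme_weights):
    """Return True only when exactly one theme holds the highest weight.

    theme_weights maps a theme label to its weight (assumed representation).
    An empty mapping returns False, an assumption made for this sketch.
    """
    if not theme_weights:
        return False
    top = max(theme_weights.values())
    leaders = sum(1 for weight in theme_weights.values() if weight == top)
    return leaders == 1

clear_winner = should_reduce({"finance": 3, "geography": 1})
top_tie = should_reduce({"finance": 2, "geography": 2, "law": 1})
lower_tie = should_reduce({"finance": 3, "geography": 1, "law": 1})
```

Only a tie at the highest weight blocks reduction; ties among lower-weighted themes do not.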
25. A method for reducing word sense ambiguities in a sentence, based on thematic prediction, said method comprising the steps of:
a) receiving an input sentence consisting of a sequence of part-of-speech tagged words;
b) creating a sequence of sense tagged words from said received sequence of part-of-speech tagged words, each of said senses further being theme tagged;
c) predicting a set of one or more probable themes associated with said created sequence of sense-tagged words;
d) weighting each of said one or more probable themes from said predicted set; and
e) reducing sense ambiguities by eliminating remotely probable senses or selecting highly probable senses from said weighted set of one or more probable themes only if the number of words in said input sentence possessing a dominant theme is equal to or greater than ¼ the total number of words in said input sentence.
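The ¼ condition in step e) of claim 25 amounts to a simple ratio test. The sketch below is illustrative only: the function name `meets_coverage`, its parameters, and the handling of an empty sentence are assumptions. Exact fractions are used so the boundary case (exactly one quarter) is not affected by floating-point rounding.

```python
from fractions import Fraction

def meets_coverage(words_with_dominant_theme, total_words,
                   minimum=Fraction(1, 4)):
    """Check that enough words carry the dominant theme to allow reduction.

    Returns False for an empty sentence (an assumption of this sketch).
    """
    if total_words <= 0:
        return False
    return Fraction(words_with_dominant_theme, total_words) >= minimum

at_threshold = meets_coverage(2, 8)     # exactly one quarter of the words
below_threshold = meets_coverage(1, 8)  # one eighth of the words
```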
26. A method for reducing word sense ambiguities in a sentence, based on thematic prediction, said method comprising the steps of:
a) receiving an input sentence comprising part-of-speech tagged words;
b) associating possible senses with at least some of said part-of-speech tagged words, thereby generating sense tagged words, each of said sense tagged words having one or more possible senses;
c) associating theme tags with at least some of said sense tagged words;
d) scoring possible themes for said sentence based on said theme tags;
e) selecting a dominant theme for said sentence based upon said scoring; and
f) reducing sense ambiguities based on the dominant theme, wherein reducing sense ambiguities based on the dominant theme comprises at least one of i) eliminating a possible sense of a given sense tagged word as being non-representative of the given sense tagged word based upon the dominant theme, and ii) selecting a possible sense of a given sense tagged word as being representative of the given sense tagged word based upon the dominant theme.
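The two reduction modes in step f) of claim 26 can be contrasted in a short sketch. This is one possible reading, not the patent's implementation: the function names `eliminate` and `select`, the (sense, theme) tuple representation, the treatment of theme-less senses, and the fallback that never leaves a word with zero senses are all assumptions made here.

```python
def eliminate(senses, dominant_theme):
    """Mode i): drop senses tagged with a theme other than the dominant one.

    Senses without a theme tag (theme is None) are kept, and the original
    list is returned if elimination would remove every sense (assumed rule).
    """
    kept = [s for s in senses if s[1] is None or s[1] == dominant_theme]
    return kept or senses

def select(senses, dominant_theme):
    """Mode ii): pick only the senses tagged with the dominant theme.

    The original list is returned if no sense matches (assumed rule).
    """
    chosen = [s for s in senses if s[1] == dominant_theme]
    return chosen or senses

# Hypothetical sense inventory for the word "interest".
senses = [("charge for borrowing", "finance"),
          ("curiosity", None),
          ("legal stake", "law")]
after_eliminate = eliminate(senses, "finance")
after_select = select(senses, "finance")
```

Under these assumptions, elimination is the more conservative mode: it removes only senses that conflict with the dominant theme, while selection commits to the senses that match it.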
Specification