Bootstrapping sense characterizations of occurrences of polysemous words in dictionaries
Abstract
The present invention is directed to characterizing the sense of an occurrence of a polysemous word in a representation of a dictionary. In a preferred embodiment, the representation of the dictionary is made up of a plurality of text segments containing word occurrences having a word sense characterization and word occurrences not having a word sense characterization. The embodiment first selects a plurality of the dictionary text segments that each contain a first word. The embodiment then identifies from among the selected text segments a first and a second occurrence of a second word. The identified first occurrence of the second word has no word sense characterization, while the identified second occurrence has one. The embodiment then attributes to the first occurrence of the second word the word sense characterization of the second occurrence of the second word.
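The steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Occurrence` class, the modeling of a text segment as a list of occurrences, the string sense labels, and the function name `attribute_sense` are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Occurrence:
    """One word occurrence in a dictionary text segment (hypothetical model)."""
    word: str
    sense: Optional[str] = None  # None means "no word sense characterization"


# Assumption: a text segment is simply a list of word occurrences.
Segment = List[Occurrence]


def attribute_sense(segments: List[Segment], first_word: str,
                    second_word: str) -> Optional[Occurrence]:
    """Copy a known sense of second_word onto one uncharacterized occurrence.

    Returns the updated occurrence, or None if no suitable pair exists.
    """
    # Select the text segments that each contain the first word.
    selected = [seg for seg in segments
                if any(occ.word == first_word for occ in seg)]

    # Collect occurrences of the second word within the selected segments.
    candidates = [occ for seg in selected for occ in seg
                  if occ.word == second_word]

    # First occurrence: no sense characterization yet.
    uncharacterized = next((o for o in candidates if o.sense is None), None)
    # Second occurrence: already has a sense characterization.
    characterized = next((o for o in candidates if o.sense is not None), None)

    if uncharacterized is None or characterized is None:
        return None

    # Attribute the known sense to the uncharacterized occurrence.
    uncharacterized.sense = characterized.sense
    return uncharacterized
```

For example, if two segments both contain "river", and "bank" carries a sense label in only one of them, calling `attribute_sense(segments, "river", "bank")` copies that label onto the unlabeled "bank". A segment that contains "bank" but not "river" is never selected, so its sense is not used.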
23 Claims
1. A method in a computer system for, in a representation of one or more dictionaries comprising a plurality of text segments, characterizing the sense of an occurrence of a polysemous word, the method comprising the steps of:
selecting a plurality of dictionary text segments each containing a first word;
identifying among the selected dictionary text segments a first occurrence of a second word, the first occurrence of the second word having no word sense characterization;
identifying among the selected dictionary text segments a second occurrence of the second word, the second occurrence of the second word having a word sense characterization; and
attributing to the first occurrence of the second word the word sense characterization of the second occurrence of the second word.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12. A computer-readable medium whose contents cause a computer system to, in a representation of one or more dictionaries comprising a plurality of text segments, characterize the sense of an occurrence of a polysemous word by:
selecting a plurality of dictionary text segments each containing a first word;
identifying among the selected dictionary text segments a first occurrence of a second word, the first occurrence of the second word having no word sense characterization;
identifying among the selected dictionary text segments a second occurrence of the second word, the second occurrence of the second word having a word sense characterization; and
attributing to the first occurrence of the second word the word sense characterization of the second occurrence of the second word.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
23. An apparatus for characterizing the sense of an occurrence of a polysemous word in a representation of a dictionary, the dictionary comprising a plurality of text segments, the apparatus comprising:
a dictionary text segment selection subsystem for selecting a plurality of dictionary text segments each containing a first word;
a word occurrence identification subsystem for identifying among the dictionary text segments selected by the dictionary text segment selection subsystem a first and second occurrence of a second word, the second occurrence of the second word having a word sense characterization and the first occurrence of the second word having no word sense characterization; and
a sense attribution subsystem for attributing to the first occurrence of the second word identified by the word occurrence identification subsystem the word sense characterization of the second occurrence of the second word identified by the word occurrence identification subsystem.
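The three subsystems recited in claim 23 can be pictured as three cooperating components. The Python sketch below is one possible decomposition under stated assumptions, not the apparatus as actually built: the class names `SegmentSelector`, `OccurrenceIdentifier`, `SenseAttributor`, and `Apparatus`, the `Occurrence` record, and the list-of-occurrences segment model are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Occurrence:
    """Hypothetical record for one word occurrence in a text segment."""
    word: str
    sense: Optional[str] = None  # None means no word sense characterization


Segment = List[Occurrence]  # assumption: a segment is a list of occurrences


class SegmentSelector:
    """Stands in for the dictionary text segment selection subsystem."""

    def select(self, segments: List[Segment], first_word: str) -> List[Segment]:
        return [s for s in segments if any(o.word == first_word for o in s)]


class OccurrenceIdentifier:
    """Stands in for the word occurrence identification subsystem."""

    def identify(self, selected: List[Segment], second_word: str
                 ) -> Optional[Tuple[Occurrence, Occurrence]]:
        # Gather occurrences of the second word in the selected segments only.
        found = [o for s in selected for o in s if o.word == second_word]
        first = next((o for o in found if o.sense is None), None)
        second = next((o for o in found if o.sense is not None), None)
        if first is None or second is None:
            return None
        return first, second  # (uncharacterized, characterized)


class SenseAttributor:
    """Stands in for the sense attribution subsystem."""

    def attribute(self, first: Occurrence, second: Occurrence) -> None:
        first.sense = second.sense


class Apparatus:
    """Wires the three subsystems together in the order the claim lists them."""

    def __init__(self) -> None:
        self.selector = SegmentSelector()
        self.identifier = OccurrenceIdentifier()
        self.attributor = SenseAttributor()

    def run(self, segments: List[Segment], first_word: str,
            second_word: str) -> bool:
        selected = self.selector.select(segments, first_word)
        pair = self.identifier.identify(selected, second_word)
        if pair is None:
            return False  # nothing to attribute
        self.attributor.attribute(*pair)
        return True
```

Keeping the identification step dependent only on the selector's output mirrors the claim language, in which the identification subsystem operates on the segments "selected by the dictionary text segment selection subsystem".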
Specification