Natural language parser with dictionary-based part-of-speech probabilities
Abstract
A natural language parser determines part-of-speech probabilities by using a dictionary or other lexicon as a source for the part-of-speech probabilities. A machine-readable dictionary is scanned, word-by-word. For each word, the number of senses listed for the word and associated with a part of speech are counted. A part-of-speech probability is then computed for each part of speech based upon the number of senses counted. The part-of-speech probability is indicative of how likely the word is to assume a particular part of speech in a text. The most probable parts of speech are then used by a parser during the first parse of an input string of text to improve the parser's accuracy and efficiency.
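
The counting step described in the abstract can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the toy dictionary, the function names, and the choice to divide each count by the total number of senses are hypothetical. The abstract says only that the probability is "based upon" the number of senses counted and does not give the formula.

```python
from collections import Counter

# Hypothetical toy dictionary: each word maps to a list with one
# part-of-speech tag per listed sense. The counts are invented.
TOY_DICTIONARY = {
    "run": ["verb"] * 5 + ["noun"] * 3,
    "quickly": ["adverb"],
}


def count_senses(senses):
    """Count how many listed senses carry each part of speech."""
    return Counter(senses)


def pos_probabilities(senses):
    """Turn sense counts into relative frequencies.

    Assumption: probability = senses for this part of speech divided by
    all senses of the word. Other functions of the counts are possible.
    """
    counts = count_senses(senses)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pos: n / total for pos, n in counts.items()}


if __name__ == "__main__":
    for word, senses in TOY_DICTIONARY.items():
        print(word, pos_probabilities(senses))
```

With this toy data, "run" has five verb senses out of eight, so the verb reading receives the larger share.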
48 Claims
1. In a parser of a natural language processing system, a method comprising the following steps:
- examining individual dictionary entries for corresponding words in a dictionary;
- counting, for an individual dictionary entry, a number of senses listed in the dictionary entry which are associated with a part of speech; and
- deriving a part-of-speech probability indicative of how likely a dictionary entry is to be a particular part of speech based upon the number of senses associated with the particular part of speech.
Dependent claims: 2-13.

14. In a natural language processing system for determining which part of speech a word is likely to be in a natural language text, the word being listed in a dictionary with multiple senses attributed thereto, the senses reflecting multiple different parts of speech that the word can assume in different contexts, a method comprising the following steps:
- counting a number of senses listed in the dictionary for each part of speech that the word can assume; and
- deriving a part-of-speech probability indicative of how likely the word is to be a particular part of speech based upon the number of senses counted in conjunction with the particular part of speech.
Dependent claims: 15-21.

22. In a natural language processing system for determining which part of speech a word is likely to be in a natural language text, the word being listed in a dictionary with multiple senses attributed thereto, the senses reflecting multiple different parts of speech that the word can assume in different contexts, a method comprising the following steps:
- counting a number of senses listed in the dictionary for each part of speech that the word can assume; and
- using the number of senses counted for each part of speech as an indication of how likely the word is to be a particular part of speech.
Dependent claims: 23-26.

27. In a natural language processing system, a method comprising the following steps:
- generating, for lexemes listed as dictionary entries in a dictionary, inflected forms of the lexemes;
- for each lexeme, counting a number of senses for each part of speech attributable to the lexeme in the dictionary;
- for each inflected form, counting a number of senses for each part of speech attributable to the inflected form and adding, for each part of speech, the number of senses attributable to the inflected form and the number of senses attributable to the lexeme from which the inflected form is generated; and
- deriving, for each lexeme and inflected form, a part-of-speech probability indicative of how likely the lexeme or inflected form is to be a particular part of speech based upon the senses counted in said counting steps.
Dependent claims: 28-36.

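The adding step of claim 27 can be sketched in Python as follows. This is one possible reading, not the patented implementation: the sense counts, the inflection triples (standing in for the output of a morphological generator), and the rule that an inflected form inherits only the lexeme's senses for the part of speech under which it was generated are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical sense counts per dictionary entry:
# {entry: {part_of_speech: number_of_senses}}. "fell" is listed both as
# its own entry and, below, as an inflected form of "fall".
LEXEME_SENSES = {
    "fall": {"verb": 6, "noun": 4},
    "fell": {"verb": 2, "adjective": 1},
}

# Hypothetical generator output:
# (inflected form, source lexeme, part of speech of the inflection).
INFLECTIONS = [
    ("fell", "fall", "verb"),
    ("falls", "fall", "verb"),
    ("falls", "fall", "noun"),
]


def combined_sense_counts(lexeme_senses, inflections):
    """Add the source lexeme's senses to each inflected form's own senses."""
    combined = defaultdict(Counter)
    # Start from the senses each entry has in its own right.
    for entry, counts in lexeme_senses.items():
        combined[entry].update(counts)
    # Assumption: a form inherits the lexeme's senses only for the part
    # of speech under which the form was generated.
    for form, lexeme, pos in inflections:
        combined[form][pos] += lexeme_senses[lexeme].get(pos, 0)
    return combined


def to_probabilities(counts):
    """Relative frequency of each part of speech (assumed formula)."""
    total = sum(counts.values())
    return {pos: n / total for pos, n in counts.items()} if total else {}


if __name__ == "__main__":
    for form, counts in combined_sense_counts(LEXEME_SENSES, INFLECTIONS).items():
        print(form, dict(counts), to_probabilities(counts))
```

In this toy data, "fell" ends up with eight verb senses (two of its own plus six inherited from "fall") and one adjective sense, so the verb reading dominates.
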
37. A method for parsing a natural language text comprising the following steps:
- counting a number of senses listed in a dictionary that are associated with a part of speech;
- deriving a part-of-speech probability as a function of the number of senses associated with the part of speech; and
- choosing a part of speech for a word in the text based upon the part-of-speech probability.
Dependent claims: 38-42.

43. A method for parsing a natural language text to determine which part of speech a word assumes within the text comprising the following steps:
- counting a number of senses listed in a dictionary which are associated with a part of speech for the word;
- determining the part of speech with a highest number of senses listed in the dictionary; and
- choosing, for an initial parse, the part of speech for the word with the highest number of senses.

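The selection step of claim 43 amounts to taking the part of speech with the largest sense count. A minimal Python sketch follows; the function names and sample counts are hypothetical, and the alphabetical tie-break and the fallback ordering are assumptions the claim does not address.

```python
def initial_part_of_speech(sense_counts):
    """Return the part of speech with the most listed senses, or None.

    Assumption: ties are broken alphabetically so the result is
    deterministic. The claim does not say how ties are resolved.
    """
    if not sense_counts:
        return None
    return min(sense_counts, key=lambda pos: (-sense_counts[pos], pos))


def parts_in_trial_order(sense_counts):
    """All parts of speech, most senses first.

    Assumption: a parser could try the remaining entries if the initial
    choice fails. The claim covers only the initial parse.
    """
    return sorted(sense_counts, key=lambda pos: (-sense_counts[pos], pos))


if __name__ == "__main__":
    counts = {"noun": 13, "verb": 2}  # hypothetical sense counts for one word
    print(initial_part_of_speech(counts))
    print(parts_in_trial_order(counts))
```
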
44. An apparatus for determining which part of speech a word is likely to be in a natural language text, comprising:
- a sense counter to scan words from a machine-readable dictionary and to count, for each word, a number of senses associated with each part of speech attributable to the word; and
- a computational unit to compute, for each word, part-of-speech probabilities indicative of how likely the word is to be particular parts of speech based upon the number of senses counted by the sense counter.
Dependent claims: 45-48.

Specification