Word Sense Disambiguation Using Emergent Categories
First Claim
1. A computer implemented method for word sense disambiguation in a natural language sentence, comprising the steps of:
- parsing said natural language sentence, comprising the steps of:
identifying one or more possible parts of speech for each term in the natural language sentence;
identifying one or more possible phrase structures in the natural language sentence;
identifying terms comprising one or more linguistic roles in the natural language sentence by generating declared patterns;
identifying possible sense combinations for said identified terms with said linguistic roles in the natural language sentence, comprising the steps of:
applying emergent categories to identify possible valid senses for each of the identified terms comprising the linguistic roles in the natural language sentence, wherein said emergent categories identify a set of senses for terms in a dictionary, wherein said senses in one of the emergent categories correspond to the senses in one of the other emergent categories by a correspondence function, wherein said correspondence function identifies a linguistic correspondence between two senses;
providing an emergent categories database comprising a plurality of correspondence functions, wherein each of said correspondence functions comprising a given correspondence function type identifies two emergent categories, wherein said correspondence function type specifies a linguistic role pair, wherein said linguistic role pair is a pairing of two linguistic roles, wherein the senses in each of said two emergent categories play one of said two linguistic roles in the correspondence function type;
identifying linguistic role pairs from among the identified terms with the linguistic roles in the natural language sentence for identifying pair-wise terms using said emergent categories database;
identifying the correspondence functions in the emergent categories database with correspondence function types matching said identified linguistic role pairs, wherein for each of the linguistic role pairs, the emergent categories identified by the correspondence function are valid for the corresponding linguistic roles, wherein the emergent categories specify one or more senses representing terms matching said identified pair-wise terms in the natural language sentence, wherein each sense in one of the emergent categories in said identified correspondence function is a possible valid pair-wise sense for the term in the natural language sentence when paired with the other emergent categories in the identified correspondence function;
comparing pair-wise senses for each term with the identified linguistic roles in the natural language sentence to identify said possible sense combinations; and
inferring possible senses for each term with the identified linguistic roles in the natural language sentence and previous sentences;
whereby said inference of said possible senses enables word sense disambiguation in the natural language sentence.
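The pair-wise sense step recited above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `CorrespondenceFunction` class, the toy `DICTIONARY` and `DATABASE`, the sense labels, and the role names are all hypothetical, chosen only to show how correspondence functions keyed by a linguistic role pair could narrow the possible sense combinations.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class CorrespondenceFunction:
    """Hypothetical record: a role pair plus the two emergent categories it links."""
    role_pair: tuple            # correspondence function type, e.g. ("subject", "verb")
    first_category: frozenset   # senses that may play role_pair[0]
    second_category: frozenset  # senses that may play role_pair[1]


# Toy dictionary (assumption): term -> set of candidate senses.
DICTIONARY = {
    "bank": {"bank/institution", "bank/riverside"},
    "approved": {"approve/authorize"},
    "loan": {"loan/money"},
}

# Toy emergent categories database (assumption).
DATABASE = [
    CorrespondenceFunction(
        ("subject", "verb"),
        frozenset({"bank/institution", "person/human"}),
        frozenset({"approve/authorize"}),
    ),
    CorrespondenceFunction(
        ("verb", "object"),
        frozenset({"approve/authorize"}),
        frozenset({"loan/money"}),
    ),
]


def pairwise_senses(role_a, term_a, role_b, term_b, database, dictionary):
    """Collect sense pairs licensed by any correspondence function of the matching type."""
    pairs = set()
    for cf in database:
        if cf.role_pair != (role_a, role_b):
            continue
        senses_a = dictionary[term_a] & cf.first_category
        senses_b = dictionary[term_b] & cf.second_category
        pairs.update(product(senses_a, senses_b))
    return pairs


def sense_combinations(role_terms, role_pairs, database, dictionary):
    """Keep only sense assignments in which every role pair is licensed."""
    roles = list(role_terms)
    licensed = {
        (a, b): pairwise_senses(a, role_terms[a], b, role_terms[b], database, dictionary)
        for a, b in role_pairs
    }
    combos = []
    for senses in product(*(sorted(dictionary[role_terms[r]]) for r in roles)):
        assignment = dict(zip(roles, senses))
        if all((assignment[a], assignment[b]) in licensed[(a, b)] for a, b in role_pairs):
            combos.append(assignment)
    return combos


role_terms = {"subject": "bank", "verb": "approved", "object": "loan"}
role_pairs = [("subject", "verb"), ("verb", "object")]
combos = sense_combinations(role_terms, role_pairs, DATABASE, DICTIONARY)
```

In this toy setup the riverside sense of "bank" is dropped because no subject-verb correspondence function pairs it with the sense of "approved". The final inference step, which also draws on previous sentences, is not modeled here.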
Abstract
Disclosed herein is a computer implemented method and system for word sense disambiguation in a natural language sentence. The natural language sentence is parsed for identifying possible parts of speech for each term and identifying possible phrase structures. Terms comprising one or more linguistic roles are identified. The possible sense combinations for the terms with linguistic roles are identified. Emergent categories are applied to identify possible valid senses for each of the terms with identified linguistic roles. Linguistic role pairs are identified from among the terms identified with linguistic roles. The correspondence functions with the correspondence function types matching the identified linguistic role pairs are identified from an emergent categories database. The pair-wise senses for each term are compared with the identified linguistic roles to identify the possible sense combinations. The possible senses are inferred for each term with identified linguistic roles in the natural language sentence and previous sentences.
58 Citations
15 Claims
1. A computer implemented method for word sense disambiguation in a natural language sentence, as recited in full under First Claim above.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9
10. A computer implemented system for word sense disambiguation in a natural language sentence, comprising:
a word sense disambiguation system, comprising:
a natural language sentence parser for parsing said natural language sentence, comprising:
a parts of speech tagger for identifying parts of speech for each term in the natural language sentence;
a sentence chunker used for identifying one or more possible phrase structures in the natural language sentence;
a first identification module for identifying terms comprising one or more linguistic roles in the natural language sentence;
a second identification module for identifying possible sense combinations of said identified terms with said linguistic roles in the natural language sentence; and
a sense inference module for inferring possible senses for each term with identified linguistic roles in the natural language sentence and previous sentences.
Dependent claims: 11, 12, 13, 14
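One way the modules of claim 10 might be wired together is sketched below in Python. This is an illustrative skeleton only: the class name, the callable signatures, the stub modules, and in particular the rule used by `infer` (prefer the combination that shares the most senses with previous sentences) are assumptions and are not taken from the claims.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class WordSenseDisambiguationSystem:
    """Hypothetical wiring of the claimed modules; all signatures are assumptions."""
    pos_tagger: Callable              # terms -> {term: possible parts of speech}
    chunker: Callable                 # tags -> possible phrase structures
    role_identifier: Callable         # first identification module: phrases -> {role: term}
    combination_identifier: Callable  # second identification module: roles -> [{role: sense}]
    history: list = field(default_factory=list)  # senses inferred for previous sentences

    def infer(self, combos):
        # Sense inference module. Assumed rule: pick the combination that
        # reuses the most senses already seen in previous sentences.
        seen = {sense for prev in self.history for sense in prev.values()}
        return max(combos, key=lambda c: sum(s in seen for s in c.values()), default={})

    def disambiguate(self, sentence):
        terms = sentence.lower().rstrip(".").split()
        tags = self.pos_tagger(terms)
        phrases = self.chunker(tags)
        roles = self.role_identifier(phrases)
        combos = self.combination_identifier(roles)
        senses = self.infer(combos)
        self.history.append(senses)
        return senses


# Stub modules (assumptions) standing in for a real tagger, chunker and identifiers.
system = WordSenseDisambiguationSystem(
    pos_tagger=lambda terms: {t: ["noun"] for t in terms},
    chunker=lambda tags: [list(tags)],
    role_identifier=lambda phrases: {"subject": phrases[0][0]},
    combination_identifier=lambda roles: [
        {"subject": roles["subject"] + "/riverside"},
        {"subject": roles["subject"] + "/institution"},
    ],
    history=[{"object": "bank/institution"}],
)
result = system.disambiguate("Bank.")
```

Passing the modules in as callables keeps the sketch independent of any particular tagger or chunker. A real system would replace each stub with a component that does actual linguistic analysis.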
15. A computer program product comprising computer executable instructions embodied in a computer-readable medium, wherein said computer program product comprises:
a first computer parsable program code for parsing a natural language sentence;
a second computer parsable program code for identifying one or more possible parts of speech for each term in said natural language sentence;
a third computer parsable program code for identifying one or more possible phrase structures in the natural language sentence;
a fourth computer parsable program code for identifying terms comprising one or more linguistic roles in the natural language sentence by generating declared patterns;
a fifth computer parsable program code for identifying possible sense combinations for said identified terms with said linguistic roles in the natural language sentence; and
a sixth computer parsable program code for inferring possible senses for each term with identified linguistic roles in the natural language sentence and previous sentences.
Specification