Context-based disambiguation of acronyms and abbreviations
First Claim
1. A method for context-based disambiguation of abbreviations, comprising:
determining a target abbreviation and one or more keywords appearing in context with the target abbreviation in a received passage, the target abbreviation representing a shortened form of one or more words;
generating automatically by a processor, a contextual search query comprising the target abbreviation and said one or more keywords;
searching, by the processor, a pseudo document index for one or more expansions of the target abbreviation by invoking the contextual search query, the pseudo document index containing index of one or more pseudo documents by titles, associated one or more abbreviations, and associated context keywords, wherein the titles are the expansions of the abbreviations contained in the pseudo documents respectively;
returning one or more target pseudo documents associated with the target abbreviation based on the searching of the pseudo document index; and
providing one or more expansions associated with the target abbreviation based on the returned one or more target pseudo documents,
wherein the determining the target abbreviation and one or more keywords appearing in context with the target abbreviation in a received passage comprises generating one or more features that capture lexical and syntactic properties of the passage, and recognizing said target abbreviation and said one or more keywords appearing in context with the target abbreviation in the received passage based on the captured lexical and syntactic properties.
Abstract
Context-based disambiguation of acronyms and/or abbreviations may determine a target abbreviation and one or more keywords appearing in context with the target abbreviation in a received passage, the target abbreviation representing a shortened form of one or more words. A contextual search query including the target abbreviation and said one or more keywords may be generated. A pseudo document index may be searched for one or more expansions of the target abbreviation by invoking the contextual search query, the pseudo document index containing an index of one or more pseudo documents, associated one or more abbreviations, and associated context keywords. One or more pseudo documents associated with the target abbreviation may be returned based on the searching of the pseudo document index.
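The lookup flow described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the in-memory list `PSEUDO_DOCS`, the example entries, the function names `build_query`, `search`, and `expand`, and the scoring rule (summed keyword counts), which stands in for whatever ranking a real search index would apply.

```python
from collections import Counter

# Hypothetical pseudo documents: the title is the expansion, stored with
# the abbreviation it expands and context keywords seen with it.
PSEUDO_DOCS = [
    {"title": "automated teller machine", "abbreviations": {"atm"},
     "keywords": Counter({"bank": 3, "cash": 2, "card": 2, "withdraw": 1})},
    {"title": "asynchronous transfer mode", "abbreviations": {"atm"},
     "keywords": Counter({"network": 3, "cell": 2, "switch": 2, "bandwidth": 1})},
]


def build_query(abbreviation, keywords):
    """Combine the target abbreviation with its context keywords."""
    return {"abbreviation": abbreviation.lower(),
            "keywords": [k.lower() for k in keywords]}


def search(query, docs=PSEUDO_DOCS):
    """Return (expansion, score) pairs for pseudo documents that contain
    the abbreviation, best match first. The score is an assumed, very
    simple one: the sum of the stored counts of the query keywords."""
    hits = []
    for doc in docs:
        if query["abbreviation"] not in doc["abbreviations"]:
            continue
        score = sum(doc["keywords"][k] for k in query["keywords"])
        hits.append((doc["title"], score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def expand(abbreviation, keywords):
    """Return candidate expansions (pseudo document titles), best first."""
    return [title for title, _ in search(build_query(abbreviation, keywords))]
```

Under these assumptions, `expand("ATM", ["cash", "bank"])` ranks "automated teller machine" first, while `expand("ATM", ["network", "switch"])` ranks "asynchronous transfer mode" first.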
7 Claims
1. A method for context-based disambiguation of abbreviations, comprising:
determining a target abbreviation and one or more keywords appearing in context with the target abbreviation in a received passage, the target abbreviation representing a shortened form of one or more words;
generating automatically by a processor, a contextual search query comprising the target abbreviation and said one or more keywords;
searching, by the processor, a pseudo document index for one or more expansions of the target abbreviation by invoking the contextual search query, the pseudo document index containing index of one or more pseudo documents by titles, associated one or more abbreviations, and associated context keywords, wherein the titles are the expansions of the abbreviations contained in the pseudo documents respectively;
returning one or more target pseudo documents associated with the target abbreviation based on the searching of the pseudo document index; and
providing one or more expansions associated with the target abbreviation based on the returned one or more target pseudo documents,
wherein the determining the target abbreviation and one or more keywords appearing in context with the target abbreviation in a received passage comprises generating one or more features that capture lexical and syntactic properties of the passage, and recognizing said target abbreviation and said one or more keywords appearing in context with the target abbreviation in the received passage based on the captured lexical and syntactic properties.
Dependent claims: 2, 3, 4, 5, 6
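The feature-generation step recited in claim 1 can be sketched as follows. This is only an illustration under stated assumptions: the feature names, the tokenizer pattern `TOKEN_RE`, and the function names are hypothetical, neighboring tokens are used as a crude stand-in for syntactic properties (a real system might use part-of-speech tags), and the hand-written rule in `looks_like_abbreviation` takes the place of the trained recognizer that the claims describe.

```python
import re

# Assumed tokenizer: alphanumeric runs, optionally joined by '.' or '-'.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+(?:[.\-][A-Za-z0-9]+)*")


def token_features(tokens, i):
    """Build a feature dictionary for the token at position i."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_upper": tok.isupper(),
        "length": len(tok),
        "has_digit": any(c.isdigit() for c in tok),
        "has_period": "." in tok,
        # Neighboring tokens approximate local syntactic context.
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i + 1 < len(tokens) else "</s>",
    }


def looks_like_abbreviation(feats):
    """Placeholder rule standing in for a trained classifier."""
    return feats["is_upper"] and 2 <= feats["length"] <= 6


def analyze(passage):
    """Tokenize a passage and compute features for every token."""
    tokens = TOKEN_RE.findall(passage)
    feats = [token_features(tokens, i) for i in range(len(tokens))]
    return tokens, feats


def find_targets(passage):
    """Return the tokens that the placeholder rule flags as abbreviations."""
    tokens, feats = analyze(passage)
    return [t for t, f in zip(tokens, feats) if looks_like_abbreviation(f)]
```

In a fuller pipeline, the feature dictionaries would be passed to a learned model rather than to a fixed rule, and the same features could be used to pick out the context keywords.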
7. A method for context-based disambiguation of abbreviations, comprising:
generating an abbreviation expansion dictionary by identifying a set of abbreviations with associated potential expansions;
generating, by a processor, a pseudo document for each expansion identified in the abbreviation expansion dictionary, the pseudo document comprising an abbreviation, associated expansion and one or more words that occur with said abbreviation, the generated pseudo document having a title corresponding to the associated expansion of the abbreviation that the pseudo document contains, the pseudo document generated at least by extracting data from sources that contain language commonly occurring with the expansion;
generating a pseudo document index indexing said abbreviation and said associated expansion; and
generating a machine learning classification model by generating one or more features that capture lexical and syntactic properties of a received passage, and building the machine learning classification model for recognizing one or more target abbreviations and one or more target keywords appearing in context with the target abbreviation in the received passage.
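The first three steps of claim 7 (dictionary, pseudo documents, index) can be sketched as below. The data layout, the `STOPWORDS` set, and the function names `build_pseudo_document` and `build_index` are assumptions for illustration only; the source texts are supplied by the caller, and the machine learning classification model from the final step is not covered here.

```python
from collections import Counter, defaultdict

# Assumed small stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "to", "in", "at", "and", "is", "for", "from"}


def build_pseudo_document(abbreviation, expansion, source_texts):
    """Create one pseudo document titled with the expansion, holding the
    abbreviation and counts of words that occur in the source texts."""
    words = Counter()
    for text in source_texts:
        for raw in text.lower().split():
            word = raw.strip(".,;:()\"'")
            if word and word not in STOPWORDS:
                words[word] += 1
    return {"title": expansion,
            "abbreviation": abbreviation.lower(),
            "words": words}


def build_index(dictionary, sources):
    """Map each abbreviation to the pseudo documents of its expansions.

    dictionary: abbreviation -> list of candidate expansions
    sources:    expansion -> list of texts where the expansion is discussed
    """
    index = defaultdict(list)
    for abbreviation, expansions in dictionary.items():
        for expansion in expansions:
            doc = build_pseudo_document(
                abbreviation, expansion, sources.get(expansion, []))
            index[doc["abbreviation"]].append(doc)
    return dict(index)
```

Keying the index by lowercased abbreviation keeps lookup simple; a production index would more likely be a full-text search engine that also indexes the titles and context words.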
Specification