Acronym extraction system and method of identifying acronyms and extracting corresponding expansions from text

  • US 7,236,923 B1
  • Filed: 08/07/2002
  • Issued: 06/26/2007
  • Est. Priority Date: 08/07/2002
  • Status: Active Grant
First Claim

1. A system for identifying abbreviated terms within text each representing a corresponding phrase of at least one term and extracting expansions of said abbreviated terms from said text in the form of said corresponding phrases comprising:

  • a processing system to receive said text and identify abbreviated terms and corresponding expansions therein, said processing system including;

    an identification module to examine said text to identify at least one abbreviated term residing therein;

    an expansion retrieval module to retrieve at least one portion of said text for an identified abbreviated term, wherein each retrieved text portion is located within said text proximate said identified abbreviated term; and

    an expansion extraction module to compare said identified abbreviated term with at least one corresponding retrieved text portion to extract an expansion therefrom for said abbreviated term and to verify said extracted expansion to produce a valid expansion for said identified abbreviated term, wherein said expansion extraction module includes;

    an expansion initialization module to examine a retrieved text portion for said identified abbreviated term and selectively produce at least one subset of said retrieved text portion for identifying and extracting said expansion, wherein said at least one subset is produced based on a comparison of an initial portion of said identified abbreviated term with initial portions of terms within said retrieved text portion; and

    a search module to iteratively scan a retrieved text portion subset and compare successive portions of said identified abbreviated term to at least one term within a corresponding search window for said abbreviated term portion to identify corresponding expansion terms within that subset for said abbreviated term portions, wherein said search window includes a predetermined number of terms from said retrieved text portion subset and is movable within that subset, and wherein a current abbreviated term portion and corresponding search window are respectively combined for a subsequent scan iteration with at least one prior abbreviated term portion and corresponding search window that identify a corresponding expansion term in response to failure of said current abbreviated term portion and corresponding search window to identify a corresponding expansion term.
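The modules recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, only one possible reading of the claim language under stated assumptions: abbreviated terms are taken to be runs of two or more capital letters, the retrieved text portion is the words immediately preceding the abbreviation, a "portion" of the abbreviated term is a single letter, the search window holds two terms, and verification requires the expansion to end at the word adjacent to the abbreviation. All function names and parameters are hypothetical.

```python
import re

ACRONYM_RE = re.compile(r"\b[A-Z]{2,}\b")  # assumption: acronyms are all-caps tokens
WORD_RE = re.compile(r"[A-Za-z]+")


def find_acronyms(text):
    """Identification module: return (acronym, offset) pairs found in the text."""
    return [(m.group(), m.start()) for m in ACRONYM_RE.finditer(text)]


def preceding_terms(text, pos, max_terms):
    """Expansion retrieval module: the words located just before the acronym."""
    return WORD_RE.findall(text[:pos])[-max_terms:]


def candidate_subsets(acronym, terms):
    """Expansion initialization module: one subset per term whose initial
    letter matches the initial letter of the acronym."""
    first = acronym[0].lower()
    return [terms[i:] for i, t in enumerate(terms) if t[0].lower() == first]


def contains_in_order(term, letters):
    """True if the term starts with letters[0] and holds the rest in order."""
    term = term.lower()
    if not term.startswith(letters[0]):
        return False
    idx = 0
    for ch in letters:
        idx = term.find(ch, idx)
        if idx < 0:
            return False
        idx += 1
    return True


def match_subset(acronym, subset, window=2):
    """Search module: match each letter against a movable window of terms.
    If a letter finds no term, combine it with the prior matched letter(s)
    and retry against the term that the prior letter(s) matched."""
    matched = []  # entries are [letters, index of the matching term]
    pos = 0
    for ch in acronym.lower():
        hit = None
        for j in range(pos, min(pos + window, len(subset))):
            if subset[j].lower().startswith(ch):
                hit = j
                break
        if hit is not None:
            matched.append([ch, hit])
            pos = hit + 1
            continue
        if not matched:
            return None
        combined = matched[-1][0] + ch
        if not contains_in_order(subset[matched[-1][1]], combined):
            return None
        matched[-1][0] = combined
    # Verification (assumption): the expansion must end right before the acronym.
    if matched[-1][1] != len(subset) - 1:
        return None
    return " ".join(subset)


def extract_expansions(text, window=2):
    """Run all modules and return a dict mapping acronym to expansion."""
    found = {}
    for acronym, pos in find_acronyms(text):
        terms = preceding_terms(text, pos, max_terms=2 * len(acronym))
        # Try the subsets closest to the acronym first.
        for subset in reversed(candidate_subsets(acronym, terms)):
            expansion = match_subset(acronym, subset, window)
            if expansion:
                found[acronym] = expansion
                break
    return found
```

For example, in "We use a database management system (DBMS) daily." the letter B finds no term of its own within the window, so it is combined with the prior letter D and both are matched against "database", after which M and S match "management" and "system".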
