Adaptive and scalable method for resolving natural language ambiguities

  • US 7,475,010 B2
  • Filed: 09/02/2004
  • Issued: 01/06/2009
  • Est. Priority Date: 09/03/2003
  • Status: Active Grant
First Claim

1. A method for resolving natural language ambiguities within text documents on a computer system comprising a processor and memory that would cause the processor to perform the following, comprising the steps of:

    i. training probabilistic classifiers from annotated training data containing a sense tag for each polysemous word;

    ii. processing said text documents into tokens and determining their part-of-speech tags;

    iii. computing a measure of confidence using said probabilistic classifiers for each known sense of said tokens defined within a semantic lexicon based on contextual features and assigning a default sense for tokens absent from said semantic lexicon based on their part-of-speech tags;

    iv. determining assignment of word senses for each said token in said sentence such that the combined probability across said sentence is maximized; and

    v. integrating additional contextual features as generated by one or more of the following natural language processing modules into said probabilistic classifiers whereby said measure of confidence is improved;

    using a chunking module to identify multi-word phrases and the associated measure of confidence for each phrase;

    using a named-entity recognition module to identify named entities and the associated measure of confidence for each entity;

    using a syntactic parsing module to construct sentential parse trees and the associated measure of confidence for each tree;

    using an anaphora resolution module to identify anaphor references and the associated measure of confidence for each reference;

    using a discourse categorization module to determine document categories and the associated measure of confidence for each category;

    using a discourse structure analysis module to determine discourse structures and the associated measure of confidence for each structure.
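
Steps iii and iv of the claim can be illustrated with a small sketch. The claim does not say how the combined probability across a sentence is computed or maximized, so the code below makes its own assumptions: the combined probability is taken to be the product of each token's sense confidence and a pairwise compatibility score between adjacent senses, and the maximum is found with a Viterbi-style dynamic program. The lexicon, the default senses, the confidence values, and all function names are hypothetical stand-ins for trained classifiers, not the patented implementation.

```python
import math

# Hypothetical semantic lexicon: word -> known senses.
LEXICON = {
    "bank": ["bank/finance", "bank/river"],
    "deposit": ["deposit/money", "deposit/sediment"],
}

# Hypothetical default senses for tokens absent from the lexicon, keyed by POS tag.
DEFAULT_SENSE_BY_POS = {"NN": "noun/unknown", "VB": "verb/unknown", "DT": "function-word"}

# Toy numbers standing in for the output of trained probabilistic classifiers.
TOY_CONFIDENCE = {
    ("bank", "bank/finance"): 0.6, ("bank", "bank/river"): 0.4,
    ("deposit", "deposit/money"): 0.5, ("deposit", "deposit/sediment"): 0.5,
}
TOY_COMPATIBILITY = {
    ("bank/finance", "deposit/money"): 0.9, ("bank/finance", "deposit/sediment"): 0.1,
    ("bank/river", "deposit/money"): 0.1, ("bank/river", "deposit/sediment"): 0.9,
}


def toy_confidence(token, sense):
    return TOY_CONFIDENCE[(token, sense)]


def toy_compatibility(prev_sense, sense):
    # Neutral score for sense pairs the toy table does not cover.
    return TOY_COMPATIBILITY.get((prev_sense, sense), 0.5)


def candidate_senses(token, pos, confidence):
    """Step iii (sketch): confidence per known sense, or a POS-based default sense."""
    senses = LEXICON.get(token)
    if senses is None:
        return {DEFAULT_SENSE_BY_POS.get(pos, "unknown"): 1.0}
    return {sense: confidence(token, sense) for sense in senses}


def best_assignment(tagged_tokens, confidence, compatibility):
    """Step iv (sketch): pick one sense per token so the combined probability is maximal.

    All probabilities must be strictly positive because the search works in log space.
    Returns (list of senses, combined probability).
    """
    if not tagged_tokens:
        return [], 1.0
    columns = [candidate_senses(tok, pos, confidence) for tok, pos in tagged_tokens]
    # layers[i][sense] = (best log-probability of a path ending in sense, previous sense)
    layers = [{s: (math.log(p), None) for s, p in columns[0].items()}]
    for column in columns[1:]:
        prev = layers[-1]
        layer = {}
        for sense, p in column.items():
            layer[sense] = max(
                (score + math.log(compatibility(ps, sense)) + math.log(p), ps)
                for ps, (score, _) in prev.items()
            )
        layers.append(layer)
    # Backtrack from the best final sense.
    sense = max(layers[-1], key=lambda s: layers[-1][s][0])
    total = layers[-1][sense][0]
    path = [sense]
    for i in range(len(layers) - 1, 0, -1):
        sense = layers[i][sense][1]
        path.append(sense)
    path.reverse()
    return path, math.exp(total)
```

Calling `best_assignment([("the", "DT"), ("bank", "NN"), ("deposit", "NN")], toy_confidence, toy_compatibility)` returns one sense per token together with the combined probability of that assignment. If the compatibility term is dropped (a constant 1.0), the search reduces to choosing the highest-confidence sense for each token independently. Step v would correspond to feeding richer contextual features into whatever replaces `toy_confidence`.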
