System and method for adaptive sentence boundary disambiguation

  • US 8,131,546 B1
  • Filed: 12/28/2007
  • Issued: 03/06/2012
  • Est. Priority Date: 01/03/2007
  • Status: Active Grant
First Claim

1. A method for adaptive sentence boundary disambiguation, comprising:

    receiving, from a natural language processing system, a document containing text;

    identifying, by a first heuristic algorithm, sentence text in the document;

    identifying, by a second heuristic algorithm, non-sentence text in the document, wherein the second heuristic algorithm is operable to identify one of non-sentence texts in a group consisting of lists, tables, names of people, addresses, text without a sentence structure, text included as a list and spatially separated data as non-sentence text;

    parsing said non-sentence text into one or more logical constructs, wherein each logical construct comprises a set of words;

    inserting a disambiguator after each of said one or more logical constructs to define a sentence boundary for the logical construct based on one or more natural language structures; and

    sending the disambiguated document to the natural language processing system, wherein the disambiguated document consists of disambiguated sentences, each disambiguated sentence having a defined boundary and including related contextual information, and the disambiguator is used to signal the natural language processing system the presence of a logical construct to be evaluated independently of other logical constructs.
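
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the regular expressions used as the two heuristics, and the choice of a period as the disambiguator are all assumptions made for illustration, and the sketch does not attempt to attach the related contextual information that the claim requires of each disambiguated sentence.

```python
import re

# Assumption: a period is used as the disambiguator; the claim does not fix a marker.
DISAMBIGUATOR = "."

# Hypothetical patterns standing in for the claimed heuristics.
_BULLET = re.compile(r"^\s*(?:[-*\u2022]|\d+[.)])\s+")   # list markers
_CELL_SPLIT = re.compile(r"\t+|\s{2,}")                   # spatially separated data
_TERMINAL = re.compile(r"[.!?][\"')\]]*$")                # sentence-final punctuation


def is_sentence_text(line: str) -> bool:
    """First heuristic (illustrative): unlisted, unspaced text ending in . ! or ?"""
    stripped = line.strip()
    if not stripped or _BULLET.match(stripped) or _CELL_SPLIT.search(stripped):
        return False
    return bool(_TERMINAL.search(stripped))


def is_non_sentence_text(line: str) -> bool:
    """Second heuristic (illustrative): list items, table-like rows, unpunctuated text."""
    stripped = line.strip()
    if not stripped:
        return False
    return bool(
        _BULLET.match(stripped)
        or _CELL_SPLIT.search(stripped)
        or not _TERMINAL.search(stripped)
    )


def parse_constructs(line: str) -> list[list[str]]:
    """Split non-sentence text into logical constructs, each a list of words."""
    body = _BULLET.sub("", line.strip())
    cells = [cell for cell in _CELL_SPLIT.split(body) if cell]
    return [cell.split() for cell in cells]


def disambiguate(document: str) -> str:
    """Return the document with a disambiguator after every logical construct."""
    out = []
    for line in document.splitlines():
        if is_non_sentence_text(line):
            for words in parse_constructs(line):
                text = " ".join(words).rstrip(".!?")
                if text:
                    out.append(text + DISAMBIGUATOR)
        elif line.strip():
            out.append(line.strip())  # sentence text passes through unchanged
    return "\n".join(out)
```

Under these assumptions, a tab-separated row such as `Name<TAB>John Smith` is emitted as two separately bounded constructs, `Name.` and `John Smith.`, so that a downstream natural language processing system can evaluate each one independently of the others.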
