System and method for document section segmentation

  • US 7,818,308 B2
  • Filed: 09/07/2007
  • Issued: 10/19/2010
  • Est. Priority Date: 10/01/2003
  • Status: Active Grant
First Claim

1. An automated computer implemented method for categorizing document section headings comprising the steps of:

  • determining a set of canonical section headings from a set of documents;

  • establishing a data set containing said canonical section headings and information associating at least one of the canonical section headings with at least one other section heading that is different from the at least one of the canonical section headings but corresponds to the at least one of the canonical section headings;

  • extracting at least one section heading from another document;

  • transforming the extracted section heading into a plurality of n-grams; and

  • associating the extracted section heading with a particular one of the canonical section headings in the data set if said plurality of n-grams have a predetermined level of similarity to said particular one of the canonical section headings.
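
The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not specify the n-gram type, the similarity measure, or the threshold, so the choices here are assumptions: character trigrams, Jaccard similarity, and a cutoff of 0.5. The example headings in `CANONICAL` and the function names `char_ngrams`, `jaccard`, and `categorize` are hypothetical. Comparing against the stored variant headings as well as the canonical heading itself is also an assumption of this sketch.

```python
from typing import Dict, List, Optional, Set


def char_ngrams(text: str, n: int = 3) -> Set[str]:
    """Return the set of character n-grams of a normalized heading."""
    # Normalize case and whitespace, then pad so word boundaries form n-grams.
    normalized = " ".join(text.lower().split())
    padded = f" {normalized} "
    if len(padded) < n:
        return {padded}
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two n-gram sets (assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical data set: each canonical heading is associated with other
# headings that differ from it but correspond to it.
CANONICAL: Dict[str, List[str]] = {
    "HISTORY OF PRESENT ILLNESS": ["HPI", "HISTORY OF THE PRESENT ILLNESS"],
    "MEDICATIONS": ["CURRENT MEDICATIONS", "MEDS"],
    "ALLERGIES": ["DRUG ALLERGIES"],
}


def categorize(
    heading: str,
    data_set: Dict[str, List[str]] = CANONICAL,
    threshold: float = 0.5,  # assumed "predetermined level of similarity"
    n: int = 3,
) -> Optional[str]:
    """Map an extracted heading to a canonical heading, or None if no match."""
    grams = char_ngrams(heading, n)
    best_label: Optional[str] = None
    best_score = 0.0
    for canonical, variants in data_set.items():
        # Score against the canonical heading and each associated variant.
        for candidate in [canonical, *variants]:
            score = jaccard(grams, char_ngrams(candidate, n))
            if score > best_score:
                best_label, best_score = canonical, score
    return best_label if best_score >= threshold else None
```

In this sketch a misspelled heading such as "History of Present Illnes" still shares most of its trigrams with the canonical form, so it clears the threshold, while an unrelated heading returns `None` and is left uncategorized.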
