
SYSTEM AND METHOD FOR DOCUMENT SECTION SEGMENTATION

  • US 20080059498A1
  • Filed: 09/07/2007
  • Published: 03/06/2008
  • Est. Priority Date: 10/01/2003
  • Status: Active Grant
First Claim

1. An automated computer implemented method for categorizing document section headings in a plurality of documents comprising the steps of:

  • determining a set of canonical section headings from a subset of said plurality of documents;

  • establishing a data base containing said canonical section headings;

  • extracting each section heading from the remainder of the documents, transforming said section headings into a plurality of n-grams; and

  • associating a particular section heading with a particular canonical section heading in the data base if said n-grams associated with the section heading reach a predetermined level of similarity to said canonical section headings.
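
The matching step recited in the claim can be illustrated with a minimal Python sketch. The claim does not specify the kind of n-gram, the similarity measure, or the threshold, so everything concrete here is an assumption made for illustration: character trigrams, the Dice coefficient, a threshold of 0.6, and the hypothetical names `char_ngrams`, `dice_similarity`, and `match_heading`. An in-memory list stands in for the data base of canonical headings.

```python
from typing import Iterable, Optional, Set


def char_ngrams(text: str, n: int = 3) -> Set[str]:
    """Return the set of character n-grams of a heading.

    Assumption: headings are lowercased, whitespace is collapsed, and the
    text is padded with spaces so that word boundaries produce n-grams.
    """
    normalized = " ".join(text.lower().split())
    if not normalized:
        return set()
    pad = " " * (n - 1)
    padded = pad + normalized + pad
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def dice_similarity(a: Set[str], b: Set[str]) -> float:
    """Dice coefficient between two n-gram sets (an assumed measure)."""
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def match_heading(
    heading: str,
    canonical_headings: Iterable[str],
    threshold: float = 0.6,  # assumed "predetermined level of similarity"
    n: int = 3,
) -> Optional[str]:
    """Return the best-matching canonical heading, or None if no
    candidate reaches the threshold."""
    grams = char_ngrams(heading, n)
    best, best_score = None, 0.0
    for candidate in canonical_headings:
        score = dice_similarity(grams, char_ngrams(candidate, n))
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


# Stand-in for the data base of canonical section headings.
CANONICAL = ["Family History", "Physical Examination"]

print(match_heading("Famly History", CANONICAL))
print(match_heading("zzzz qqqq", CANONICAL))
```

In this sketch a misspelled heading still shares most of its trigrams with the intended canonical heading, while an unrelated string shares none and is left unassociated. A different n, similarity measure, or threshold would change which headings are associated.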
