System and method for identifying document structure and associated metainformation

  • US 7,937,338 B2
  • Filed: 04/30/2008
  • Issued: 05/03/2011
  • Est. Priority Date: 04/30/2008
  • Status: Active Grant
First Claim

1. A computer-implemented method, comprising:

  • receiving at least one document;

  • identifying sections and associated section types within said at least one document;

  • identifying sub-sections within said at least one document;

  • defining new section types and new sub-section heading constructs when sections having known section types are identified; and

  • learning new section heading keywords when sections having known section types are identified, wherein learning new section heading keywords comprises:

      • receiving new section headings;

      • receiving at least one stop word;

      • filtering said at least one stop word from new section headings and ingesting new section heading keywords; and

      • outputting section heading keywords.
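
The keyword-learning sub-steps recited above (receive headings, receive stop words, filter, output) can be illustrated with a minimal Python sketch. This is only one possible reading of the claim language, not the patented implementation: the function name `learn_heading_keywords`, the lowercase alphanumeric tokenization, the de-duplication, and the sample headings and stop words are all assumptions made for illustration.

```python
import re


def learn_heading_keywords(new_headings, stop_words):
    """Hypothetical helper: filter stop words out of new section headings
    and return the remaining words as section heading keywords.

    Assumptions (not from the claim): tokens are lowercase alphanumeric
    runs, and each keyword is reported once, in first-seen order.
    """
    stop = {word.lower() for word in stop_words}
    keywords = []
    seen = set()
    for heading in new_headings:
        # Split the heading into lowercase word tokens.
        for token in re.findall(r"[a-z0-9]+", heading.lower()):
            # Skip stop words and keywords that were already ingested.
            if token in stop or token in seen:
                continue
            seen.add(token)
            keywords.append(token)
    return keywords


if __name__ == "__main__":
    # Illustrative inputs only; the claim does not specify any of these.
    headings = ["Summary of Findings", "Plan and Recommendations"]
    stop_words = ["of", "and", "the"]
    print(learn_heading_keywords(headings, stop_words))
```

Under these assumptions the sketch treats "ingesting" as accumulating the surviving tokens into a list; a real system might instead weight keywords or associate them with the known section type that triggered the learning step.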
