System and method for processing document

  • US 10,540,426 B2
  • Filed: 07/11/2012
  • Issued: 01/21/2020
  • Est. Priority Date: 07/11/2011
  • Status: Active Grant
First Claim

1. A computer-implemented method comprising:

    parsing, on a processor, text from at least a portion of a document, wherein the document does not include meta-data indicating a hierarchical structure within the document;

    identifying, on the processor, a plurality of hierarchical indicators from the text parsed from the at least a portion of the document, wherein each hierarchical indicator includes one or more portions, the one or more portions including a prefix, a stem, or a suffix;

    analyzing, on the processor, the one or more portions of each of the plurality of hierarchical indicators to determine an alphanumerical value associated with each of the plurality of hierarchical indicators;

    analyzing, on the processor, the one or more portions of each of the plurality of hierarchical indicators to determine an alphanumerical numbering style associated with each of the plurality of hierarchical indicators;

    determining, on the processor, a hierarchical level for each of the plurality of hierarchical indicators based upon, at least in part, one or more of the alphanumerical value and the alphanumerical numbering style associated with each of the plurality of hierarchical indicators, wherein the hierarchical level indicates a respective position within the hierarchical structure of the document and is determined for a respective hierarchical indicator prior to determining the hierarchical level associated with a following hierarchical indicator, wherein determining a hierarchical level associated with each of the plurality of hierarchical indicators further includes:

    determining whether a current hierarchical indicator follows a preceding hierarchical indicator based upon, at least in part, one or more of the alphanumerical value and the alphanumerical numbering style associated with the current hierarchical indicator and the preceding hierarchical indicator, and

    determining an alternative interpretation of one or more of the determined alphanumerical value and the determined alphanumerical numbering style associated with the preceding hierarchical indicator in response to determining that the current hierarchical indicator does not follow the preceding hierarchical indicator;

    associating, on the processor, one or more portions of the document with the respective determined hierarchical level associated with each of the plurality of hierarchical indicators to determine a hierarchical structure for the document, wherein associating the one or more portions of the document with the respective hierarchical level includes associating meta-data with the one or more portions of the document, wherein the meta-data indicates the respective hierarchical level.
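
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, only one possible reading of the claim under stated assumptions: an indicator is taken to be an optional prefix, a stem, and a required suffix at the start of a line; "follows" is taken to mean same numbering style with a value one higher than an open level; a value of 1 that does not follow anything is taken to open a deeper level. All names (`INDICATOR`, `interpretations`, `place`, `build_structure`) are hypothetical and do not come from the patent.

```python
import re

# Assumed indicator shape: optional prefix "(" or "[", a stem, a required suffix.
INDICATOR = re.compile(
    r"^\s*(?P<prefix>[(\[]?)"
    r"(?P<stem>\d+|[ivxlcdm]+|[IVXLCDM]+|[A-Za-z])"
    r"(?P<suffix>[.)\]])(?:\s|$)"
)
ROMAN = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}


def roman_to_int(stem):
    """Convert a Roman numeral; malformed numerals are not rejected in this sketch."""
    total, prev = 0, 0
    for ch in reversed(stem.lower()):
        value = ROMAN[ch]
        total += -value if value < prev else value
        prev = max(prev, value)
    return total


def interpretations(line):
    """Return every (style, value) reading of the indicator that starts `line`."""
    m = INDICATOR.match(line)
    if m is None:
        return []
    prefix, stem, suffix = m.group("prefix", "stem", "suffix")
    if stem.isdigit():
        return [(("decimal", prefix, suffix), int(stem))]
    case = "lower" if stem.islower() else "upper"
    readings = []
    if len(stem) == 1:  # a single letter can be read alphabetically
        value = ord(stem.lower()) - ord("a") + 1
        readings.append(((case + "-alpha", prefix, suffix), value))
    if all(ch in ROMAN for ch in stem.lower()):  # and possibly as a Roman numeral
        readings.append(((case + "-roman", prefix, suffix), roman_to_int(stem)))
    return readings


def place(stack, readings):
    """Return (new_stack, index of the reading used), or None if no reading fits."""
    # A reading "follows" when an open level holds the same style with value - 1.
    for idx, (style, value) in enumerate(readings):
        for k in range(len(stack) - 1, -1, -1):
            if stack[k] == (style, value - 1):
                return stack[:k] + [(style, value)], idx
    # Otherwise a value of 1 is assumed to open a new, deeper level.
    for idx, (style, value) in enumerate(readings):
        if value == 1:
            return stack + [(style, value)], idx
    return None


def build_structure(lines):
    """Tag each line with a level, deciding each indicator before the next one."""
    stack, records, prev = [], [], None
    for line in lines:
        readings = interpretations(line)
        if not readings:
            # Plain text inherits the level of the most recent indicator.
            records.append({"level": len(stack), "text": line})
            continue
        placed = place(stack, readings)
        if placed is None and prev is not None:
            # Current indicator does not follow: retry with an alternative
            # interpretation of the preceding indicator.
            before, untried, first = prev
            for alt in untried:
                redo = place(before, [alt])
                again = place(redo[0], readings) if redo else None
                if again:
                    for rec in records[first:]:
                        rec["level"] = len(redo[0])
                    stack, placed = redo[0], again
                    break
        if placed is None:
            # Fallback assumption: treat the unmatched indicator as a deeper level.
            placed = (stack + [readings[0]], 0)
        new_stack, used = placed
        prev = (stack, readings[:used] + readings[used + 1:], len(records))
        stack = new_stack
        records.append({"level": len(stack), "text": line})  # level as meta-data
    return records


lines = (["1. Definitions"] + [f"{c}) item" for c in "abcdefgh"]
         + ["i) first sub-item", "ii) second sub-item", "2. Scope"])
levels = [rec["level"] for rec in build_structure(lines)]
# levels == [1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 1]
```

In this example "i)" is first read as the ninth alphabetic item after "h)". When "ii)" fails to follow it, the sketch reinterprets "i)" as the Roman numeral 1, which opens a third level that "ii)" then continues.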
