System and method to facilitate the association of structured content in a structured document with unstructured content in an unstructured document

  • US 9,684,691 B1
  • Filed: 08/25/2015
  • Issued: 06/20/2017
  • Est. Priority Date: 08/30/2012
  • Status: Active Grant
First Claim

1. A system configured to facilitate associating structured content in a structured document with unstructured content in an unstructured document, the system comprising:

  • one or more processors configured to:

    obtain an unstructured document, and generate a structured document from the unstructured document, wherein the structured document has content related to the unstructured document, wherein the structured document is generated from the unstructured document by:

      analyzing human-readable textual content in the unstructured document,

      segmenting the human-readable textual content into individual unstructured content fragments based on the analysis, the unstructured content fragments being contiguous sections of textual content in the unstructured document,

      applying tags to the unstructured content fragments, and

      generating the structured document based on the tags and the unstructured content fragments such that the segmented unstructured content fragments from the unstructured document correspond to the structured content fragments in the structured document;

    identify numeric instances present in the structured document and the unstructured document, wherein the numeric instances include a first set of numeric instances present in the structured document and a second set of numeric instances present in the unstructured document, wherein the first set of numeric instances includes a first numeric instance appearing in the structured document within a first structured content fragment;

    determine uniqueness of the individual ones of the first set of numeric instances present in the structured document, wherein a unique numeric instance expresses a unique first number; and

    correlate structured content fragments in the structured document with unstructured content fragments in the unstructured document responsive to the first numeric instance being unique, such that the first numeric instance in the structured document is correlated with an unstructured content fragment in the unstructured document.
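
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not say how text is segmented, how tags are chosen, or how numbers are recognized, so the paragraph-based segmentation, the `fragment-N` tags, the number pattern, and every function name below are assumptions made for illustration only.

```python
import re
from collections import Counter

# Assumed number format: digits with optional thousands separators and decimals.
NUMBER_RE = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?")


def segment(text):
    """Split text into contiguous fragments (assumption: blank-line-separated paragraphs)."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def build_structured(fragments):
    """Tag each fragment and emit one structured record per fragment (hypothetical schema)."""
    return [{"tag": f"fragment-{i}", "text": frag} for i, frag in enumerate(fragments)]


def numbers_in(text):
    """Return the numeric instances in text, normalized by removing thousands separators."""
    return [m.group().replace(",", "") for m in NUMBER_RE.finditer(text)]


def correlate(structured, fragments):
    """Link each number that is unique in the structured document to the single
    unstructured fragment containing it; repeated numbers are skipped as ambiguous."""
    counts = Counter(n for rec in structured for n in numbers_in(rec["text"]))
    links = {}
    for rec in structured:
        for n in numbers_in(rec["text"]):
            if counts[n] != 1:
                continue  # not unique in the structured document
            matches = [i for i, frag in enumerate(fragments) if n in numbers_in(frag)]
            if len(matches) == 1:
                links[n] = (rec["tag"], matches[0])
    return links


unstructured = (
    "Revenue rose to 4,812 in 2012.\n\n"
    "Costs were 1,250 in 2012.\n\n"
    "Headcount reached 37."
)
fragments = segment(unstructured)
structured = build_structured(fragments)
links = correlate(structured, fragments)
print(links)
# {'4812': ('fragment-0', 0), '1250': ('fragment-1', 1), '37': ('fragment-2', 2)}
```

In this example the number 2012 appears in two fragments, so it is not unique and produces no link, while 4812, 1250, and 37 each anchor one structured fragment to one unstructured fragment.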
