System and method to facilitate the association of structured content in a structured document with unstructured content in an unstructured document
Abstract
This disclosure relates to facilitating the association of structured content in a structured document with unstructured content in an unstructured document. The system described herein may be configured to facilitate the association by linking numeric instances in the structured document to corresponding numeric instances in the unstructured document. In some implementations, the system may be configured to link the numeric instances in the structured document to the corresponding numeric instances in the unstructured document based on a uniqueness of the numeric instances in the structured document, structural information assigned to non-unique numeric instances, structural information assigned to unique numeric instances related to the non-unique numeric instances, unstructured contextual information related to non-unique numeric instances, and/or other information. In some implementations, the system may include one or more of one or more processors, a user interface, a display, electronic storage, and/or other components.
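As a rough illustration of the first two ideas in the abstract (finding numeric instances and deciding which of them are unique), the following Python sketch may help. It is not taken from the disclosure: the regular expression, the comma-stripping normalization, and the function names `numeric_instances` and `unique_values` are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumed pattern: numbers with thousands separators ("1,250") or plain
# integers/decimals ("980", "3.5"). The disclosure does not specify one.
NUMBER_RE = re.compile(r"\d{1,3}(?:,\d{3})+(?:\.\d+)?|\d+(?:\.\d+)?")


def numeric_instances(text):
    """Return (normalized_value, start_offset) for each number found in text."""
    return [
        (m.group().replace(",", ""), m.start())
        for m in NUMBER_RE.finditer(text)
    ]


def unique_values(instances):
    """Return the values that are expressed by exactly one instance."""
    counts = Counter(value for value, _ in instances)
    return {value for value, n in counts.items() if n == 1}


text = "Revenue was 1,250 in 2019 and 980 in 2018; costs were 980."
found = numeric_instances(text)
print([value for value, _ in found])
print(unique_values(found))
```

In this sample text, `980` appears twice and so would not be treated as unique; per the abstract, such non-unique instances would need structural or contextual information to be linked, which this sketch does not attempt.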
22 Claims
1. A system configured to facilitate associating structured content in a structured document with unstructured content in an unstructured document, the system comprising:
one or more processors configured to:
obtain an unstructured document, and generate a structured document from the unstructured document, wherein the structured document has content related to the unstructured document, wherein the structured document is generated from the unstructured document by:
analyzing human-readable textual content in the unstructured document,
segmenting the human-readable textual content into individual unstructured content fragments based on the analysis, the unstructured content fragments being contiguous sections of textual content in the unstructured document,
applying tags to the unstructured content fragments, and
generating the structured document based on the tags and the unstructured content fragments such that the segmented unstructured content fragments from the unstructured document correspond to the structured content fragments in the structured document;
identify numeric instances present in the structured document and the unstructured document, wherein the numeric instances include a first set of numeric instances present in the structured document and a second set of numeric instances present in the unstructured document, wherein the first set of numeric instances includes a first numeric instance appearing in the structured document within a first structured content fragment;
determine uniqueness of the individual ones of the first set of numeric instances present in the structured document, wherein a unique numeric instance expresses a unique first number; and
correlate structured content fragments in the structured document with unstructured content fragments in the unstructured document responsive to the first numeric instance being unique, such that the first numeric instance in the structured document is correlated with an unstructured content fragment in the unstructured document.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
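The document-generation steps recited in claim 1 (analyze, segment into contiguous fragments, apply tags, generate the structured document) could be sketched roughly as follows. This is only an illustrative reading: the blank-line segmentation rule, the two tag names `numeric` and `text`, the list-of-dicts output, and the function names are assumptions for the example and are not specified by the claim.

```python
import re


def segment(text):
    """Split text into contiguous fragments at blank lines (assumed rule)."""
    return [part.strip() for part in re.split(r"\n\s*\n", text) if part.strip()]


def tag_for(fragment):
    """Assign a hypothetical tag: 'numeric' if the fragment has a digit."""
    return "numeric" if re.search(r"\d", fragment) else "text"


def build_structured(text):
    """Generate a structured document as a list of tagged fragments.

    Each entry keeps the index of the unstructured fragment it came from,
    so structured and unstructured fragments correspond one to one.
    """
    return [
        {"id": i, "tag": tag_for(fragment), "text": fragment}
        for i, fragment in enumerate(segment(text))
    ]


doc = "Overview\n\nRevenue rose to 1,250.\n\nOutlook remains stable."
for entry in build_structured(doc):
    print(entry)
```

Keeping the fragment index in each entry is one simple way to preserve the correspondence between segmented fragments and structured fragments that the claim requires; a real system might emit XML or another tagged format instead.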
12. A method to facilitate associating structured content in a structured document with unstructured content in an unstructured document, the method comprising:
obtaining an unstructured document, and generating a structured document from the unstructured document, wherein the structured document has content related to the unstructured document, wherein the structured document is generated from the unstructured document by:
analyzing human-readable textual content in the unstructured document,
segmenting the human-readable textual content into individual unstructured content fragments based on the analysis, the unstructured content fragments being contiguous sections of textual content in the unstructured document,
applying tags to the unstructured content fragments, and
generating the structured document based on the tags and the unstructured content fragments such that the segmented unstructured content fragments from the unstructured document correspond to the structured content fragments in the structured document;
identifying numeric instances present in the structured document and the unstructured document, wherein the numeric instances include a first set of numeric instances present in the structured document and a second set of numeric instances present in the unstructured document, wherein the first set of numeric instances includes a first numeric instance appearing in the structured document within a first structured content fragment;
determining uniqueness of the individual ones of the first set of numeric instances present in the structured document, wherein a unique numeric instance expresses a unique first number; and
correlating structured content fragments in the structured document with unstructured content fragments in the unstructured document responsive to the first numeric instance being unique, such that the first numeric instance in the structured document is correlated with an unstructured content fragment in the unstructured document.
Dependent claims: 13, 14, 15, 16, 17, 18, 19, 20, 21, 22
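The final two steps of the method (determining uniqueness, then correlating fragments responsive to a numeric instance being unique) might look something like the sketch below. It assumes fragments are plain strings, uses a simplified number pattern, and adds one condition the claim does not state: a link is made only when exactly one unstructured fragment contains the number. The names `numbers_in` and `correlate` are hypothetical.

```python
import re
from collections import Counter

# Simplified, assumed pattern for integers and decimals.
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")


def numbers_in(text):
    """Return every number that appears in text, as strings."""
    return NUMBER_RE.findall(text)


def correlate(structured, unstructured):
    """Link structured fragments to unstructured fragments via unique numbers.

    Returns {(structured_index, number): unstructured_index}. A number is
    used only if it occurs once in the whole structured document and,
    as an added assumption, in exactly one unstructured fragment.
    """
    counts = Counter(n for fragment in structured for n in numbers_in(fragment))
    links = {}
    for s_idx, s_fragment in enumerate(structured):
        for number in numbers_in(s_fragment):
            if counts[number] != 1:
                continue  # non-unique: left unlinked in this sketch
            hits = [
                u_idx
                for u_idx, u_fragment in enumerate(unstructured)
                if number in numbers_in(u_fragment)
            ]
            if len(hits) == 1:
                links[(s_idx, number)] = hits[0]
    return links


structured = ["Revenue: 1250", "Costs: 980", "Prior costs: 980", "Headcount: 42"]
unstructured = [
    "Revenue reached 1250 this year.",
    "Costs held at 980, matching 980 last year.",
    "The team grew to 42 people.",
]
print(correlate(structured, unstructured))
```

Here `1250` and `42` each occur once in the structured fragments and are linked; `980` occurs twice and is skipped, since resolving it would require the additional structural or contextual information that the abstract mentions.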
Specification