Ingestion plan based on table uniqueness
First Claim
1. A computer implemented method for processing tabular data, the method comprising:
receiving an electronic document on a computer through a network;
receiving metadata associated with the received electronic document;
identifying a plurality of tabular data markers, in response to analyzing the received electronic document and associated metadata;
identifying references for association with the identified plurality of tabular data markers;
generating a graphical representation of the relationship between the identified tabular data markers and identified references;
calculating a uniqueness score value based on the generated graphical representation; and
generating an ingestion plan for the received electronic documents based on the calculated uniqueness score value.
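The identifying and graph-generating steps of this claim can be illustrated with a minimal sketch. The claim does not say how markers or references are detected or how they are linked, so everything concrete here is an assumption made for illustration: the two regular expressions, the function name `build_marker_graph`, and the rule that a reference is associated with a marker when both appear in the same sentence are hypothetical. Metadata handling is left out.

```python
import re
from collections import defaultdict

# Assumed patterns: the claim does not define what a marker or a reference looks like.
MARKER_RE = re.compile(r"\bTable\s+\d+\b")
REFERENCE_RE = re.compile(r"\[(\d+)\]")


def build_marker_graph(text):
    """Map each tabular data marker to the set of references found near it.

    Hypothetical linking rule: a reference is associated with a marker when
    both occur in the same sentence.
    """
    graph = defaultdict(set)
    # Naive sentence split on whitespace that follows ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        markers = MARKER_RE.findall(sentence)
        references = REFERENCE_RE.findall(sentence)
        for marker in markers:
            # A marker with no references still becomes a node with no edges.
            graph[marker].update(references)
    return dict(graph)


document = (
    "Table 1 lists dosage limits [3]. "
    "Table 2 repeats the limits from [3] and adds data from [7]. "
    "Table 3 has no cited source."
)
graph = build_marker_graph(document)
```

The result is an adjacency mapping from marker to references, which is one simple way to hold the "graphical representation" the claim refers to. A graph library could be substituted without changing the idea.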
Abstract
Embodiments of the present invention disclose a method, computer program product, and system for processing tabular data. In various embodiments, an electronic document is received through a network, along with associated metadata. A plurality of table markers, or tabular data markers, is identified in response to analyzing the received electronic document for said markers. References and citations associated with the plurality of tabular data markers are identified. A graphical representation of the relationship between the identified tabular data markers and the identified references is generated. A uniqueness score is calculated based on the generated graph, and an ingestion plan is generated for the received electronic documents based on the calculated uniqueness score value.
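The last two steps of the abstract, scoring and planning, can be sketched as follows. The abstract gives no formula for the uniqueness score and no format for the ingestion plan, so both are assumptions here: the score is taken to be the fraction of a marker's references that no other marker shares, a marker with no references is scored 1.0, and the plan is a ranked list with a hypothetical `threshold` deciding between "ingest" and "defer". The function names are illustrative only.

```python
def uniqueness_scores(graph):
    """Score each marker by the share of its references that no other marker uses.

    Assumed formula: unshared references / total references, with 1.0 for a
    marker that has no references at all.
    """
    usage = {}
    for refs in graph.values():
        for ref in refs:
            usage[ref] = usage.get(ref, 0) + 1
    scores = {}
    for marker, refs in graph.items():
        if not refs:
            scores[marker] = 1.0
        else:
            unshared = sum(1 for ref in refs if usage[ref] == 1)
            scores[marker] = unshared / len(refs)
    return scores


def ingestion_plan(scores, threshold=0.5):
    """Rank markers by descending score and label each with an assumed action."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [
        {"marker": m, "score": s, "action": "ingest" if s >= threshold else "defer"}
        for m, s in ranked
    ]


# Marker-to-references mapping, standing in for the generated graph.
graph = {"Table 1": {"3"}, "Table 2": {"3", "7"}, "Table 3": set()}
scores = uniqueness_scores(graph)
plan = ingestion_plan(scores)
```

Under these assumptions a table whose only source is also cited by another table scores 0.0 and is deferred, while a table with at least half of its references unshared is ingested. A different scoring rule would slot into `uniqueness_scores` without changing the planning step.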
12 Citations
18 Claims
1. A computer implemented method for processing tabular data, the method comprising:
receiving an electronic document on a computer through a network;
receiving metadata associated with the received electronic document;
identifying a plurality of tabular data markers, in response to analyzing the received electronic document and associated metadata;
identifying references for association with the identified plurality of tabular data markers;
generating a graphical representation of the relationship between the identified tabular data markers and identified references;
calculating a uniqueness score value based on the generated graphical representation; and
generating an ingestion plan for the received electronic documents based on the calculated uniqueness score value.
Dependent claims: 2, 3, 4, 5, 6

7. A computer program product for processing tabular data, the computer program product comprising:
a computer-readable storage media having program instructions stored on the computer-readable storage media, the program instructions, executable by a device, comprising;
instructions to receive an electronic document through a network;
instructions to receive metadata associated with the received electronic document;
instructions to identify a plurality of tabular data markers, in response to analyzing the received electronic document and associated metadata;
instructions to identify references for association with the identified plurality of tabular data markers;
instructions to generate a graphical representation of the relationship between the identified tabular data markers and identified references;
instructions to calculate a uniqueness score value based on the generated graphical representation; and
instructions to generate an ingestion plan for the received electronic documents based on the calculated uniqueness score value.
Dependent claims: 8, 9, 10, 11, 12

13. A computer system for processing tabular data, the computer system comprising:
one or more computer processors;
one or more computer-readable storage media;
program instructions stored on the computer-readable storage media for execution by at least one of the one or more processors, the program instructions comprising;
instructions to A computer implemented method for processing tabular data, the method comprising;
instructions to receive an electronic document through a network;
instructions to receive metadata associated with the received electronic document;
instructions to identify a plurality of tabular data markers, in response to analyzing the received electronic document and associated metadata;
instructions to identify references for association with the identified plurality of tabular data markers;
instructions to generate a graphical representation of the relationship between the identified tabular data markers and identified references;
instructions to calculate a uniqueness score value based on the generated graphical representation; and
instructions to generate an ingestion plan for the received electronic documents based on the calculated uniqueness score value.
Dependent claims: 14, 15, 16, 17, 18
Specification