Apparatus and a method for logically processing a composite graph in a formatted document
Abstract
The present invention provides an apparatus for logically processing a composite graph in a formatted document, the apparatus comprising: a composite graph block extraction unit, used to extract a composite graph block in the formatted document; a document parsing unit, used to parse the formatted document to obtain a text element contained therein; a cutline element extraction unit, used to extract a cutline element from the text element; a correlativity detection unit, used to detect correlativity between the composite graph block and the cutline element; and a correlativity storage unit, used to store the detected correlativity. The present invention also provides a method for logically processing a composite graph in a formatted document. According to the technical scheme disclosed in the present invention, layout understanding of a composite graph in a mixed graph-and-text layout of the formatted document is easily achieved, thereby avoiding logical errors.
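As an illustration of two of the units named in the abstract, the following minimal sketch shows one way a cutline element extraction unit and a correlativity storage unit might look. Everything here is an assumption made for illustration: the `TextElement` and `CorrelativityStore` types, the function `extract_cutlines`, and in particular the heuristic that a cutline begins with "Fig." or "Figure" followed by a number. The abstract does not state how cutlines are recognized.

```python
import re
from dataclasses import dataclass, field


@dataclass
class TextElement:
    """A parsed run of text (hypothetical structure, not from the patent)."""
    text: str
    page: int


# Assumed heuristic: a cutline starts with "Fig." or "Figure" plus a number.
CUTLINE_RE = re.compile(r"^\s*(fig\.|figure)\s*\d+", re.IGNORECASE)


def extract_cutlines(elements: list[TextElement]) -> list[TextElement]:
    """Cutline element extraction: keep text elements that look like captions."""
    return [e for e in elements if CUTLINE_RE.match(e.text)]


@dataclass
class CorrelativityStore:
    """Correlativity storage: maps a graph identifier to its cutline text."""
    links: dict[str, str] = field(default_factory=dict)

    def store(self, graph_id: str, cutline: TextElement) -> None:
        self.links[graph_id] = cutline.text


elements = [
    TextElement("Figure 1. Sales by region", page=1),
    TextElement("Figures of speech are discussed later.", page=1),
    TextElement("fig. 2: detail view", page=2),
]
cutlines = extract_cutlines(elements)
store = CorrelativityStore()
store.store("graph-1", cutlines[0])
```

The regular expression is deliberately strict about requiring a number, so that ordinary sentences beginning with the word "Figures" are not mistaken for cutlines.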
8 Claims
1. An apparatus for logically processing a composite graph in a formatted document, comprising a computer processor and a computer readable storage medium which stores a plurality of computer-executable instructions, wherein the computer-executable instructions, when being executed by the computer processor, cause the computer processor to:
extract a composite graph block in the formatted document;
parse the formatted document to obtain a text element included therein;
extract a cutline element from the text element;
detect correlativity between the composite graph block and the extracted cutline element, wherein the correlativity detection comprises:
determining the number of in-text-illustration composite graphs included in the composite graph block;
in the case that the composite graph block only contains one in-text-illustration composite graph, selecting a cutline element having a distance to the one in-text-illustration composite graph smaller than a preset distance, and making the selected cutline element as a cutline element relating to the one in-text-illustration composite graph; and
in the case that the composite graph block contains multiple in-text-illustration composite graphs, making the multiple in-text-illustration composite graphs and all of the parsed cutline elements as a vertex of a bipartite graph respectively, so as to use the bipartite graph to determine the correlativity between the multiple in-text-illustration composite graphs and the cutline elements; and
store the detected correlativity.
(Dependent claims: 2, 3, 4)
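The correlativity detection recited in claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation. The claim does not say how distance is measured, which cutline is chosen when several fall within the preset distance, or how the bipartite graph is used. The sketch assumes bounding-box gap distance, picks the nearest qualifying cutline in the single-graph case, and in the multi-graph case adds an edge wherever the distance is below the preset value and then computes a maximum bipartite matching with augmenting paths (Kuhn's algorithm). The names `Box`, `box_distance`, and `correlate` are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box on a page (hypothetical representation)."""
    x0: float
    y0: float
    x1: float
    y1: float


def box_distance(a: Box, b: Box) -> float:
    """Gap between two boxes; 0.0 when they overlap or touch."""
    dx = max(a.x0 - b.x1, b.x0 - a.x1, 0.0)
    dy = max(a.y0 - b.y1, b.y0 - a.y1, 0.0)
    return hypot(dx, dy)


def correlate(graphs: list[Box], cutlines: list[Box], preset: float) -> dict[int, int]:
    """Return a mapping from graph index to correlated cutline index."""
    if not graphs or not cutlines:
        return {}
    if len(graphs) == 1:
        # Single-graph case: nearest cutline, accepted only if closer than preset.
        j = min(range(len(cutlines)), key=lambda k: box_distance(graphs[0], cutlines[k]))
        return {0: j} if box_distance(graphs[0], cutlines[j]) < preset else {}

    # Multi-graph case: graphs and cutlines are the two vertex sets of a
    # bipartite graph. Assumed edge rule: distance below the preset value.
    adj = [
        sorted(
            (j for j in range(len(cutlines)) if box_distance(g, cutlines[j]) < preset),
            key=lambda j: box_distance(g, cutlines[j]),
        )
        for g in graphs
    ]
    owner: dict[int, int] = {}  # cutline index -> graph index

    def try_assign(i: int, seen: set[int]) -> bool:
        # Augmenting-path step: give graph i a cutline, re-homing others if needed.
        for j in adj[i]:
            if j in seen:
                continue
            seen.add(j)
            if j not in owner or try_assign(owner[j], seen):
                owner[j] = i
                return True
        return False

    for i in range(len(graphs)):
        try_assign(i, set())
    return {i: j for j, i in owner.items()}


# Two graphs side by side; one cutline sits between them, one below the first.
graphs = [Box(0, 0, 10, 10), Box(20, 0, 30, 10)]
cutlines = [Box(12, 0, 18, 10), Box(0, 12, 10, 14)]
result = correlate(graphs, cutlines, preset=5.0)
```

In this usage example the cutline between the two graphs is within range of both, while the second graph has no other candidate. The matching therefore reassigns the first graph to the cutline below it, which a purely greedy nearest-neighbor choice would not do.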
5. A method for logically processing a composite graph in a formatted document, comprising:
extracting, by an apparatus, a composite graph block in a formatted document;
extracting a cutline element from a text element parsed from the formatted document;
detecting correlativity between the composite graph block and the extracted cutline element, wherein the correlativity detection comprises:
if the composite graph block only contains one in-text-illustration composite graph, selecting a cutline element having a distance to the one in-text-illustration composite graph smaller than a preset distance, and using the cutline element having the distance to the one in-text-illustration composite graph smaller than the preset distance as a cutline element correlating to the one in-text-illustration composite graph; and
if the composite graph block contains multiple in-text-illustration composite graphs, making the multiple in-text-illustration composite graphs and all of the parsed cutline elements as a vertex of a bipartite graph respectively, so as to use the bipartite graph to determine the correlativity between the multiple in-text-illustration composite graphs and the cutline elements; and
storing the detected correlativity.
(Dependent claims: 6, 7, 8)
Specification