IDENTIFICATION OF COMPOUND GRAPHIC ELEMENTS IN AN UNSTRUCTURED DOCUMENT
First Claim
1. A computer readable medium storing a computer program which when executed by at least one processor analyzes a document, the computer program comprising sets of instructions for:
receiving the document that comprises a plurality of primitive graphic elements, each primitive graphic element defined as a single object in the document, the document having a drawing order that indicates the order in which the primitive graphic elements are drawn when the document is displayed;
identifying positional relationships between successive primitive graphic elements in the drawing order; and
based on the positional relationships, defining a single structural graphic element from at least two of the primitive graphic elements.
Abstract
Some embodiments provide a method of analyzing an unstructured document. The method receives the unstructured document, which includes a number of primitive graphic elements, each of which is defined as a single object in the unstructured document. The unstructured document has a drawing order that indicates the order in which the primitive graphic elements are drawn when the unstructured document is displayed. The method identifies positional relationships between successive primitive graphic elements in the drawing order. Based on the positional relationships, the method defines a single structural graphic element from several of the primitive graphic elements.
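The grouping step described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the `Primitive` type, the `near` test (bounding boxes that overlap or lie within a small gap), and the `gap` parameter are all assumptions chosen for illustration, since the abstract does not say which positional relationships are used.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Primitive:
    """Bounding box of one primitive graphic element (hypothetical type)."""
    name: str
    x0: float
    y0: float
    x1: float
    y1: float


def near(a: Primitive, b: Primitive, gap: float) -> bool:
    """Assumed positional relationship: boxes overlap or lie within `gap` on both axes."""
    dx = max(a.x0 - b.x1, b.x0 - a.x1, 0.0)  # horizontal separation, 0 if overlapping
    dy = max(a.y0 - b.y1, b.y0 - a.y1, 0.0)  # vertical separation, 0 if overlapping
    return dx <= gap and dy <= gap


def group_in_drawing_order(elements: List[Primitive], gap: float = 1.0) -> List[List[Primitive]]:
    """Walk the drawing order and merge each element with its predecessor when they are near.

    Only runs of at least two primitives are returned, since a structural
    element is defined from at least two primitives.
    """
    groups: List[List[Primitive]] = []
    for elem in elements:
        if groups and near(groups[-1][-1], elem, gap):
            groups[-1].append(elem)
        else:
            groups.append([elem])
    return [g for g in groups if len(g) >= 2]


# Usage: the list order stands in for the document's drawing order.
drawing_order = [
    Primitive("a", 0, 0, 10, 10),
    Primitive("b", 9, 9, 20, 20),
    Primitive("c", 100, 100, 110, 110),
    Primitive("d", 200, 200, 210, 210),
]
compound = group_in_drawing_order(drawing_order)
```

Only successive elements are compared, which mirrors the abstract's emphasis on the drawing order; a pairwise comparison of all elements would be a different technique.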
91 Citations
25 Claims
1. A computer readable medium storing a computer program which when executed by at least one processor analyzes a document, the computer program comprising sets of instructions for:
receiving the document that comprises a plurality of primitive graphic elements, each primitive graphic element defined as a single object in the document, the document having a drawing order that indicates the order in which the primitive graphic elements are drawn when the document is displayed;
identifying positional relationships between successive primitive graphic elements in the drawing order; and
based on the positional relationships, defining a single structural graphic element from at least two of the primitive graphic elements.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
20. A method for defining a program for analyzing a document and generating structural elements that define structure in the document based on the analysis, the method comprising:
defining a module for receiving the document that comprises a plurality of primitive graphic elements, each primitive graphic element defined as a single object in the document, the document having a drawing order that indicates the order in which the primitive graphic elements are drawn when the document is displayed;
defining a module for identifying positional relationships between successive primitive graphic elements in the drawing order; and
defining a module for, based on the positional relationships, defining a single structural graphic element from at least two of the primitive graphic elements.
Dependent claims: 21, 22, 24, 25
23. A computer readable medium storing a computer program which when executed by at least one processor analyzes a document, the computer program comprising sets of instructions for:
receiving the document that comprises a plurality of primitive graphic elements, each primitive graphic element defined as a single object in the document, the document having a drawing order that indicates the order in which the primitive graphic elements are drawn when the unstructured document is displayed;
calculating values for each pair of successive primitive graphic elements in the drawing order, wherein the calculated values relate to a size of the primitive graphic elements in the pair;
based on the calculated values, defining a cluster of successive primitive graphic elements; and
identifying a set of sub-clusters of primitive graphic elements in the cluster that satisfy particular constraints; and
defining each particular sub-cluster as a single structural graphic element comprising the primitive graphic elements in the particular sub-cluster.
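The sequence recited in claim 23 (pairwise values, clusters, sub-clusters) can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the pair value (`pair_size`, the larger side of the pair's union bounding box), the `max_size` threshold, and the sub-cluster constraint (boxes that overlap, with at least `min_members` members) are all hypothetical choices, because the claim leaves the specific values and constraints open.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), hypothetical representation


def pair_size(a: Box, b: Box) -> float:
    """Assumed size value for a pair: larger side of the pair's union bounding box."""
    width = max(a[2], b[2]) - min(a[0], b[0])
    height = max(a[3], b[3]) - min(a[1], b[1])
    return max(width, height)


def clusters(boxes: List[Box], max_size: float) -> List[List[int]]:
    """Split the drawing order wherever a successive pair's value exceeds `max_size`."""
    if not boxes:
        return []
    out = [[0]]
    for i in range(1, len(boxes)):
        if pair_size(boxes[i - 1], boxes[i]) <= max_size:
            out[-1].append(i)
        else:
            out.append([i])
    return out


def overlaps(a: Box, b: Box) -> bool:
    """True when the two boxes intersect or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def sub_clusters(boxes: List[Box], cluster: List[int], min_members: int = 2) -> List[List[int]]:
    """Assumed constraint: connected components under overlap with enough members."""
    remaining = list(cluster)
    result: List[List[int]] = []
    while remaining:
        component = [remaining.pop(0)]
        frontier = list(component)
        while frontier:
            current = frontier.pop()
            hits = [j for j in remaining if overlaps(boxes[current], boxes[j])]
            for j in hits:
                remaining.remove(j)
            component.extend(hits)
            frontier.extend(hits)
        if len(component) >= min_members:
            result.append(sorted(component))
    return result


# Usage: indices refer to positions in the drawing order.
boxes = [(0, 0, 4, 4), (3, 3, 8, 8), (20, 0, 24, 4), (23, 3, 28, 8), (500, 500, 510, 510)]
found = clusters(boxes, max_size=50)
structural = [sc for c in found for sc in sub_clusters(boxes, c)]
```

Each index list in `structural` would correspond to one structural graphic element built from the primitives at those positions.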
Specification