Identification of Compound Graphic Elements in an Unstructured Document
Abstract
Some embodiments provide a method of analyzing an unstructured document. The method receives the unstructured document, which includes a number of primitive graphic elements, each of which is defined as a single object in the unstructured document. The unstructured document has a drawing order that indicates the order in which the primitive graphic elements are drawn when the unstructured document is displayed. The method identifies positional relationships between successive primitive graphic elements in the drawing order. Based on the positional relationships, the method defines a single structural graphic element from several of the primitive graphic elements.
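The grouping described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the `Primitive` bounding-box type, the `gap` distance used as the positional relationship, and the fixed `threshold` are all assumptions introduced here for illustration. The claims instead compare a pair's value to values calculated for later elements, a comparison the text above does not detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primitive:
    """Bounding box of one primitive graphic element (hypothetical representation)."""
    x0: float
    y0: float
    x1: float
    y1: float


def gap(a: Primitive, b: Primitive) -> float:
    """Distance between two axis-aligned boxes; 0.0 when they touch or overlap."""
    dx = max(a.x0 - b.x1, b.x0 - a.x1, 0.0)
    dy = max(a.y0 - b.y1, b.y0 - a.y1, 0.0)
    return (dx * dx + dy * dy) ** 0.5


def group_by_drawing_order(prims, threshold):
    """Split a drawing-ordered list into runs of successive elements.

    An element joins the current run when its gap to the previously drawn
    element is at most `threshold` (an assumed, fixed cutoff).
    """
    groups = []
    for p in prims:
        if groups and gap(groups[-1][-1], p) <= threshold:
            groups[-1].append(p)
        else:
            groups.append([p])
    return groups


# Two adjacent boxes drawn in succession, then a distant third box.
prims = [
    Primitive(0, 0, 10, 10),
    Primitive(10, 0, 20, 10),
    Primitive(100, 100, 110, 110),
]
groups = group_by_drawing_order(prims, threshold=2.0)
# Only runs with more than one primitive become structural graphic elements.
compound = [g for g in groups if len(g) > 1]
```

In this sketch only successive elements are ever compared, which mirrors the abstract's reliance on drawing order and keeps the pass linear in the number of primitives.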
46 Claims
1-25. (canceled)
26. A non-transitory machine readable medium storing a program which when executed by at least one processing unit analyzes a document, the program comprising sets of instructions for:
receiving a document that comprises a plurality of primitive graphic elements defined separately within the document, the document having a drawing order that indicates the order in which the primitive graphic elements are drawn when the document is displayed;
calculating a first value for a first primitive graphic element and a second primitive graphic element that is subsequent to the first in the drawing order; and
based on the comparison of the first value to other values calculated for additional primitive graphic elements that are subsequent in the drawing order, defining a single structural graphic element within the document from the first and second primitive graphic elements.
Dependent claims: 27, 28, 29, 30, 31
32. A non-transitory machine readable medium storing a program which when executed by at least one processing unit analyzes a document, the program comprising sets of instructions for:
receiving a document that comprises a plurality of primitive graphic elements defined separately within the document;
based on values calculated for pairs of primitive graphic elements, defining a set of successive primitive graphic elements as a candidate compound graphic element;
using bounds of the primitive graphic elements in the cluster to identify one or more subsets of primitive graphic elements within the set; and
defining each particular subset having more than one primitive graphic element as a single structural graphic element within the document, the structural graphic element comprising the primitive graphic elements in the particular subset.
Dependent claims: 33, 34, 35, 36
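The bounds-based step recited in claim 32 can be illustrated with a small sketch. The claim does not say how bounds are used, so the criterion here is an assumption: primitives whose bounding boxes intersect or touch, directly or through a chain of other primitives, fall into the same subset. The function names and the `(x0, y0, x1, y1)` tuple layout are hypothetical.

```python
def overlaps(a, b):
    """True when two (x0, y0, x1, y1) boxes intersect or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def subsets_by_bounds(boxes):
    """Group box indices into connected components under `overlaps`."""
    parent = list(range(len(boxes)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if overlaps(boxes[i], boxes[j]):
                parent[find(i)] = find(j)

    components = {}
    for i in range(len(boxes)):
        components.setdefault(find(i), []).append(i)
    return list(components.values())


def structural_elements(boxes):
    """Keep only subsets with more than one primitive, as the claim requires."""
    return [s for s in subsets_by_bounds(boxes) if len(s) > 1]


# Candidate set: three chained boxes and one isolated box (index 2).
candidate = [(0, 0, 10, 10), (5, 5, 15, 15), (50, 50, 60, 60), (14, 14, 20, 20)]
result = structural_elements(candidate)
```

Here the isolated box is left as a lone primitive, while the three chained boxes form one subset that would be defined as a single structural graphic element.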
37. A method for analyzing a document, the method comprising:
receiving a document that comprises a plurality of primitive graphic elements defined separately within the document, the document having a drawing order that indicates the order in which the primitive graphic elements are drawn when the document is displayed;
calculating a first value for a first primitive graphic element and a second primitive graphic element that is subsequent to the first in the drawing order; and
based on the comparison of the first value to other values calculated for additional primitive graphic elements that are subsequent in the drawing order, defining a single structural graphic element within the document from the first and second primitive graphic elements.
Dependent claims: 38, 39, 40, 41
42. An apparatus comprising:
a set of processing units; and
a machine readable medium storing a program which when executed by at least one processing unit analyzes a document, the program comprising sets of instructions for:
receiving a document that comprises a plurality of primitive graphic elements defined separately within the document;
based on values calculated for pairs of primitive graphic elements, defining a set of successive primitive graphic elements as a candidate compound graphic element;
using bounds of the primitive graphic elements in the cluster to identify one or more subsets of primitive graphic elements within the set; and
defining each particular subset having more than one primitive graphic element as a single structural graphic element within the document, the structural graphic element comprising the primitive graphic elements in the particular subset.
Dependent claims: 43, 44, 45, 46
Specification