Identification of Regions of a Document
First Claim
1. A computer readable medium storing a computer program which when executed by at least one processor analyzes a document that comprises a plurality of primitive elements, the computer program comprising sets of instructions for:
identifying boundaries between sets of primitive elements;
identifying regions bounded by the boundaries; and
defining a structured document based on the regions and the primitive elements.
Abstract
Some embodiments provide a method for analyzing a document that includes a number of primitive elements. The method identifies boundaries between sets of primitive elements and identifies regions bounded by the boundaries. The method uses the identified regions to define structural elements for the document. The method defines a structured document based on the primitive elements and the structural elements.
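The steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how boundaries are detected, so the sketch assumes, purely for illustration, that a boundary is a vertical whitespace gap wider than a threshold. The names `Primitive`, `find_boundaries`, `find_regions`, `build_structured_document`, and `min_gap` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primitive:
    """A primitive element (for example a glyph run) with a vertical extent."""
    text: str
    top: float
    bottom: float


def find_boundaries(elements, min_gap=10.0):
    """Return y-positions of boundaries between sets of elements.

    Assumption: a boundary is the midpoint of any vertical gap wider
    than min_gap. The patent abstract does not prescribe this rule.
    """
    ordered = sorted(elements, key=lambda e: e.top)
    boundaries = []
    reach = ordered[0].bottom if ordered else 0.0
    for elem in ordered[1:]:
        if elem.top - reach > min_gap:
            boundaries.append((reach + elem.top) / 2)
        reach = max(reach, elem.bottom)
    return boundaries


def find_regions(elements, boundaries):
    """Group elements into the regions bounded by consecutive boundaries."""
    regions = [[] for _ in range(len(boundaries) + 1)]
    for elem in sorted(elements, key=lambda e: e.top):
        # The region index is the number of boundaries above the element.
        index = sum(1 for b in boundaries if b < elem.top)
        regions[index].append(elem)
    return regions


def build_structured_document(elements, min_gap=10.0):
    """Define a structured document from the primitives and the regions."""
    boundaries = find_boundaries(elements, min_gap)
    regions = find_regions(elements, boundaries)
    return {
        "boundaries": boundaries,
        "regions": [
            {"id": i, "elements": [e.text for e in region]}
            for i, region in enumerate(regions)
        ],
    }


elements = [
    Primitive("Title", 0, 12),
    Primitive("Body line 1", 40, 50),
    Primitive("Body line 2", 52, 62),
    Primitive("Footer", 100, 108),
]
doc = build_structured_document(elements)
```

With the sample input, the two wide gaps separate the title, the two body lines, and the footer into three regions.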
173 Citations
35 Claims
1. A computer readable medium storing a computer program which when executed by at least one processor analyzes a document that comprises a plurality of primitive elements, the computer program comprising sets of instructions for:
identifying boundaries between sets of primitive elements;
identifying regions bounded by the boundaries; and
defining a structured document based on the regions and the primitive elements.
Dependent claims: 2-18
19. A method for defining a program for (i) analyzing a document that comprises a plurality of primitive elements and (ii) generating structural elements that define structure in said document based on said analysis, the method comprising:
defining a module for identifying boundaries between sets of primitive elements;
defining a module for identifying regions bounded by the boundaries;
defining a module for using the identified boundaries and regions to specify the structural elements.
Dependent claims: 20, 21
22. A computer readable medium storing a computer program which when executed by at least one processor analyzes a document that comprises a plurality of primitive elements, the primitive elements comprising a plurality of glyphs and a plurality of graphical elements, the computer program comprising sets of instructions for:
identifying a plurality of the graphical elements as potential boundaries;
identifying a portion of the potential boundaries as actual boundaries;
traversing the actual boundaries to identify one or more zones; and
defining a hierarchical document model with the identified zones.
Dependent claims: 23-35
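The sequence recited in claim 22 can be sketched in Python as follows. This is an illustrative reading only, not the patented implementation. It assumes that potential boundaries are thin horizontal graphics, that actual boundaries are those spanning most of the page width, and that zones are the horizontal bands between them. All names (`Graphic`, `Glyph`, `Zone`, `max_thickness`, `min_coverage`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Graphic:
    """A graphical element given by its bounding box."""
    x0: float
    y0: float
    x1: float
    y1: float


@dataclass(frozen=True)
class Glyph:
    char: str
    x: float
    y: float


@dataclass
class Zone:
    """A node of the hierarchical model: a band of the page."""
    top: float
    bottom: float
    glyphs: list = field(default_factory=list)
    children: list = field(default_factory=list)


def potential_boundaries(graphics, max_thickness=2.0):
    """Assumption: thin horizontal graphics are boundary candidates."""
    return [g for g in graphics if abs(g.y1 - g.y0) <= max_thickness]


def actual_boundaries(candidates, page_width, min_coverage=0.9):
    """Assumption: a candidate is a real boundary if it spans most of the page."""
    return [g for g in candidates if abs(g.x1 - g.x0) >= min_coverage * page_width]


def build_model(graphics, glyphs, page_width, page_height):
    """Walk the actual boundaries top to bottom and build page -> zones -> glyphs."""
    rules = actual_boundaries(potential_boundaries(graphics), page_width)
    cuts = sorted((g.y0 + g.y1) / 2 for g in rules)
    root = Zone(0.0, page_height)
    edges = [0.0, *cuts, page_height]
    for top, bottom in zip(edges, edges[1:]):
        members = [g for g in glyphs if top <= g.y < bottom]
        root.children.append(Zone(top, bottom, members))
    return root


graphics = [
    Graphic(0, 30, 100, 31),   # full-width rule: an actual boundary
    Graphic(10, 60, 40, 61),   # short underline: only a potential boundary
    Graphic(0, 0, 50, 50),     # thick box: not a candidate at all
]
glyphs = [Glyph("A", 5, 10), Glyph("B", 5, 50), Glyph("C", 5, 70)]
model = build_model(graphics, glyphs, page_width=100, page_height=100)
```

In this example only the full-width rule survives both filters, so the page splits into two zones, one holding "A" and one holding "B" and "C".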
Specification