Reformatting documents using document analysis information
Abstract
A method and apparatus for reformatting electronic documents are disclosed. In one embodiment, the method comprises performing layout analysis on an electronic version of a document to locate text zones, assigning attributes for scale and importance to text zones in the electronic version of the document, and reformatting text in the electronic version of the document based on the attributes to create an image.
18 Claims
1. A method comprising:
generating a multiresolution segmentation image for an electronic version of a document;
performing a connected components analysis on the multiresolution segmentation image to generate a list of image connected components, along with their locations within the multiresolution segmentation image and multiresolution bit distribution;
performing layout analysis on the electronic version of a document to locate text zones;
assigning attributes to text zones in the electronic version of the document;
creating a list of text components associated with the text zones; and
merging component images associated with the image connected components of the multiresolution segmentation image and the text components.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
13. An apparatus comprising:
means for generating a multiresolution segmentation image for an electronic version of a document;
means for performing a connected components analysis on the multiresolution segmentation image to generate a list of image connected components, along with their locations within the multiresolution segmentation image and multiresolution bit distribution;
means for performing layout analysis on the electronic version of a document to locate text zones;
means for assigning attributes to text zones in the electronic version of the document;
means for creating a list of text components associated with the text zones; and
means for merging component images associated with the image connected components of the multiresolution segmentation image and the text components.
Dependent claims: 14, 15
16. An article of manufacture comprising one or more computer readable storage media containing executable instructions that, when executed by a system, cause the system to:
generate a multiresolution segmentation image for an electronic version of a document;
perform a connected components analysis on the multiresolution segmentation image to generate a list of image connected components, along with their locations within the multiresolution segmentation image and multiresolution bit distribution;
perform layout analysis on the electronic version of a document to locate text zones;
assign attributes to text zones in the electronic version of the document;
create a list of text components associated with the text zones; and
merge component images associated with the image connected components of the multiresolution segmentation image and the text components.
Dependent claims: 17, 18
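The claims name a connected components analysis and a merging step without stating how either is carried out. The following is a minimal illustrative sketch, not the patented method. It assumes a plain binary mask in place of the multiresolution segmentation image, uses 4-connectivity, omits the multiresolution bit distribution entirely, and treats "merging" as combining two component lists in reading order. The function names, the dictionary fields (`kind`, `box`, `size`), and the sample data are all hypothetical.

```python
from collections import deque


def connected_components(mask):
    """Find 4-connected regions of truthy cells in a 2D grid.

    Returns one dict per region with its bounding box
    (top, left, bottom, right) and its cell count.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground cell.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right, size = r, c, r, c, 0
            while queue:
                y, x = queue.popleft()
                size += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(
                {"kind": "image", "box": (top, left, bottom, right), "size": size}
            )
    return components


def merge_components(image_components, text_components):
    """Combine both lists, ordered top-to-bottom, then left-to-right.

    This ordering is an assumption made for illustration only.
    """
    return sorted(image_components + text_components,
                  key=lambda comp: comp["box"][:2])


# Hypothetical sample data: a tiny mask and one text component.
sample_mask = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
sample_text = [{"kind": "text", "box": (0, 2, 0, 3), "size": 2}]
merged = merge_components(connected_components(sample_mask), sample_text)
```

In practice a library routine for connected-component labeling would likely replace the hand-written flood fill; it is spelled out here only to keep the sketch self-contained.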
Specification