WORD-BASED DOCUMENT IMAGE COMPRESSION
Abstract
Locations of word images corresponding to words in a document image are ascertained. The word images are grouped into clusters. For each of multiple of the clusters, a respective compressed word image cluster is determined based on a joint compression of respective ones of the word images that are grouped into the cluster. The positions of the word images in the document image are associated with the respective ones of the compressed word image clusters corresponding to the clusters respectively containing the word images.
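The four steps summarized above (locate word images, cluster them, jointly compress each cluster, and record where each word belongs) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the document is a single text-line bitmap, words are separated by at least `min_gap` blank columns, clustering is a greedy Hamming-distance match, the "joint compression" is one zlib stream per cluster, and all function names are hypothetical.

```python
import zlib

def find_word_boxes(image, min_gap=2):
    """Return (x_start, x_end) column spans of words in a one-line bitmap.
    Assumption: a word ends once `min_gap` blank columns follow it."""
    width = len(image[0])
    inked = [any(row[x] for row in image) for x in range(width)]
    boxes, start, blank_run = [], None, 0
    for x, has_ink in enumerate(inked):
        if has_ink:
            if start is None:
                start = x
            blank_run = 0
        elif start is not None:
            blank_run += 1
            if blank_run >= min_gap:
                boxes.append((start, x - blank_run + 1))
                start, blank_run = None, 0
    if start is not None:
        boxes.append((start, width - blank_run))
    return boxes

def crop(image, box):
    x0, x1 = box
    return tuple(tuple(row[x0:x1]) for row in image)

def hamming(a, b):
    """Count differing pixels between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def cluster_words(words, max_diff=1):
    """Greedy clustering (an illustrative choice): a word joins the first
    cluster whose first member has the same width and is within max_diff."""
    clusters = []  # each cluster is a list of word indices
    for i, word in enumerate(words):
        for members in clusters:
            rep = words[members[0]]
            if len(rep[0]) == len(word[0]) and hamming(rep, word) <= max_diff:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def compress_cluster(words, members):
    """Serialize all member bitmaps into one stream and deflate it once,
    so repeated pixel patterns across similar words are coded together."""
    raw = bytearray()
    for i in members:
        for row in words[i]:
            raw.extend(row)
    return zlib.compress(bytes(raw))

def encode(image):
    boxes = find_word_boxes(image)
    words = [crop(image, b) for b in boxes]
    clusters = cluster_words(words)
    compressed = [compress_cluster(words, m) for m in clusters]
    # Position table: word x-offset -> (cluster id, slot within the cluster).
    positions = {}
    for cid, members in enumerate(clusters):
        for slot, i in enumerate(members):
            positions[boxes[i][0]] = (cid, slot)
    return compressed, positions

# Toy text line: the same 4-pixel-wide "word" appears twice, with a
# different word between them.
LINE = [
    "##.#..#.##..##.#",
    "#..#..#.#...#..#",
    "##.#..#.##..##.#",
]
image = [[int(c == "#") for c in row] for row in LINE]
compressed, positions = encode(image)
print(len(compressed), positions)
```

In this toy input the first and third words are identical, so they share one cluster and one compressed stream, while the position table keeps track of where each instance sits in the line.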
20 Claims
1. A method, comprising:
    by a computer ascertaining locations of word images corresponding to words in a document image;
    by the computer grouping the word images into clusters;
    for each of multiple of the clusters, determining by the computer a respective compressed word image cluster based on a joint compression of respective ones of the word images that are grouped into the cluster; and
    by the computer associating the positions of the word images in the document image with the respective ones of the compressed word image clusters corresponding to the clusters respectively containing the word images.
    Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
14. A method, comprising:
    by a computer receiving a set of compressed word image clusters each comprising a joint compression of a respective cluster of word images;
    by the computer receiving associations between respective ones of the word images in the respective joint compressions of the compressed word image clusters and positions of respective ones of the word images in a document image;
    by the computer extracting the word images from respective ones of the compressed word image clusters;
    rendering a version of the document image based on the extracted word images and the associations between the compressed word image clusters and the positions of the word images in the document image.
    Dependent claims: 15, 16, 17, 18
19. Apparatus, comprising:
    a computer-readable medium storing computer-readable instructions; and
    a data processor coupled to the computer-readable medium, operable to execute the instructions, and based at least in part on the execution of the instructions operable to perform operations comprising
        ascertaining locations of word images corresponding to words in a document image;
        grouping the word images into clusters;
        for each of multiple of the clusters, determining a respective compressed word image cluster based on a joint compression of respective ones of the word images that are grouped into the cluster; and
        associating the positions of the word images in the document image with the respective ones of the compressed word image clusters corresponding to the clusters respectively containing the word images.
20. At least one computer-readable medium having computer-readable program code embodied therein, the computer-readable program code adapted to be executed by a computer to implement a method comprising:
    ascertaining locations of word images corresponding to words in a document image;
    grouping the word images into clusters;
    for each of multiple of the clusters, determining a respective compressed word image cluster based on a joint compression of respective ones of the word images that are grouped into the cluster; and
    associating the positions of the word images in the document image with the respective ones of the compressed word image clusters corresponding to the clusters respectively containing the word images.
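The decoding method of claim 14 (receive compressed clusters and position associations, extract the word images, render the page) can be sketched in Python as follows. This is only an illustration under stated assumptions, none of which come from the patent: each compressed cluster is a hypothetical `(width, height, zlib_blob)` tuple holding equally sized 0/1 bitmaps stored back to back, associations map an `(x, y)` page position to a `(cluster id, slot)` pair, and `pack_cluster` exists solely to build demo input.

```python
import zlib

def pack_cluster(bitmaps):
    """Demo helper only: jointly deflate equally sized 0/1 bitmaps."""
    height, width = len(bitmaps[0]), len(bitmaps[0][0])
    raw = bytes(p for bm in bitmaps for row in bm for p in row)
    return (width, height, zlib.compress(raw))

def extract_words(cluster):
    """Inflate one compressed cluster and split it back into word bitmaps."""
    width, height, blob = cluster
    raw = zlib.decompress(blob)
    size = width * height
    words = []
    for off in range(0, len(raw), size):
        chunk = raw[off:off + size]
        words.append([list(chunk[r * width:(r + 1) * width])
                      for r in range(height)])
    return words

def render(clusters, associations, page_width, page_height):
    """Paste each extracted word image at its associated page position."""
    page = [[0] * page_width for _ in range(page_height)]
    extracted = [extract_words(c) for c in clusters]
    for (x, y), (cid, slot) in associations.items():
        for dy, row in enumerate(extracted[cid][slot]):
            page[y + dy][x:x + len(row)] = row
    return page

# Two toy word bitmaps; the first one occurs twice on the page.
AB = [[1, 1, 0, 1], [1, 0, 0, 1], [1, 1, 0, 1]]
CD = [[1, 0, 1, 1], [1, 0, 1, 0], [1, 0, 1, 1]]
clusters = [pack_cluster([AB, AB]), pack_cluster([CD])]
associations = {(0, 0): (0, 0), (6, 0): (1, 0), (12, 0): (0, 1)}

page = render(clusters, associations, page_width=16, page_height=3)
lines = ["".join("#" if p else "." for p in row) for row in page]
print("\n".join(lines))
```

Storing the word dimensions alongside each compressed stream is a design choice of this sketch; it lets the decoder split the inflated bytes into individual word images without any side information beyond the association table.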
Specification