Image document processing in a client-server system including privacy-preserving text recognition
Abstract
Disclosed are devices and methods for processing an image document in a client-server environment such that privacy of text information contained in the image document is preserved. Specifically, in a client-server environment, an image document can be processed using a local computerized device of a client to create an obfuscated document by identifying word images in the image document and scrambling those word images. The obfuscated document can be received by a server of a service provider over a network (e.g., the Internet) and processed by previously trained software (e.g., a previously trained convolutional neural network (CNN)) to recognize specific words represented by the scrambled images in the obfuscated document without having to reconstruct the image document. Since the image document is neither communicated over the network, nor reconstructed and stored on the server, privacy concerns are minimized.
18 Claims
1. A computerized device comprising:
a network interface connected to a network; and
a processor in communication with the network interface and performing the following:
analyzing an image document to identify at least one text region within the image document and at least one word image contained in the at least one text region;
for each word image, randomly selecting a shuffling pattern, resizing the word image to a predetermined size so that a height and width of the word image are equal to a height and width of a grid of cells that is associated with the shuffling pattern, overlaying the grid of cells onto the word image such that the cells contain portions of the word image, and shuffling positions of the cells within the grid according to the shuffling pattern to move the portions of the word image and create a corresponding scrambled image; and
replacing all word images in the image document with corresponding scrambled images to generate an obfuscated document,
the network interface communicating the obfuscated document to a computer server over the network, wherein the computer server is capable of individually evaluating the scrambled images using trained software to recognize specific words.
Dependent claims: 2, 3, 4
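The per-word scrambling recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the 2 x 4 grid, the 8-pixel cells, the nearest-neighbour resize, the pool of three patterns, and the names `resize_nearest`, `scramble`, and `PATTERNS` are all assumptions made for illustration. A word image is modelled as a list of pixel rows.

```python
import random

def resize_nearest(image, height, width):
    """Nearest-neighbour resize of a 2D list of pixels to height x width."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[r * src_h // height][c * src_w // width] for c in range(width)]
        for r in range(height)
    ]

def scramble(image, pattern, rows, cols, cell):
    """Resize to the grid size, then move cells: destination cell i
    receives the contents of source cell pattern[i]."""
    resized = resize_nearest(image, rows * cell, cols * cell)
    out = [[0] * (cols * cell) for _ in range(rows * cell)]
    for dst, src in enumerate(pattern):
        dr, dc = divmod(dst, cols)   # destination cell row/column
        sr, sc = divmod(src, cols)   # source cell row/column
        for y in range(cell):
            for x in range(cell):
                out[dr * cell + y][dc * cell + x] = resized[sr * cell + y][sc * cell + x]
    return out

# Assumed grid geometry and an assumed pool of shuffling patterns
# (each pattern is a permutation of the cell indices).
ROWS, COLS, CELL = 2, 4, 8
rng = random.Random(0)
PATTERNS = [rng.sample(range(ROWS * COLS), ROWS * COLS) for _ in range(3)]

# Synthetic 20 x 50 "word image" standing in for a cropped word.
word_image = [[(r * 31 + c * 17) % 256 for c in range(50)] for r in range(20)]

pattern_id = random.randrange(len(PATTERNS))  # random selection per word image
scrambled = scramble(word_image, PATTERNS[pattern_id], ROWS, COLS, CELL)
```

In this sketch every pixel of the resized word image survives in the scrambled image; only the cell positions change, which is what would let a network trained on the same pattern still classify the word.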
5. A computer server comprising:
a network interface receiving, from a computerized device over a network, an obfuscated document, the obfuscated document created by the computerized device from an image document comprising at least one word image, and the obfuscated document comprising at least one scrambled image, each scrambled image corresponding to a single word image in the image document;
a processor in communication with the network interface and processing the obfuscated document, the processing comprising evaluating each specific scrambled image individually to recognize a specific word represented by the specific scrambled image, and the processing being performed without reconstructing the image document; and
a memory storing multiple trained convolutional neural networks, each trained convolutional neural network having been initially developed using a database of images and then fine-tuned using scrambled word images acquired by scrambling specific word images from a specific vocabulary set using a specific shuffling pattern, each word in the specific vocabulary set being associated with a corresponding output class number, and the processor processing the obfuscated document by executing a selected trained convolutional neural network to produce a specific output class number for the specific scrambled image and, thereby, to recognize the specific word associated with the specific output class number and represented by the specific scrambled image.
Dependent claims: 6, 7, 8, 9
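The server-side flow of claim 5 (select a stored network, obtain an output class number, look up the word) can be sketched as follows. This is a hedged illustration only: `TrainedNetwork`, `recognize`, and the `select` callback are hypothetical names, the claim text here does not say how a network is selected, and the `classify` function is a trivial stand-in rather than a convolutional neural network.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Image = Sequence[Sequence[int]]

@dataclass
class TrainedNetwork:
    """One stored network, fine-tuned for one shuffling pattern and vocabulary."""
    classify: Callable[[Image], int]  # returns an output class number
    vocabulary: Dict[int, str]        # output class number -> word

def recognize(
    scrambled_images: Sequence[Image],
    networks: Dict[str, TrainedNetwork],
    select: Callable[[Image], str],
) -> List[str]:
    """Evaluate each scrambled image individually; nothing is unscrambled."""
    words = []
    for image in scrambled_images:
        network = networks[select(image)]        # selection rule is an assumption
        class_number = network.classify(image)   # stand-in for a CNN forward pass
        words.append(network.vocabulary[class_number])
    return words

# Stand-in "network": parity of the first pixel plays the role of the classifier.
networks = {
    "pattern_a": TrainedNetwork(
        classify=lambda img: img[0][0] % 2,
        vocabulary={0: "invoice", 1: "total"},
    )
}
obfuscated_document = [[[4, 9], [7, 1]], [[3, 3], [8, 2]]]
words = recognize(obfuscated_document, networks, select=lambda img: "pattern_a")
```

The point of the sketch is the data flow: the only things the server handles are scrambled images, class numbers, and vocabulary entries.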
10. A method comprising:
analyzing, by a processor of a computerized device, an image document to identify at least one text region and at least one word image contained in the at least one text region;
for each word image, performing, by the processor, the following: randomly selecting a shuffling pattern, resizing the word image to a predetermined size so that a height and width of the word image are equal to a height and width of a grid of cells that is associated with the shuffling pattern, overlaying the grid of cells onto the word image such that the cells contain portions of the word image, and shuffling positions of the cells within the grid according to the shuffling pattern to move the portions of the word image and create a corresponding scrambled image;
replacing, by the processor, all word images in the image document with corresponding scrambled images to generate an obfuscated document; and
communicating, by a network interface of the computerized device, the obfuscated document to a computer server over a network, wherein the computer server is capable of individually evaluating the scrambled images using trained software to recognize specific words.
Dependent claims: 11, 12, 13
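The document-level steps of claim 10 (replace every word image, then communicate the result) can be sketched as below. The sketch assumes word detection has already produced bounding boxes, models the obfuscated document as a plain list of scrambled images, and serializes it as JSON; none of these choices comes from the claim. `crop`, `obfuscate`, and the `scramble` callback are hypothetical names, and the demo uses a placeholder in place of the grid shuffle.

```python
import json
import random

def crop(page, box):
    """Cut the word image for box = (top, left, height, width) out of the page."""
    top, left, height, width = box
    return [row[left:left + width] for row in page[top:top + height]]

def obfuscate(page, word_boxes, scramble, pattern_count, rng=random):
    """Replace every detected word image by a scrambled image.

    `scramble(word_image, pattern_id)` is a hypothetical callable standing
    for the per-word resize-and-shuffle step; only its output is kept.
    """
    scrambled_images = []
    for box in word_boxes:
        pattern_id = rng.randrange(pattern_count)  # random pattern per word image
        scrambled_images.append(scramble(crop(page, box), pattern_id))
    return {"scrambled_images": scrambled_images}

# Synthetic 6 x 12 page with two assumed word boxes.
page = [[(r * 7 + c) % 256 for c in range(12)] for r in range(6)]
word_boxes = [(0, 0, 3, 5), (3, 6, 3, 6)]

# Placeholder scrambler (mirrors each row); a real client would apply
# the grid shuffle selected by pattern_id instead.
placeholder = lambda word, pattern_id: [row[::-1] for row in word]

obfuscated = obfuscate(page, word_boxes, placeholder, pattern_count=3)
payload = json.dumps(obfuscated)  # what the network interface would send
```

Only `payload` would leave the client in this sketch; the original `page` is never serialized.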
14. A method comprising:
storing, in a memory of a computer server, multiple trained convolutional neural networks, each trained convolutional neural network having been initially developed using a database of images and then fine-tuned using scrambled word images acquired by scrambling specific word images from a specific vocabulary set using a specific shuffling pattern, and each word in the specific vocabulary set being associated with a corresponding output class number;
receiving, by a network interface of the computer server from a computerized device over a network, an obfuscated document, the obfuscated document created by the computerized device from an image document comprising at least one word image, and the obfuscated document comprising at least one scrambled image, each scrambled image corresponding to a single word image in the image document; and
processing, by a processor of the computer server, the obfuscated document without reconstructing the image document, the processing comprising evaluating each specific scrambled image individually to recognize a specific word represented by the specific scrambled image, and the processing of the obfuscated document further comprising executing a selected trained convolutional neural network to produce a specific output class number for the specific scrambled image and, thereby, to recognize the specific word associated with the specific output class number and represented by the specific scrambled image.
Dependent claims: 15, 16, 17, 18
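Claim 14 describes each network as fine-tuned on word images from a vocabulary set, scrambled with one specific shuffling pattern, with each word tied to an output class number. A minimal sketch of how such a labelled set might be assembled follows. It assumes the vocabulary images are already at the grid size, assigns class numbers in alphabetical order, and stops short of any actual CNN training; `shuffle_cells` and `build_fine_tuning_set` are hypothetical names.

```python
def shuffle_cells(image, pattern, rows, cols):
    """Shuffle grid cells of an image whose size is a multiple of the grid:
    destination cell i receives source cell pattern[i]."""
    cell_h, cell_w = len(image) // rows, len(image[0]) // cols
    out = [[0] * (cols * cell_w) for _ in range(rows * cell_h)]
    for y in range(rows * cell_h):
        for x in range(cols * cell_w):
            src = pattern[(y // cell_h) * cols + x // cell_w]
            src_y = (src // cols) * cell_h + y % cell_h
            src_x = (src % cols) * cell_w + x % cell_w
            out[y][x] = image[src_y][src_x]
    return out

def build_fine_tuning_set(vocabulary_images, pattern, rows, cols):
    """Return (word -> class number, list of (scrambled image, class number))."""
    class_numbers = {word: n for n, word in enumerate(sorted(vocabulary_images))}
    samples = [
        (shuffle_cells(image, pattern, rows, cols), class_numbers[word])
        for word, images in sorted(vocabulary_images.items())
        for image in images
    ]
    return class_numbers, samples

# Tiny synthetic vocabulary: 2 x 4 "word images", a 1 x 2 grid, cells swapped.
vocabulary_images = {
    "total": [[[1, 2, 3, 4], [5, 6, 7, 8]]],
    "invoice": [[[9, 9, 0, 0], [9, 9, 0, 0]], [[8, 8, 1, 1], [8, 8, 1, 1]]],
}
class_numbers, samples = build_fine_tuning_set(vocabulary_images, [1, 0], rows=1, cols=2)
```

One such set would be built per shuffling pattern, matching the claim's one-network-per-pattern arrangement; the resulting pairs could then be fed to whatever training framework is used.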
Specification