Privacy-preserving text to image matching
Abstract
A method for text-to-image matching includes generating representations of text images, such as license plate images, by embedding each text image into a first vectorial space with a first embedding function. With a second embedding function, a character string, such as a license plate number to be matched, is embedded into a second vectorial space to generate a character string representation. A compatibility is computed between the character string representation and one or more of the text image representations to identify a matching one. The compatibility is computed with a function that uses a transformation which is learned on a training set of labeled images. The learning uses a loss function that aggregates a text-to-image loss and an image-to-text loss over the training set. The image-to-text loss penalizes the transformation when it incorrectly ranks a pair of character string representations, given an image representation corresponding to one of them.
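The matching step described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the compatibility is the bilinear form x^T W y, where x is a stored text image representation, y is the character string representation, and W is the learned transformation. The names `compatibility` and `best_match`, and the toy vectors, are hypothetical and do not come from the patent.

```python
def compatibility(x, W, y):
    """Assumed bilinear score x^T W y between an image vector x and a string vector y."""
    return sum(x[i] * W[i][j] * y[j]
               for i in range(len(x))
               for j in range(len(y)))


def best_match(image_reps, W, y):
    """Score every stored image representation against y; return (best index, all scores)."""
    scores = [compatibility(x, W, y) for x in image_reps]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy data (hypothetical): three stored 2-D image representations,
# a 2x3 transformation, and one 3-D character string representation.
image_reps = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
query = [0.0, 1.0, 0.0]

index, scores = best_match(image_reps, W, query)
```

Only the stored representations and the query representation are needed at matching time in this sketch; the first and second spaces may have different dimensions because W maps between them.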
22 Claims
1. A method for text-to-image matching comprising:
storing a set of text image representations, each text image representation having been generated by embedding a respective text image into a first vectorial space with a first embedding function;
with a second embedding function, embedding a character string into a second vectorial space to generate a character string representation;
for each of at least some of the text image representations, computing a compatibility between the character string representation and the text image representation, comprising computing a function of the text image representation, character string representation, and a transformation, the transformation having been derived by minimizing a loss function on a set of labeled training images, the loss function including a text-to-image loss and an image-to-text loss; and
identifying a matching text image based on the computed compatibilities, wherein at least one of the embedding and the computing of the compatibility is performed with a processor.
Dependent claims: 2-14.
15. A method for learning a compatibility function for use in a text-to-image matching system comprising:
for each of a set of training images, each labeled with a respective character string, generating a text image representation by embedding the training image into a first vectorial space with a first embedding function;
with a second embedding function, embedding the character strings into a second vectorial space to generate character string representations;
learning a compatibility function which is a function of a compatibility matrix, a character string representation of a character string to be matched, and a text image representation for one of a set of text images, the learning including identifying a compatibility matrix which minimizes a loss function on the set of labeled training images, the loss function aggregating a text-to-image loss and an image-to-text loss; and
outputting the learned compatibility function including the compatibility matrix which minimizes the loss function, wherein at least one of the embedding and the learning of the compatibility function is performed with a processor.
Dependent claims: 16-20.
21. A system comprising:
a text image representation generator which, for each of a set of training images, each labeled with a respective character string, generates a text image representation by embedding the training image into a first vectorial space with a first embedding function;
a text string representation generator which embeds the character strings into a second vectorial space with a second embedding function to generate character string representations;
a training component which learns a compatibility function which is a function of a compatibility matrix, a character string representation of a character string to be matched, and a text image representation for one of a set of text images, the learning including identifying a compatibility matrix which minimizes a loss function on the set of labeled training images, the loss function aggregating, over the set of images, a text-to-image loss and an image-to-text loss; and
a processor which implements the text image representation generator, text string representation generator, and training component.
Dependent claims: 22.
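The learning recited in claims 15 and 21 can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: it assumes a bilinear compatibility x^T W y, a hinge ranking loss with margin 1 for both the text-to-image and image-to-text terms, plain subgradient updates, and training examples with distinct labels. The names `score`, `hinge`, `total_loss`, and `train`, and the toy data, are hypothetical.

```python
def score(x, W, y):
    """Assumed bilinear compatibility x^T W y."""
    return sum(x[i] * W[i][j] * y[j]
               for i in range(len(x))
               for j in range(len(y)))


def hinge(pos, neg, margin=1.0):
    """Zero when the matching pair outscores the non-matching pair by at least the margin."""
    return max(0.0, margin - pos + neg)


def total_loss(pairs, W):
    """Aggregate both ranking losses over all ordered pairs of distinct training examples."""
    loss = 0.0
    for a, (xa, ya) in enumerate(pairs):
        for b, (xb, yb) in enumerate(pairs):
            if a == b:
                continue
            pos = score(xa, W, ya)
            loss += hinge(pos, score(xb, W, ya))  # text-to-image: string ya, images xa vs xb
            loss += hinge(pos, score(xa, W, yb))  # image-to-text: image xa, strings ya vs yb
    return loss


def train(pairs, W, lr=0.1, epochs=20):
    """Subgradient descent on the aggregated loss; updates W in place and returns it."""
    for _ in range(epochs):
        for a, (xp, yp) in enumerate(pairs):
            for b, (xn, yn) in enumerate(pairs):
                if a == b:
                    continue
                pos = score(xp, W, yp)
                # Check both ranking constraints at the current W before updating.
                t2i_violated = hinge(pos, score(xn, W, yp)) > 0
                i2t_violated = hinge(pos, score(xp, W, yn)) > 0
                for i in range(len(xp)):
                    for j in range(len(yp)):
                        if t2i_violated:
                            W[i][j] += lr * (xp[i] - xn[i]) * yp[j]
                        if i2t_violated:
                            W[i][j] += lr * xp[i] * (yp[j] - yn[j])
    return W


# Toy training set (hypothetical): two (image representation, string representation) pairs.
pairs = [([1.0, 0.0], [1.0, 0.0]),
         ([0.0, 1.0], [0.0, 1.0])]
W = [[0.0, 0.0], [0.0, 0.0]]

initial_loss = total_loss(pairs, W)
train(pairs, W)
final_loss = total_loss(pairs, W)
```

In this sketch the text-to-image term fixes a string and compares a matching image with a non-matching one, while the image-to-text term fixes an image and compares its true string with another string, so W is penalized for misranking in either direction.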
Specification