Label-embedding for text recognition
First Claim
1. A method for comparing a text image and a character string comprising:
embedding a character string into a vectorial space, comprising extracting a set of features from the character string and generating a character string representation based on the extracted character string features;
embedding a text image into a vectorial space, comprising extracting a set of features from the text image and generating a text image representation based on the extracted text image features; and
computing a compatibility between the text image representation and character string representation comprising computing a function of the text image representation and character string representation, the function including an embedding parameter w which is a DE-dimensional vector or a D×E matrix W which embeds the text image representation and character string representation into a new space, where D is the dimensionality of the text image representation and E is the dimensionality of the character string representation,
wherein at least one of the embedding and the computing of the compatibility is performed with a processor.
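The claim allows the embedding parameter to be either a DE-dimensional vector w or a D×E matrix W. One way to see why the two can be interchangeable is sketched below, under the assumption that the function is bilinear in the two representations (the form recited in claim 3). The symbols θ ∈ ℝ^D (text image representation) and φ ∈ ℝ^E (character string representation) are shorthand introduced here for illustration, and w is taken to be the entries W_de stacked in the same order as the products θ_d φ_e.

```latex
\theta^{T} W \varphi
  \;=\; \sum_{d=1}^{D} \sum_{e=1}^{E} W_{de}\,\theta_{d}\,\varphi_{e}
  \;=\; w^{T} (\theta \otimes \varphi),
\qquad w \in \mathbb{R}^{DE},\quad \theta \otimes \varphi \in \mathbb{R}^{DE}
```

Under that assumption, scoring with the matrix W and scoring with the stacked vector w give the same value; only the bookkeeping differs.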
Abstract
A system and method for comparing a text image and a character string are provided. The method includes embedding a character string into a vectorial space by extracting a set of features from the character string and generating a character string representation based on the extracted features, such as a spatial pyramid bag of characters (SPBOC) representation. A text image is embedded into a vectorial space by extracting a set of features from the text image and generating a text image representation based on the text image extracted features. A compatibility between the text image representation and the character string representation is computed, which includes computing a function of the text image representation and character string representation.
24 Claims
1. A method for comparing a text image and a character string comprising:
embedding a character string into a vectorial space, comprising extracting a set of features from the character string and generating a character string representation based on the extracted character string features;
embedding a text image into a vectorial space, comprising extracting a set of features from the text image and generating a text image representation based on the extracted text image features; and
computing a compatibility between the text image representation and character string representation comprising computing a function of the text image representation and character string representation, the function including an embedding parameter w which is a DE-dimensional vector or a D×E matrix W which embeds the text image representation and character string representation into a new space, where D is the dimensionality of the text image representation and E is the dimensionality of the character string representation,
wherein at least one of the embedding and the computing of the compatibility is performed with a processor.
Dependent claims: 2, 4, 5, 6, 7, 8, 9, 14, 15, 16, 17, 18, 19, 20, 21
3. A method for comparing a text image and a character string comprising:
embedding a character string into a vectorial space, comprising extracting a set of features from the character string and generating a character string representation based on the extracted character string features;
embedding a text image into a vectorial space, comprising extracting a set of features from the text image and generating a text image representation based on the extracted text image features; and
computing a compatibility between the text image representation and character string representation comprising computing a function of the text image representation and character string representation, wherein the function is of the form:
$$F(x,y;W)=\theta(x)^{T}\,W\,\varphi(y) \qquad (5)$$
where F represents the compatibility between an image and a character string, given a matrix of weights W, $\theta(x)$ represents one of the text image representation and the character string representation and $\varphi(y)$ represents the other of the text image representation and the character string representation, and T represents the transpose operator,
or where an approximation $W \approx U^{T}V$ is used, with $U \in \mathbb{R}^{R\times D}$, $V \in \mathbb{R}^{R\times E}$, where $R<D$, is a function of the form:
$$F(x,y;W)=(U\theta(x))^{T}(V\varphi(y)) \qquad (6)$$
and
wherein at least one of the embedding and the computing of the compatibility is performed with a processor.
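The two forms recited in claim 3 can be illustrated numerically. The following is a minimal sketch, not the patented implementation: the helper names (`compat_full`, `compat_low_rank`, `transpose_product`) and the toy dimensions D = 3, E = 2, R = 1 are assumptions chosen for illustration. It shows that when W is exactly the product of the transpose of U with V, equation (5) and equation (6) return the same score, while (6) only needs the two R-dimensional projections.

```python
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]  # list of rows


def matvec(m: Matrix, v: Vector) -> List[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def compat_full(theta: Vector, W: Matrix, phi: Vector) -> float:
    """Equation (5): theta^T W phi, with W of shape D x E."""
    return dot(theta, matvec(W, phi))


def compat_low_rank(theta: Vector, U: Matrix, V: Matrix, phi: Vector) -> float:
    """Equation (6): (U theta)^T (V phi), with U of shape R x D, V of shape R x E."""
    return dot(matvec(U, theta), matvec(V, phi))


def transpose_product(U: Matrix, V: Matrix) -> List[List[float]]:
    """Return U^T V (shape D x E), the matrix that W is approximated by."""
    rank, dim_d, dim_e = len(U), len(U[0]), len(V[0])
    return [
        [sum(U[r][d] * V[r][e] for r in range(rank)) for e in range(dim_e)]
        for d in range(dim_d)
    ]


# Toy values (hypothetical): D = 3, E = 2, R = 1
theta = [1.0, 2.0, 3.0]   # stands in for the text image representation
phi = [0.5, -1.0]         # stands in for the character string representation
U = [[1.0, 0.0, 2.0]]
V = [[3.0, 1.0]]
W = transpose_product(U, V)

score_full = compat_full(theta, W, phi)
score_low_rank = compat_low_rank(theta, U, V, phi)
```

In this toy case both scores agree because W is built from U and V exactly; with a learned full-rank W, equation (6) would generally only approximate equation (5).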
10. A method for comparing a text image and a character string comprising:
embedding a character string into a vectorial space, comprising:
extracting a spatial pyramid bag-of-characters comprising partitioning the character string into a plurality of regions, for each of the regions, extracting features based on the characters present in the region, extracting a representation of each of the regions based on the respective extracted features, and generating a character string representation, the character string representation being derived from the region representations;
embedding a text image into a vectorial space, comprising extracting a set of features from the text image and generating a text image representation based on the extracted text image features; and
computing a compatibility between the text image representation and character string representation comprising computing a function of the text image representation and character string representation,
wherein at least one of the embedding and the computing of the compatibility is performed with a processor.
Dependent claims: 11, 12, 13
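The spatial pyramid bag-of-characters steps recited above (partition the string into regions, describe each region by the characters it contains, then combine the region descriptions) can be sketched as follows. This is an illustrative sketch under several assumptions that the claim does not fix: a 36-symbol lowercase alphanumeric alphabet, binary splitting at each pyramid level, each character occupying an equal share of the string width, fractional counting for characters that straddle a region boundary, and plain concatenation of the region histograms. The names `ALPHABET`, `region_histogram` and `spboc` are hypothetical.

```python
from typing import List

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"  # assumed character set


def region_histogram(text: str, start: float, end: float) -> List[float]:
    """Per-character counts for the region [start, end) of a unit-length string.

    Each character is assumed to span an equal slice of the string; a character
    straddling the region boundary contributes only the fraction of its slice
    that lies inside the region.
    """
    hist = [0.0] * len(ALPHABET)
    n = len(text)
    for i, ch in enumerate(text.lower()):
        idx = ALPHABET.find(ch)
        if idx < 0:
            continue  # ignore characters outside the assumed alphabet
        c_start, c_end = i / n, (i + 1) / n
        overlap = min(end, c_end) - max(start, c_start)
        if overlap > 0:
            hist[idx] += overlap * n  # fraction of this character in the region
    return hist


def spboc(text: str, levels: int = 3) -> List[float]:
    """Concatenate region histograms over a pyramid with 1, 2, 4, ... regions."""
    if not text:
        raise ValueError("text must be non-empty")
    features: List[float] = []
    for level in range(levels):
        parts = 2 ** level  # assumed binary split at every level
        for p in range(parts):
            features.extend(region_histogram(text, p / parts, (p + 1) / parts))
    return features


# Two-level example: one whole-string region plus its left and right halves.
vec = spboc("abc", levels=2)
```

With two levels the result has three region histograms of 36 entries each. In the left half of "abc", the letter "b" receives a count of about 0.5 because only half of its slice falls in that region. Unlike a plain bag of characters, the finer levels retain coarse information about where in the string each character occurs.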
22. A system for comparing a text image and a character string comprising:
a text string representation generator for generating a character string representation based on features extracted from a character string, the character string consisting of a sequence of characters, the character string representation comprising a spatial pyramid bag of characters representation generated by partitioning the character string into at least two regions and partitioning each of the at least two regions into at least two smaller regions, the character string representation being based on representations of features extracted from the regions;
a text image representation generator for generating a text image representation based on features extracted from a text image;
a comparator for computing a compatibility between the text image representation and the character string representation;
an output component for outputting information based on the computed compatibility between at least one character string representation and at least one text image representation; and
a processor which implements the text string representation generator, text image representation generator, comparator, and output component.
23. A method for comparing a text image and a character string comprising:
for at least one character string comprising a sequence of characters, extracting a set of features from the character string, comprising partitioning the character string to form a spatial pyramid of regions, and for each region, generating a representation of the region comprising counting occurrences of each of a finite set of characters in the region and generating a region representation based on the counts;
generating a character string representation based on the region representations;
extracting a set of features from the text image and generating a text image representation based thereon; and
computing a compatibility between the text image representation and the character string representation comprising embedding at least one of the character string representation and the text image representation with a matrix of learned parameters, the compatibility being a function of the at least one embedded representation,
wherein at least one of the embedding and the computing of the compatibility is performed with a processor.
Dependent claims: 24
Specification