Font Recognition using Text Localization
First Claim
1. In a digital medium environment to improve image font recognition through use of text localization, a method implemented by one or more computing devices comprising:
- obtaining a model, by the one or more computing devices, that is trained using machine learning as applied to a plurality of training images having text rendered using a corresponding font;
- predicting a bounding box, automatically and without user intervention by the one or more computing devices, for text in an image received using the obtained model by forming a plurality of cropped portions of the image and processing each of the plurality of cropped portions of the image by the model independently, one to another; and
- generating an indication of the predicted bounding box by the one or more computing devices, the indication usable to specify a region of the image that includes the text having a font to be recognized.
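The crop-and-predict step of this claim can be illustrated with a minimal sketch. Everything named here is hypothetical: the claim does not say how the crops are chosen or how the per-crop results are combined, so this sketch assumes five crops (four corners and the center) and a plain average of the predicted boxes. `ink_extent_model` is a stand-in for the trained model and simply returns the tight box around nonzero pixels.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]                    # rows of pixel values; nonzero means "ink"
Box = Tuple[float, float, float, float]    # (left, top, right, bottom), right/bottom exclusive


def crop_origins(img_w: int, img_h: int, crop_w: int, crop_h: int) -> List[Tuple[int, int]]:
    """Top-left corners of five crops: four corners plus the center (an assumed layout)."""
    dx, dy = img_w - crop_w, img_h - crop_h
    return [(0, 0), (dx, 0), (0, dy), (dx, dy), (dx // 2, dy // 2)]


def crop_image(image: Image, left: int, top: int, crop_w: int, crop_h: int) -> Image:
    """Cut a crop_w x crop_h window out of the image."""
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def ink_extent_model(patch: Image) -> Box:
    """Stand-in for the trained model: tight box around nonzero pixels, in patch coordinates."""
    points = [(x, y) for y, row in enumerate(patch) for x, v in enumerate(row) if v]
    if not points:
        return (0, 0, len(patch[0]), len(patch))  # no ink found: fall back to the whole patch
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def predict_bounding_box(image: Image, model: Callable[[Image], Box],
                         crop_w: int, crop_h: int) -> Box:
    """Run the model on each crop independently, then average the boxes in image coordinates."""
    img_h, img_w = len(image), len(image[0])
    boxes = []
    for left, top in crop_origins(img_w, img_h, crop_w, crop_h):
        l, t, r, b = model(crop_image(image, left, top, crop_w, crop_h))
        # Translate the per-crop prediction back into the full image's coordinate frame.
        boxes.append((l + left, t + top, r + left, b + top))
    n = len(boxes)
    return tuple(sum(box[i] for box in boxes) / n for i in range(4))
```

Because each crop is processed on its own, the per-crop calls could run in parallel. Averaging is only one way to combine them; a median or a confidence-weighted vote would fit the same structure.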
Abstract
Font recognition and similarity determination techniques and systems are described. In a first example, localization techniques are described to train a model using machine learning (e.g., a convolutional neural network) using training images. The model is then used to localize text in a subsequently received image, and may do so automatically and without user intervention, e.g., without specifying any of the edges of a bounding box. In a second example, a deep neural network is directly learned as an embedding function of a model that is usable to determine font similarity. In a third example, techniques are described that leverage attributes described in metadata associated with fonts as part of font recognition and similarity determinations.
20 Claims
12. In a digital medium environment to improve image font recognition through use of text localization, a method implemented by one or more computing devices comprising:
- obtaining a model, by the one or more computing devices, that is trained using machine learning over a plurality of iterations using a plurality of training images, at least one iteration of the plurality of iterations using the plurality of training images having text rendered using the corresponding font and performed for one or more subsequent ones of the plurality of iterations in which one or more perturbations are introduced to the training images;
- predicting a bounding box, automatically and without user intervention by the one or more computing devices, for text in an image received using the obtained model; and
- generating an indication of the predicted bounding box by the one or more computing devices, the indication usable to specify a region of the image that includes the text having a font to be recognized.
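The iteration scheme in claim 12 (clean training images first, perturbed images in later iterations) can be sketched as a batch generator. This is an illustration under stated assumptions, not the patented training procedure: the claim does not name the perturbations, so `add_noise` and `shift_right` are hypothetical examples, and cycling through them one per iteration is an assumed schedule. Each sample pairs an image with its ground-truth box, so a geometric perturbation has to move the box as well.

```python
import random
from typing import Callable, Iterator, List, Sequence, Tuple

Box = Tuple[int, int, int, int]              # (left, top, right, bottom), right/bottom exclusive
Sample = Tuple[List[List[float]], Box]       # (grayscale image with values in [0, 1], text box)
Perturbation = Callable[[Sample, random.Random], Sample]


def add_noise(sample: Sample, rng: random.Random) -> Sample:
    """Hypothetical perturbation: jitter each pixel slightly; the box is unchanged."""
    image, box = sample
    noisy = [[min(1.0, max(0.0, p + rng.uniform(-0.1, 0.1))) for p in row] for row in image]
    return noisy, box


def shift_right(sample: Sample, rng: random.Random) -> Sample:
    """Hypothetical perturbation: shift the image one pixel right and move the box with it.

    `rng` is unused here; it is accepted so all perturbations share one signature.
    """
    image, (left, top, right, bottom) = sample
    width = len(image[0])
    shifted = [[0.0] + row[:-1] for row in image]
    return shifted, (min(left + 1, width), top, min(right + 1, width), bottom)


def training_batches(samples: Sequence[Sample], iterations: int,
                     perturbations: Sequence[Perturbation], seed: int = 0) -> Iterator[List[Sample]]:
    """Yield one batch per iteration: clean samples first, perturbed samples afterwards."""
    rng = random.Random(seed)
    for step in range(iterations):
        if step == 0 or not perturbations:
            yield list(samples)                       # first iteration: text as rendered
        else:
            perturb = perturbations[(step - 1) % len(perturbations)]
            yield [perturb(s, rng) for s in samples]  # later iterations: perturbed copies
```

A training loop would consume one batch per iteration and update the model on it. Starting from clean renderings and adding perturbations later is a form of curriculum, which is one plausible reading of the claim language.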
20. In a digital medium environment to train a model to improve image font recognition through use of text localization, a system comprising one or more computing devices including a processing system and memory having instructions stored thereon that are executable by the processing system to perform operations comprising:
- obtaining a plurality of training images having text rendered using a corresponding font, the images including:
  - an anchor image having text rendered using a corresponding font type;
  - the positive image having text that is different than the text of the anchor image or text having one or more applied perturbations; and
  - the negative image having text that is not in the font type; and
- training the model to predict a bounding box for text in an image, the model trained using machine learning as applied to the plurality of training images having text rendered using the corresponding font.
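The anchor, positive, and negative images in claim 20 can be illustrated by a small triplet-selection sketch. The names `Rendering`, `make_triplet`, and `triplet_loss` are hypothetical. The claim does not state a loss function; the hinge-style triplet loss below is a common choice for training on such triplets and is shown only as an assumption. The sketch picks positives by different text in the same font and does not cover the perturbed-text alternative.

```python
import math
import random
from typing import List, NamedTuple, Sequence, Tuple


class Rendering(NamedTuple):
    """Metadata for one training image: which font rendered which text."""
    font: str
    text: str


def make_triplet(renderings: Sequence[Rendering], anchor: Rendering,
                 rng: random.Random) -> Tuple[Rendering, Rendering, Rendering]:
    """Pick a positive (same font, different text) and a negative (different font)."""
    positives: List[Rendering] = [r for r in renderings
                                  if r.font == anchor.font and r.text != anchor.text]
    negatives: List[Rendering] = [r for r in renderings if r.font != anchor.font]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative for this anchor")
    return anchor, rng.choice(positives), rng.choice(negatives)


def triplet_loss(anchor: Sequence[float], positive: Sequence[float],
                 negative: Sequence[float], margin: float = 1.0) -> float:
    """Assumed loss: penalize when the positive is not closer than the negative by `margin`."""
    d_pos = math.dist(anchor, positive)   # Euclidean distance between embeddings
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

In use, each `Rendering` would be paired with its image, the model would map the three images to vectors, and `triplet_loss` would be evaluated on those vectors. The font names in any example data are illustrative only.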
Specification