Font attributes for font recognition and similarity

US 10,699,166 B2
Filed: 12/22/2017
Issued: 06/30/2020
Est. Priority Date: 10/06/2015
Status: Active Grant

Abstract

Font recognition and similarity determination techniques and systems are described. In a first example, localization techniques are described to train a model using machine learning (e.g., a convolutional neural network) using training images. The model is then used to localize text in a subsequently received image, and may do so automatically and without user intervention, e.g., without specifying any of the edges of a bounding box. In a second example, a deep neural network is directly learned as an embedding function of a model that is usable to determine font similarity. In a third example, techniques are described that leverage attributes described in metadata associated with fonts as part of font recognition and similarity determinations.
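The second example in the abstract, a network learned as an embedding function for font similarity, can be illustrated with a minimal sketch. It assumes each font has already been mapped to an embedding vector and that similarity is scored by cosine similarity. The abstract names neither the metric nor the embedding size, so the metric, the font names, and the three-dimensional vectors below are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)

def rank_similar_fonts(query, embeddings):
    """Return (font_name, score) pairs sorted from most to least similar."""
    scores = [(name, cosine_similarity(query, vec)) for name, vec in embeddings.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Hypothetical embeddings; a trained network would emit many more dimensions.
FONT_EMBEDDINGS = {
    "SerifA": [0.9, 0.1, 0.0],
    "SerifB": [0.8, 0.2, 0.1],
    "ScriptC": [0.0, 0.2, 0.9],
}

ranking = rank_similar_fonts(FONT_EMBEDDINGS["SerifA"], FONT_EMBEDDINGS)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy data the two serif fonts rank above the script font, which is the behavior a similarity embedding is meant to produce.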

20 Claims

1. In a digital medium environment to recognize a font in rendered text in an image or determine similarity of the font in the rendered text in the image to other fonts, a method implemented by a computing device, the method comprising:
- predicting, automatically and without user intervention by the computing device, a bounding box for the rendered text in the image using a model that is trained using machine learning as applied to a plurality of training images having text rendered using the font;
  
  normalizing coordinates of boundaries of the rendered text using the font in the plurality of training images; and
  
  generating, by the computing device, an indication of the predicted bounding box, the indication specifying a region of the image that includes the rendered text having the font to be recognized.
- Dependent claims (2 to 8):
  - 2. The method as recited in claim 1, wherein the text rendered using the font in the plurality of training images include one or more perturbations added to the text rendered using the font in the plurality of training images.
  - 3. The method as recited in claim 1, wherein the normalized coordinates of the boundaries of the rendered text are used as ground truth for the machine learning.
  - 4. The method as recited in claim 1, wherein the model is a convolutional neural network learned by a stochastic gradient descent technique.
  - 5. The method as recited in claim 1, wherein the predicting the bounding box further includes applying a horizontal squeeze to the image, an amount of the horizontal squeeze matching a training setting of the model.
  - 6. The method as recited in claim 5, wherein the predicting the bounding box further includes:
    - forming overlapping cropped images from the squeezed image;
      
      feeding the cropped images into the model; and
      
      obtaining a bounding box prediction for the cropped images using the model.
  - 7. The method as recited in claim 6, wherein the generating the indication of the predicted bounding box further includes calculating top and bottom lines for the obtained bounding box prediction for the cropped images.
  - 8. The method as recited in claim 1, further comprising determining a rotation of text in the image.
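The inference steps recited in claims 5 to 7 (horizontal squeeze, overlapping crops, per-crop prediction, top and bottom lines) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the image is a plain list of pixel rows, `stub_model` is a hypothetical stand-in for the trained network, and taking the median of the per-crop predictions is an assumed aggregation rule that the claims do not specify.

```python
from statistics import median

def squeeze_horizontally(image, ratio):
    """Resample every row to round(width / ratio) columns (nearest neighbour)."""
    width = len(image[0])
    new_width = max(1, round(width / ratio))
    return [[row[min(width - 1, int(x * ratio))] for x in range(new_width)]
            for row in image]

def overlapping_crops(image, crop_width, stride):
    """Return fixed-width crops whose start columns advance by `stride`."""
    width = len(image[0])
    last_start = max(0, width - crop_width)
    starts = list(range(0, last_start + 1, stride))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the right edge is covered
    return [[row[s:s + crop_width] for row in image] for s in starts]

def stub_model(crop):
    """Hypothetical placeholder for the trained network.

    Returns normalized (top, bottom) of the rows containing any non-zero
    pixel, or None when the crop is empty.
    """
    height = len(crop)
    inked = [y for y, row in enumerate(crop) if any(row)]
    if not inked:
        return None
    return inked[0] / height, (inked[-1] + 1) / height

def predict_text_lines(image, ratio, crop_width, stride, model=stub_model):
    """Squeeze, crop, predict per crop, then aggregate top and bottom lines."""
    squeezed = squeeze_horizontally(image, ratio)
    crops = overlapping_crops(squeezed, crop_width, stride)
    preds = [p for p in (model(c) for c in crops) if p is not None]
    if not preds:
        return None
    height = len(image)
    top = median(p[0] for p in preds) * height      # assumed aggregation rule
    bottom = median(p[1] for p in preds) * height
    return top, bottom

# Toy 8x24 image with "text" occupying rows 2 through 5.
image = [[1 if 2 <= y <= 5 else 0 for _ in range(24)] for y in range(8)]
lines = predict_text_lines(image, ratio=2.0, crop_width=6, stride=3)
print(lines)
```

The `ratio` argument stands for the squeeze amount that, per claim 5, should match the setting used when the model was trained.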

9. In a digital medium environment to recognize a font in rendered text in an image or determine similarity of the font in the rendered text in the image to other fonts, a system implemented by at least one computing device, the system comprising:
- a training set generation module implemented at least partially in hardware of a computing device to:
  
  generate a training image and a font collection that includes additional training images, and
  
  render text in the training image and the additional training images in the font collection using a selection of fonts, the rendered text including one or more perturbations; and
  
  a machine learning module implemented at least partially in hardware of the computing device to obtain the training image and the font collection and train a model to predict, automatically and without user intervention, bounding boxes using machine learning applied to the training image and the font collection.
- Dependent claims (10 to 15):
  - 10. The system as described in claim 9, wherein the training set generation module is further configured to generate the training image and the additional images in the font collection as synthetic images using a selection of fonts.
  - 11. The system as described in claim 9, wherein the training set generation module is further configured to:
    - generate the training image and the font collection including the additional training images using the rendered text including the one or more perturbations.
  - 12. The system as described in claim 9, wherein the model is a convolutional neural network, and wherein the machine learning module is further configured to train the convolutional neural network by a stochastic gradient descent technique.
  - 13. The system as described in claim 9, further comprising:
    - a text localization module implemented at least partially in the hardware of the computing device to:
      
      obtain the model;
      
      predict a bounding box for the image using the obtained model, in part, by applying horizontal squeeze to the image, an amount of the horizontal squeeze matching a training setting of the obtained model; and
      
      generate an indication of the predicted bounding box specifying a region of the image that includes text having a font to be recognized.
  - 14. The system as described in claim 13, wherein the predicting the bounding box further includes:
    - forming overlapping cropped images from the squeezed image;
      
      feeding the cropped images into the model; and
      
      obtaining a bounding box prediction for the cropped images using the model.
  - 15. The system as described in claim 14, wherein the generating the indication of the predicted bounding box further includes calculating top and bottom lines for the obtained bounding box prediction for the cropped images.

16. In a digital medium environment to recognize a font in rendered text in an image or determine similarity of the font in the rendered text in the image to other fonts, a system implemented by at least one computing device, the system comprising:
- means for obtaining a plurality of training images having a text rendered using a font;
  
  means for training a model to:
  
  predict, automatically and without user intervention, bounding boxes for text in images, the model trained using machine learning as applied to the plurality of training images having text rendered using the font, and
  
  normalize coordinates of boundaries of the text rendered using the font in the plurality of training images;
  
  means for predicting, automatically and without user intervention, a bounding box for the rendered text in the image using the model; and
  
  means for generating an indication of the predicted bounding box, the indication specifying a region of the image that includes the rendered text having a font to be recognized.
- Dependent claims (17 to 20):
  - 17. The system as described in claim 16, wherein the text rendered using the font in the plurality of training images include one or more perturbations added to the text rendered using the font in the plurality of training images.
  - 18. The system as described in claim 16, wherein the normalized coordinates of the boundaries of the text are used as ground truth for the machine learning.
  - 19. The system as described in claim 16, wherein the model is a convolutional neural network learned by a stochastic gradient descent technique.
  - 20. The system as described in claim 16, wherein the means for generating the indication is further configured to determine a rotation of text in the image.
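The training-side elements of the claims (perturbed training images, with normalized boundary coordinates used as ground truth) can be pictured with a small sketch. Everything here is an assumption made for illustration: a filled rectangle stands in for rendered text, random pixel flips stand in for the unspecified perturbations, and the function names and the `noise_prob` parameter are hypothetical.

```python
import random

def normalize_box(box, width, height):
    """Scale pixel coordinates (left, top, right, bottom) into the range [0, 1]."""
    left, top, right, bottom = box
    return (left / width, top / height, right / width, bottom / height)

def make_training_sample(width, height, box, noise_prob=0.05, seed=0):
    """Build one synthetic (image, target) pair.

    The image is a binary grid with an inked rectangle standing in for
    rendered text. Random pixel flips act as a simple perturbation. The
    target is the normalized box, usable as a regression ground truth.
    """
    rng = random.Random(seed)  # seeded so the sample is reproducible
    left, top, right, bottom = box  # right and bottom are exclusive
    image = [[1 if (left <= x < right and top <= y < bottom) else 0
              for x in range(width)]
             for y in range(height)]
    for y in range(height):
        for x in range(width):
            if rng.random() < noise_prob:
                image[y][x] ^= 1  # flip the pixel
    return image, normalize_box(box, width, height)

image, target = make_training_sample(100, 32, box=(10, 8, 90, 24))
print(target)
```

Normalizing the target makes it independent of image resolution, so one model output range can serve training images of different sizes.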

Current Assignee: Adobe Inc.
Original Assignee: Adobe Inc.
Inventors: Wang, Zhaowen; Liu, Luoqi; Jin, Hailin
Primary Examiner(s): Park, Edward
Application Number: US15/853,120
Publication Number: US 20180114097A1
Time in Patent Office: 921 Days

CPC Class Codes

G06F 18/2148   characterised by the proces...

G06F 18/24137   Distances to cluster centroïds

G06F 18/2415   based on parametric or prob...

G06N 3/045   Combinations of networks

G06V 10/82   using neural networks

G06V 30/10   Character recognition

G06V 30/18057   Integrating the filters int...

G06V 30/19173   Classification techniques

G06V 30/245   Font recognition
