Font recognition using text localization

US 10,467,508 B2
Filed: 04/25/2018
Issued: 11/05/2019
Est. Priority Date: 10/06/2015
Status: Active Grant

Abstract

Font recognition and similarity determination techniques and systems are described. In a first example, localization techniques are described to train a model using machine learning (e.g., a convolutional neural network) using training images. The model is then used to localize text in a subsequently received image, and may do so automatically and without user intervention, e.g., without specifying any of the edges of a bounding box. In a second example, a deep neural network is directly learned as an embedding function of a model that is usable to determine font similarity. In a third example, techniques are described that leverage attributes described in metadata associated with fonts as part of font recognition and similarity determinations.

66 Citations

20 Claims

1. In a digital medium environment to improve image font recognition through use of text localization, a method implemented by one or more computing devices comprising:
- obtaining a model, by the one or more computing devices, that is trained using machine learning as applied to a plurality of training images having text rendered using a corresponding font;
  
  predicting a bounding box, automatically and without user intervention by the one or more computing devices, for text in an image received using the obtained model by forming a plurality of cropped portions of the image and processing each of the plurality of cropped portions of the image by the model independently, one to another, the text overlapping a first and second cropped portion of the plurality of cropped portions; and
  
  generating an indication of the predicted bounding box by the one or more computing devices based on a result of the processing of each of the plurality of cropped portions of the image by calculating an average or a median of a top and bottom line of the predicted bounding box, the indication usable to specify a region of the image that includes the text having a font to be recognized.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9):
  - 2. The method as described in claim 1, further comprising recognizing the font of the text in the received image by the one or more computing devices using the generated indication of the predicted bounding box.
  - 3. The method as described in claim 1, wherein the predicting includes processing each of the plurality of cropped portions of the image by a trained convolutional network of the model independently, one to another.
  - 4. The method as described in claim 1, wherein the predicting includes resizing the image by the one or more computing devices to correspond to an image size of the model.
  - 5. The method as described in claim 1, further comprising training the model by the one or more computing devices using the machine learning for a plurality of iterations.
  - 6. The method as described in claim 5, wherein the training is performed for at least one of the plurality of iterations using the plurality of training images having text rendered using the corresponding font and performed for one or more subsequent ones of the plurality of iterations in which one or more perturbations are introduced to the training images.
  - 7. The method as described in claim 6, wherein the perturbations include at least one of noise, rotation, scale, shading, kerning, or cropping.
  - 8. The method as described in claim 5, wherein the machine learning is performed by the one or more computing devices using a convolutional neural network, the convolutional neural network is used as an architecture of the machine learning by the one or more computing devices and stochastic gradient descent is used as a training algorithm of the machine learning by the one or more computing devices.
  - 9. The method as described in claim 1, wherein the font to be recognized in the image is arbitrary such that the model is trainable without using the font.
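
The cropping and aggregation steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the image representation (rows of pixel values with 0 as background), the horizontal sliding window, and every function name here are assumptions made for illustration, and `ink_rows_model` is a trivial stand-in for the trained model, which the claim does not specify beyond being obtained through machine learning.

```python
from statistics import mean, median
from typing import Callable, List, Sequence, Tuple

Image = List[List[int]]      # rows of pixel values; 0 = background (assumption)
Lines = Tuple[float, float]  # (top, bottom) row coordinates of the text


def crop_columns(image: Image, width: int, stride: int) -> List[Image]:
    """Form horizontal crops; neighbours overlap whenever stride < width."""
    total = len(image[0])
    last_start = max(total - width, 0)
    starts = list(range(0, last_start + 1, stride))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the right edge is covered
    return [[row[s:s + width] for row in image] for s in starts]


def ink_rows_model(crop: Image) -> Lines:
    """Hypothetical stand-in for the trained model: first and last inked row."""
    rows = [i for i, row in enumerate(crop) if any(row)]
    if not rows:
        # Arbitrary fallback for an empty crop (assumption): span the full height.
        return (0.0, float(len(crop) - 1))
    return (float(rows[0]), float(rows[-1]))


def aggregate_lines(predictions: Sequence[Lines], how: str = "median") -> Lines:
    """Combine per-crop predictions by the average or the median of each line."""
    reducer = median if how == "median" else mean
    tops, bottoms = zip(*predictions)
    return (reducer(tops), reducer(bottoms))


def localize(image: Image, model: Callable[[Image], Lines],
             width: int, stride: int, how: str = "median") -> Lines:
    """Predict top and bottom lines without any user-supplied box edges."""
    crops = crop_columns(image, width, stride)
    predictions = [model(crop) for crop in crops]  # each crop handled independently
    return aggregate_lines(predictions, how)
```

In this sketch the median is less sensitive than the average to a single crop whose prediction is thrown off by stray pixels, which is one plausible reason the claim offers both options.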

10. In a digital medium environment to improve image font recognition through use of text localization, a system comprising:
- a text localization module implemented at least partially in hardware of at least one computing device to obtain a model that is trained using machine learning as applied to a plurality of training images having text rendered using a corresponding font;
  
  a machine learning module implemented at least partially in the hardware of the at least one computing device to predict a bounding box, automatically and without user intervention, for text in an image by forming a plurality of cropped portions of the image and processing each of the plurality of cropped portions of the image independently, one to another, the text overlapping a first and second cropped portion of the plurality of cropped portions; and
  
  the text localization module further implemented at least partially in the hardware of the at least one computing device to generate an indication of the predicted bounding box based on a result of the processing of each of the plurality of cropped portions of the image by calculating an average or a median of a top and bottom line of the predicted bounding box, the indication is usable to specify a region of the image that includes the text having a font to be recognized.
- Dependent claims (11, 12, 13, 14):
  - 11. The system as described in claim 10, wherein the font to be recognized in the image is arbitrary.
  - 12. The system as described in claim 10, further comprising a font similarity and recognition module implemented at least partially in the hardware of the at least one computing device to recognize the font of the text in the image.
  - 13. The system as described in claim 10, wherein the plurality of training images are organized as tuples to minimize a hinge loss function.
  - 14. The system as described in claim 10, wherein at least one training image of the plurality of training images is sampled with a probability distribution that includes a normalization factor.
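
Claim 13 recites training images organized as tuples to minimize a hinge loss function, without fixing the tuple size or the distance measure. One common reading, assumed here purely for illustration, is a triplet of embeddings (anchor, positive, negative) scored with squared Euclidean distance and a margin. The function names and the default margin of 1.0 are hypothetical.

```python
from typing import Sequence


def squared_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_hinge_loss(anchor: Sequence[float],
                       positive: Sequence[float],
                       negative: Sequence[float],
                       margin: float = 1.0) -> float:
    """Hinge loss for one (anchor, positive, negative) tuple.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin`; otherwise it grows linearly with the gap.
    The triplet form and the margin value are assumptions, not claim language.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, margin + gap)
```

Under this assumed form, a tuple whose dissimilar font already sits well outside the margin contributes no loss, so training effort concentrates on tuples that are still ordered incorrectly or too closely.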

15. In a digital medium environment to improve image font recognition through use of text localization, a system comprising:
- means for obtaining a model that is trained using machine learning as applied to a plurality of training images having text rendered using a corresponding font;
  
  means for predicting a bounding box, automatically and without user intervention, for text in an image received using the obtained model by forming a plurality of cropped portions of the image and processing each of the plurality of cropped portions of the image by the model independently, one to another, the text overlapping a first and second cropped portion of the plurality of cropped portions; and
  
  means for generating an indication of the predicted bounding box based on a result of the processing of each of the plurality of cropped portions of the image by calculating an average or a median of a top and bottom line of the predicted bounding box, the indication is usable to specify a region of the image that includes the text having a font to be recognized.
- Dependent claims (16, 17, 18, 19, 20):
  - 16. The system as described in claim 15, further comprising means for recognizing the font in the image.
  - 17. The system as described in claim 15, wherein the font to be recognized in the image is arbitrary such that the model is trainable without using the font.
  - 18. The system as described in claim 15, further comprising means for generating the predicted bounding box.
  - 19. The system as described in claim 15, wherein the plurality of training images are organized as tuples to minimize a hinge loss function.
  - 20. The system as described in claim 15, wherein at least one training image of the plurality of training images is sampled with a probability distribution that includes a normalization factor.
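
Claims 14 and 20 refer to sampling a training image with a probability distribution that includes a normalization factor, but the claims do not say how the unnormalized weights are defined. The sketch below assumes arbitrary non-negative weights and shows only the normalization step, where the factor Z is the sum of the weights so that the probabilities sum to one. All names are hypothetical.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def normalized_probabilities(weights: Sequence[float]) -> List[float]:
    """Divide each weight by the normalization factor Z = sum of all weights."""
    z = sum(weights)
    if z <= 0:
        raise ValueError("weights must sum to a positive value")
    return [w / z for w in weights]


def sample_training_image(images: Sequence[T],
                          weights: Sequence[float],
                          rng: random.Random) -> T:
    """Draw one training image according to the normalized distribution.

    How the weights are derived (for example from font similarity) is not
    stated in the claims; here they are simply supplied by the caller.
    """
    probabilities = normalized_probabilities(weights)
    return rng.choices(images, weights=probabilities, k=1)[0]
```

Passing a seeded `random.Random` instance keeps the draw reproducible, which is convenient when comparing training runs.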

Current Assignee: Adobe Inc.
Original Assignee: Adobe Inc.
Inventors: Wang, Zhaowen; Liu, Luoqi; Jin, Hailin
Primary Examiner(s): Park, Edward
Application Number: US15/962,514
Publication Number: US 20180239995A1
Time in Patent Office: 559 Days

CPC Class Codes

G06F 18/24137   Distances to cluster centroïds

G06N 3/045   Combinations of networks

G06T 3/40   Scaling of whole images or ...

G06T 7/60   Analysis of geometric attri...

G06V 10/82   using neural networks

G06V 30/10   Character recognition

G06V 30/18057   Integrating the filters int...

G06V 30/19173   Classification techniques

G06V 30/245   Font recognition
