Utilizing deep learning for rating aesthetics of digital images

US 10,002,415 B2
Filed: 04/12/2016
Issued: 06/19/2018
Est. Priority Date: 04/12/2016
Status: Active Grant


Abstract

Systems and methods are disclosed for estimating aesthetic quality of digital images using deep learning. In particular, the disclosed systems and methods describe training a neural network to generate an aesthetic quality score for digital images. The neural network includes a training structure that compares relative rankings of pairs of training images to accurately predict a relative ranking of a digital image. Additionally, in training the neural network, an image rating system can utilize content-aware and user-aware sampling techniques to identify pairs of training images that have similar content and/or that have been rated by the same or different users. Using content-aware and user-aware sampling techniques, the neural network can be trained to accurately predict aesthetic quality ratings that reflect subjective opinions of most users as well as provide aesthetic scores for digital images that represent the wide spectrum of aesthetic preferences of various users.

29 Citations

20 Claims

1. A computer-implemented method of estimating aesthetic quality of digital images using deep learning, the method comprising:
- receiving a plurality of training images with associated user provided ratings;
- sampling the plurality of training images to identify pairs of training images; and
- training a neural network to output aesthetic quality scores for identified pairs of training images that, for a given pair of training images, minimizes a difference between predicted user ratings and average user ratings of the user provided ratings for the respective training images in the given pair of training images while maintaining a relative difference between the associated user provided ratings of the training images in the given pair of training images.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12):
  - 2. The method as recited in claim 1, wherein the training the neural network comprises constructing a training structure including a pairwise loss model and a regression loss model, wherein:
    - the pairwise loss model compares the relative difference between the associated user provided ratings for the identified pairs of training images; and
    - the regression loss model minimizes the difference between the predicted user ratings and the average user ratings for the plurality of training images.
  - 3. The method as recited in claim 2, wherein minimizing the difference between predicted user ratings and the average user ratings for the plurality of training images comprises minimizing a Euclidean loss between an average user rating of the user provided ratings for the plurality of training images and predicted user ratings for the plurality of training images.
  - 4. The method as recited in claim 2, wherein generating an aesthetic quality score comprises summing outputs of the regression loss model and the pairwise loss model.
  - 5. The method as recited in claim 1, wherein sampling the plurality of training images to identify pairs of training images comprises identifying the pairs of training images from the plurality of training images based on an identity of one or more users that rated each training image from the plurality of training images.
  - 6. The method as recited in claim 5, wherein sampling the plurality of training images to identify pairs of training images comprises identifying pairs of training images from the plurality of training images that have been rated by a common user.
  - 7. The method as recited in claim 1, wherein sampling the plurality of training images to identify pairs of training images comprises identifying pairs of training images from the plurality of training images having a predetermined difference between user ratings.
  - 8. The method as recited in claim 7, wherein the predetermined difference between user ratings differs based on whether images of the pairs of training images are associated with user ratings from a common user or different users.
  - 9. The method as recited in claim 1, wherein sampling the plurality of training images to identify pairs of training images comprises identifying pairs of training images having a common type of content.
  - 10. The method as recited in claim 1, wherein sampling the plurality of training images to identify pairs of training images comprises identifying pairs of training images having a predetermined difference between user ratings based on whether the images of the pairs of training images have a common type of content or different type of content.
  - 11. The method as recited in claim 1, wherein sampling the plurality of training images to identify pairs of training images comprises identifying pairs of training images having a threshold number of common attributes that have been identified by users that rated the plurality of training images.
  - 12. The method as recited in claim 1, further comprising:
    - utilizing the trained neural network to generate aesthetic quality scores for a collection of input digital images; and
    - categorizing the collection of input digital images based on the generated aesthetic quality scores.
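
Claims 1 through 4 describe a training objective that combines a regression term with a pairwise term. The following is a minimal Python sketch of one way such an objective could look, not the patented implementation. It assumes a hinge-style margin ranking loss for the pairwise term; the names `regression_loss`, `pairwise_loss`, and `combined_loss`, and the parameters `margin` and `rank_weight`, are hypothetical and do not appear in the claims.

```python
from typing import Sequence, Tuple


def regression_loss(pred: Sequence[float], avg_rating: Sequence[float]) -> float:
    """Half mean squared (Euclidean) error between predictions and average ratings."""
    n = len(pred)
    return sum((p - y) ** 2 for p, y in zip(pred, avg_rating)) / (2 * n)


def pairwise_loss(
    pred: Sequence[float],
    avg_rating: Sequence[float],
    pairs: Sequence[Tuple[int, int]],
    margin: float = 0.1,  # assumed value, not taken from the claims
) -> float:
    """Hinge-style ranking loss (an assumption): penalize a pair when the
    predicted ordering disagrees with the rated ordering by less than `margin`."""
    if not pairs:
        return 0.0
    total = 0.0
    for i, j in pairs:
        # +1 if image i is rated at least as high as image j, otherwise -1.
        sign = 1.0 if avg_rating[i] >= avg_rating[j] else -1.0
        total += max(0.0, margin - sign * (pred[i] - pred[j]))
    return total / (2 * len(pairs))


def combined_loss(
    pred: Sequence[float],
    avg_rating: Sequence[float],
    pairs: Sequence[Tuple[int, int]],
    margin: float = 0.1,
    rank_weight: float = 1.0,  # assumed weighting between the two terms
) -> float:
    """Sum of the regression term and the weighted pairwise term."""
    return regression_loss(pred, avg_rating) + rank_weight * pairwise_loss(
        pred, avg_rating, pairs, margin
    )
```

In this sketch, predictions that preserve the rated ordering of a pair by at least `margin` incur only the regression term, while predictions that invert the ordering are penalized by both terms.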

13. A non-transitory computer readable storage medium storing instructions thereon that, when executed by at least one processor, cause a computer system to:
- receive a digital image; and
- generate an aesthetic quality score for the digital image and an attribute quality score for each of a plurality of attributes of the digital image using a neural network having a training structure that jointly learns low level parameters for pairs of training images of a plurality of training images and includes an attribute model for each of the plurality of attributes that utilizes the jointly learned low level parameters and outputs an attribute quality score for a given attribute.
- Dependent claims (14, 15, 16, 17, 18):
  - 14. The non-transitory computer readable medium of claim 13, wherein the training structure further comprises a regression loss model that minimizes a difference between predicted user ratings and user provided ratings for the plurality of training images.
  - 15. The non-transitory computer readable medium of claim 14, wherein minimizing the difference between predicted user ratings and user provided ratings for the plurality of training images comprises minimizing a Euclidean loss between a predicted overall quality rating and an average rating of the user provided ratings for each of the plurality of training images.
  - 16. The non-transitory computer readable medium of claim 13, wherein the attribute model for each of the plurality of attributes minimizes a difference between a predicted rating for the given attribute and user provided ratings for the given attribute.
  - 17. The non-transitory computer readable medium of claim 13, wherein the training structure further comprises a pairwise loss model that:
    - compares a relative difference between user provided ratings for selected pairs of training images from the plurality of training images; and
    - maintains the relative difference between the user provided ratings for the selected pairs of training images from the plurality of training images.
  - 18. The non-transitory computer readable medium of claim 13, wherein the plurality of attributes comprise two or more of:
    - interesting content, object emphasis, lighting, color harmony, vivid color, depth of an image field, motion blur, rule of thirds, balancing element, repetition, or symmetry.
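
Claim 13 describes a network in which per-attribute models reuse jointly learned low-level parameters. The sketch below illustrates that shape only: a shared feature layer feeds one overall head and one head per attribute. It is a forward pass with random, untrained weights, assuming a single linear layer with ReLU as the shared trunk; the class `MultiHeadScorer`, its methods, and the dimensions are hypothetical. The attribute names come from claim 18.

```python
import random
from typing import Dict, List, Sequence, Tuple

ATTRIBUTES = [
    "interesting content", "object emphasis", "lighting", "color harmony",
    "vivid color", "depth of an image field", "motion blur", "rule of thirds",
    "balancing element", "repetition", "symmetry",
]


def _random_matrix(rows: int, cols: int, rng: random.Random) -> List[List[float]]:
    """Small random weights; a real model would learn these during training."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class MultiHeadScorer:
    """Shared low-level features feeding an overall head and per-attribute heads."""

    def __init__(self, input_dim: int, feature_dim: int = 8, seed: int = 0) -> None:
        rng = random.Random(seed)
        # Shared trunk: stands in for the jointly learned low-level parameters.
        self.trunk = _random_matrix(feature_dim, input_dim, rng)
        self.overall_head = _random_matrix(1, feature_dim, rng)[0]
        # One linear head per attribute, all reading the same shared features.
        self.attribute_heads = {
            name: _random_matrix(1, feature_dim, rng)[0] for name in ATTRIBUTES
        }

    def shared_features(self, x: Sequence[float]) -> List[float]:
        """Linear layer followed by ReLU."""
        return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in self.trunk]

    def score(self, x: Sequence[float]) -> Tuple[float, Dict[str, float]]:
        """Return (overall score, {attribute name: attribute score})."""
        feats = self.shared_features(x)

        def dot(weights: Sequence[float]) -> float:
            return sum(w * f for w, f in zip(weights, feats))

        overall = dot(self.overall_head)
        return overall, {name: dot(w) for name, w in self.attribute_heads.items()}
```

For example, `MultiHeadScorer(input_dim=4).score([0.1, 0.2, 0.3, 0.4])` returns one overall score and a dictionary with one score per attribute, all computed from the same shared feature vector.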

19. A system for analyzing digital images to estimate aesthetic quality of the digital images using deep learning, the system comprising:
- at least one processor;
- a non-transitory storage medium comprising instructions that, when executed by the at least one processor, cause the system to:
  - receive a plurality of training images with user provided ratings;
  - sample the plurality of training images to identify pairs of images that are rated by one or more common users, pairs of images having a common type of content, or pairs of images that are rated by different users; and
  - train a neural network to output aesthetic quality scores for identified pairs of training images that, for a given pair of training images, minimizes a difference between predicted user ratings and average user ratings of associated user provided ratings for the respective training images in the given pair of training images while maintaining a relative difference between the associated user provided ratings of the given pair of training images.
- Dependent claims (20):
  - 20. The system as recited in claim 19, wherein the instructions, when executed by the at least one processor, cause the system to sample the plurality of training images to identify the pairs of training images having a common type of content that have been rated by one or more common users.
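
The user-aware and content-aware sampling recited in claims 5 through 10, 19, and 20 can be illustrated with a short Python sketch. It is one possible reading, not the patented procedure: it treats the "predetermined difference" as a minimum rating gap, uses a different gap for pairs that share a rater than for pairs that do not, and optionally restricts pairs to a common content type. The `TrainingImage` layout, the function `sample_pairs`, and the threshold values are all hypothetical.

```python
from itertools import combinations
from typing import Dict, List, NamedTuple, Sequence, Tuple


class TrainingImage(NamedTuple):
    image_id: str
    content_type: str
    ratings: Dict[str, float]  # user_id -> rating given by that user


def average_rating(image: TrainingImage) -> float:
    return sum(image.ratings.values()) / len(image.ratings)


def sample_pairs(
    images: Sequence[TrainingImage],
    same_user_gap: float = 1.0,   # assumed minimum gap when a rater is shared
    diff_user_gap: float = 2.0,   # assumed minimum gap when no rater is shared
    require_same_content: bool = False,
) -> List[Tuple[str, str]]:
    """Return image-id pairs whose rating gap meets the applicable threshold."""
    pairs = []
    for a, b in combinations(images, 2):
        # Content-aware filter: optionally keep only pairs with a common content type.
        if require_same_content and a.content_type != b.content_type:
            continue
        common_users = a.ratings.keys() & b.ratings.keys()
        if common_users:
            # User-aware case: compare ratings given by the same rater(s).
            gap = max(abs(a.ratings[u] - b.ratings[u]) for u in common_users)
            threshold = same_user_gap
        else:
            # Different raters: fall back to the gap between average ratings.
            gap = abs(average_rating(a) - average_rating(b))
            threshold = diff_user_gap
        if gap >= threshold:
            pairs.append((a.image_id, b.image_id))
    return pairs
```

Using a smaller threshold for pairs with a shared rater reflects the idea that two ratings from the same person are directly comparable, whereas ratings from different people may need a larger gap before the ordering is trusted; the specific values here are placeholders.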

Current Assignee: Adobe Inc.
Original Assignee: Adobe Systems Incorporated (Adobe Inc.)
Inventors: Shen, Xiaohui; Lin, Zhe; Kong, Shu; Mech, Radomir
Primary Examiner(s): Tsai, Tsung-Yin
Application Number: US15/097,113
Publication Number: US 20170294010A1
Time in Patent Office: 798 Days
CPC Class Codes:
- G06F 18/2113: by ranking or filtering the...
- G06F 18/214: Generating training pattern...
- G06N 3/045: Combinations of networks
- G06N 3/08: Learning methods
- G06T 2207/20081: Training; Learning
- G06T 2207/20084: Artificial neural networks ...
- G06T 2207/30168: Image quality inspection
- G06T 7/0002: Inspection of images, e.g. ...
- G06V 10/764: using classification, e.g. ...
- G06V 10/82: using neural networks
