Systems and Methods for Providing Convolutional Neural Network Based Image Synthesis Using Stable and Controllable Parametric Models, a Multiscale Synthesis Framework and Novel Network Architectures
Abstract
Systems and methods for providing convolutional neural network based image synthesis using localized loss functions are disclosed. A first image including desired content and a second image including a desired style are received. The images are analyzed to determine a local loss function. The first and second images are merged using the local loss function to generate an image that includes the desired content presented in the desired style. Similar processes can also be utilized to generate image hybrids and to perform on-model texture synthesis. In a number of embodiments, Condensed Feature Extraction Networks are also generated using a convolutional neural network previously trained to perform image classification, where the Condensed Feature Extraction Networks approximate intermediate neural activations of the convolutional neural network utilized during training.
30 Claims
-
1. A system for generating a synthesized image including desired content presented in a desired style comprising:
-
one or more processors;
memory readable by the one or more processors; and
instructions stored in the memory that when read by the one or more processors direct the one or more processors to:
receive a source content image that includes desired content for a synthesized image,
receive a source style image that includes a desired texture for the synthesized image,
determine a localized loss function for a pixel in at least one of the source content image and the source style image, and
generate the synthesized image by:
optimizing a value of a pixel in the synthesized image to a content loss function of a corresponding pixel in the content source image and a style loss function of a corresponding pixel in the source style image wherein at least one of the corresponding pixels is the pixel that has a determined localized loss function and one of the content loss function and the source loss function is the determined localized loss function.
-
-
2. The system of claim 1, wherein the localized loss function is represented by a Gram matrix.
-
3. The system of claim 1, wherein the localized loss function is represented by a covariance matrix.
-
4. The system of claim 1, wherein the localized loss function is determined using a Convolutional Neural Network (CNN).
-
5. The system of claim 4, wherein the optimizing is performed by back propagation through the CNN.
-
6. The system of claim 1, wherein the localized loss function is determined for a pixel in the source style image.
-
7. The system of claim 6, wherein the instructions to determine a localized loss function for a pixel in the source style image direct the one or more processors to:
-
receive a mask that identifies regions of the style source image;
determine a group of pixels including the pixel that are included in one of the plurality of regions identified by the mask;
determine a localized loss function for the one of the plurality of regions from the groups of pixels included in the one of the plurality of regions; and
associate the localized loss function with the pixel.
-
-
8. The system of claim 6, wherein the instructions to determine a localized loss function for a pixel in the source style image direct the one or more processors to:
-
group the pixels of the source style image into a plurality of cells determined by a grid applied to the source style image;
determine a localized loss function for the one of the plurality of cells that has a group of pixels that include the pixel; and
associate the determined localized loss function of the one of the plurality of cells with the pixel.
-
-
9. The system of claim 6, wherein the instructions to determine a localized loss function for a pixel in the source style image direct the one or more processors to:
-
determine a group of neighbor pixels for a pixel in the source content image;
determine a group of corresponding pixels in the source style image associated with the group of neighbor pixels in the source content image wherein each of the group of corresponding pixels corresponds to one of the group of neighbor pixels and includes the pixel; and
determine a local loss function for the group of corresponding pixels.
-
-
10. The system of claim 1, wherein the localized loss function is determined for a pixel in the source content image.
-
11. The system of claim 10, wherein the instructions to determine a localized loss function for a pixel in the source content image direct the one or more processors to:
-
receive a mask that identifies regions of the source content image;
determine a group of pixels including the pixel that are included in one of the plurality of regions identified by the mask;
determine a localized loss function for the one of the plurality of regions from the groups of pixels included in the one of the plurality of regions; and
associate the localized loss function with the pixel.
-
-
12. The system of claim 10, wherein the instructions to determine a localized loss function for a pixel in the source style image direct the one or more processors to:
-
group the pixels of the source content image into a plurality of cells determined by a grid applied to the source style image;
determine a localized loss function for the one of the plurality of cells that has a group of pixels that include the pixel; and
associate the determined localized loss function of the one of the plurality of cells with the pixel.
-
-
13. The system of claim 10, wherein the instructions to determine a localized loss function for a pixel in the source style image direct the one or more processors to:
-
determine a global content loss function for the source content image from the pixels of the source content image;
determine a weight for the pixel indicating a contribution to a structure in the source content image; and
apply the weight to the global content loss function to determine the localized loss function for the pixel.
-
-
14. The system of claim 13, wherein the weight is determined based upon a Laplacian pyramid of black and white versions of the source content image.
-
15. The system of claim 10, wherein a localized loss function is determined for a pixel in the source content image and a corresponding pixel in the source style image.
-
16. The system of claim 13, wherein the optimization uses the localized loss function for the pixel in the source content image as the content loss function and the localized loss function of the pixel in the source style image as the style loss function.
-
17. The system of claim 1, wherein pixels in the synthesized image begin as white noise.
-
18. The system of claim 1, wherein each pixel in the synthesized image begins with a value equal to a pixel value of a corresponding pixel in the source content image.
-
19. The system of claim 1, wherein the optimizing is performed to minimize a loss function that includes the content loss function, a style loss function, and a histogram loss function.
-
20. A method for performing style transfer in an image synthesis system where a synthesized image is generated with content from a source content image and texture from a source style image, the method comprising:
-
receiving a source content image that includes desired content for a synthesized image in the image synthesis system;
receiving a source style image that includes a desired texture for the synthesized image in the image synthesis system;
determining a localized loss function for a pixel in at least one of the source content image and the source style image using the image synthesis system; and
generating the synthesized image using the image synthesis system by optimizing a value of a pixel in the synthesized image to a content loss function of a corresponding pixel in the content source image and a style loss function of a corresponding pixel in the source style image wherein at least one of the corresponding pixels is the pixel that has a determined localized loss function and one of the content loss function and the source loss function is the determined localized loss function.
-
-
21. The method of claim 20, wherein the localized loss function is represented by one of a Gram matrix and a covariance matrix.
-
22. The method of claim 20, wherein the localized loss function is determined by the image synthesis system using a Convolutional Neural Network (CNN), wherein the optimizing is performed by the image synthesis system using back propagation through the CNN.
-
23. The method of claim 20, wherein the determining of a localized loss function for a pixel in at least one of the source content image and the source style image comprises:
-
receiving a mask that identifies regions of at least one of the source content image and the source style image using the image synthesis system;
determining a group of pixels including the pixel that are included in one of the plurality of regions identified by the mask using the image synthesis system;
determining a localized loss function for the one of the plurality of regions from the groups of pixels included in the one of the plurality of regions using the image synthesis system; and
associating the localized loss function with the pixel using the image synthesis system.
-
-
24. The method of claim 20, wherein the determining of a localized loss function for a pixel in at least one of the source style image and the source content image comprises:
-
grouping the pixels of at least one of the source content image and the source style image into a plurality of cells determined by a grid applied to the source style image using the image synthesis system;
determining a localized loss function for the one of the plurality of cells that has a group of pixels that include the pixel using the image synthesis system; and
associating the determined localized loss function of the one of the plurality of cells with the pixel using the image synthesis system.
-
-
25. The method of claim 20, wherein the determining of a localized loss function for a pixel in at least one of the source style image and the source content image comprises:
-
determining a group of neighbor pixels for a pixel in the source content image using the image synthesis system;
determining a group of corresponding pixels in the source style image associated with the group of neighbor pixels in the source content image wherein each of the group of corresponding pixels corresponds to one of the group of neighbor pixels and includes the pixel using the image synthesis system; and
determining a local loss function for the group of corresponding pixels using the image synthesis system.
-
-
26. The method of claim 20, wherein the determining of a localized loss function for a pixel in at least one of the source style image and the source content image comprises:
-
determining a global content loss function for the source content image from the pixels of the source content image using the image synthesis system;
determining a weight for the pixel indicating a contribution to a structure in the source content image using the image synthesis system; and
applying the weight to the global content loss function to determine the localized loss function for the pixel using the image synthesis system.
-
-
27. The method of claim 26, wherein the weight is determined based upon a Laplacian Pyramid of black and white versions of the source content image.
-
28. The method of claim 20, wherein a first localized loss function is determined for a pixel in the source content image and a second localized loss function is determined for a corresponding pixel in the source style image.
-
29. The method of claim 28, wherein the optimizing uses the first localized loss function for the pixel in the source content image as the content loss function and the second localized loss function of the pixel in the source style image as the style loss function.
-
30. The method of claim 20, wherein the optimizing is performed to minimize a loss function that includes at least one of the content loss function, a style loss function, and a histogram loss function.
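Illustrative note (not part of the claims): the mask-based and grid-based determinations recited above, with the localized loss represented by a Gram matrix, can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the feature maps are plain nested lists standing in for CNN activations, the names `gram_matrix`, `grid_mask`, `regional_gram_matrices` and `localized_style_loss` are hypothetical, and the synthesized image is assumed to carry a mask with the same region labels as the style image.

```python
from collections import defaultdict


def gram_matrix(vectors):
    """Return the C x C Gram matrix of C-dimensional feature vectors,
    averaged over the number of vectors."""
    count = len(vectors)
    channels = len(vectors[0])
    gram = [[0.0] * channels for _ in range(channels)]
    for vec in vectors:
        for i in range(channels):
            for j in range(channels):
                gram[i][j] += vec[i] * vec[j]
    return [[value / count for value in row] for row in gram]


def grid_mask(height, width, cell):
    """Label every pixel with the (row, column) index of its grid cell."""
    return [[(r // cell, c // cell) for c in range(width)]
            for r in range(height)]


def regional_gram_matrices(features, mask):
    """Group H x W x C features by mask label and return one Gram matrix
    per region (assumed stand-in for a localized loss target)."""
    groups = defaultdict(list)
    for feature_row, mask_row in zip(features, mask):
        for vec, label in zip(feature_row, mask_row):
            groups[label].append(vec)
    return {label: gram_matrix(vecs) for label, vecs in groups.items()}


def localized_style_loss(synth_features, synth_mask, style_grams):
    """Sum of squared differences between each synthesized region's Gram
    matrix and the style target carrying the same label."""
    synth_grams = regional_gram_matrices(synth_features, synth_mask)
    loss = 0.0
    for label, current in synth_grams.items():
        target = style_grams[label]
        for cur_row, tgt_row in zip(current, target):
            for cur, tgt in zip(cur_row, tgt_row):
                loss += (cur - tgt) ** 2
    return loss


# Toy 2 x 2 feature map with two channels and a two-region mask.
style_features = [[[1.0, 0.0], [1.0, 0.0]],
                  [[0.0, 2.0], [0.0, 2.0]]]
style_mask = [[0, 0], [1, 1]]
style_grams = regional_gram_matrices(style_features, style_mask)

# A candidate synthesized feature map whose second region does not match.
flat_features = [[[1.0, 0.0], [1.0, 0.0]],
                 [[1.0, 0.0], [1.0, 0.0]]]
mismatch = localized_style_loss(flat_features, style_mask, style_grams)
```

In this sketch, associating a localized loss with a pixel amounts to looking up `style_grams[mask[r][c]]` for the pixel at row `r`, column `c`. Passing the output of `grid_mask` in place of a supplied mask gives the grid-cell variant. A covariance matrix could be substituted by subtracting each region's mean feature vector before accumulating the products.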
Specification