Targeted style transfer using cycle consistent generative adversarial networks with quantitative analysis of different loss functions

Abstract

Targeted style transfer is a visual computing and deep learning problem in which a network is trained on a set of input images and a set of target images, learns the mapping between the two, and converts an input image to the style of the target images. A popular method for this task is the Cycle-GAN (Cycle-Consistent Generative Adversarial Network), which is commonly trained with Mean Squared Error, Binary Cross Entropy, and L1 loss functions. In this paper, we train a network for image-to-image translation, in which the style or content of an image is changed by the network, and we modify the loss functions of the Cycle-GAN. The most accurate translation can be learned from paired images, i.e., supervised learning, where the input and output images are both known and the network learns to minimize the gap between the expected and observed outputs. However, paired data of this kind is not readily available and is laborious to produce at scale. Cycle-GANs use unpaired data, and our work is dedicated to finding the best combination of loss functions to make them even more efficient.
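For readers unfamiliar with how training on unpaired data is possible, the cycle-consistency constraint can be written out. The block below restates the standard objective from the original Cycle-GAN formulation rather than the modified losses studied in this paper. The symbols are assumptions introduced here and do not appear in the abstract: G maps domain X to domain Y, F maps Y back to X, D_X and D_Y are the discriminators, and λ weights the cycle term.

```latex
\begin{align}
\mathcal{L}_{\text{cyc}}(G, F)
  &= \mathbb{E}_{x}\big[\lVert F(G(x)) - x \rVert_1\big]
   + \mathbb{E}_{y}\big[\lVert G(F(y)) - y \rVert_1\big] \\
\mathcal{L}(G, F, D_X, D_Y)
  &= \mathcal{L}_{\text{GAN}}(G, D_Y, X, Y)
   + \mathcal{L}_{\text{GAN}}(F, D_X, Y, X)
   + \lambda\, \mathcal{L}_{\text{cyc}}(G, F)
\end{align}
```

Because each image is compared only with its own reconstruction, no paired example from the other domain is needed. The L1 norm in the cycle term is the kind of regression loss that this work replaces with alternatives.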

A Cycle-GAN combines two kinds of networks, a discriminator and a generator for each data set, which compete against each other. The discriminator networks use classification loss functions to distinguish between images from the two data sets, while the generator networks use regression loss functions to compute the cycle loss and the identity loss. These loss functions play a vital role in style transfer because they determine how much the images are modified. We experimented with Mean Squared Error loss, Binary Cross Entropy loss, Hinge loss, Huber loss, Log loss, Square loss, and L1 loss to find the best combination. We discuss the strengths and limitations of the loss functions already in use and propose different combinations of loss functions for better accuracy. A separate classifier was trained extensively for performance evaluation; according to it, the best combination is Binary Cross Entropy loss as the classification loss function and Huber loss as the regression loss function.
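The pairing of a classification loss with a regression loss described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names, the one-directional `generator_objective`, the flat lists standing in for images, and the weights `lambda_cyc=10.0` and `lambda_id=5.0` are all assumptions made for illustration.

```python
import math

def l1_loss(pred, target):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def mse_loss(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def huber_loss(pred, target, delta=1.0):
    """Quadratic for |error| <= delta, linear beyond it."""
    total = 0.0
    for p, t in zip(pred, target):
        err = abs(p - t)
        if err <= delta:
            total += 0.5 * err ** 2
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(pred)

def bce_loss(prob, label, eps=1e-7):
    """Binary cross entropy on discriminator probabilities."""
    total = 0.0
    for p, y in zip(prob, label):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(prob)

def generator_objective(d_fake_prob, x, x_cycled, y, y_identity,
                        classification_loss=bce_loss,
                        regression_loss=huber_loss,
                        lambda_cyc=10.0, lambda_id=5.0):
    """One direction of a Cycle-GAN generator loss (hypothetical helper).

    d_fake_prob: discriminator outputs for translated images
    x, x_cycled: source pixels and their reconstruction after a round trip
    y, y_identity: target pixels and the generator output when fed y
    """
    # The generator wants the discriminator to label its output as real (1).
    adversarial = classification_loss(d_fake_prob, [1.0] * len(d_fake_prob))
    cycle = regression_loss(x_cycled, x)
    identity = regression_loss(y_identity, y)
    return adversarial + lambda_cyc * cycle + lambda_id * identity
```

Passing `regression_loss=l1_loss` or `regression_loss=mse_loss` swaps the regression term while leaving the rest of the objective unchanged, which is the kind of comparison the experiments describe.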

Keywords

Cycle GANs, image-to-image translation, deep learning, loss functions, neural networks, visual computing
