{"id":138,"date":"2019-02-21T14:42:41","date_gmt":"2019-02-21T06:42:41","guid":{"rendered":"https:\/\/cchsu.info\/?p=138"},"modified":"2024-03-04T09:06:03","modified_gmt":"2024-03-04T01:06:03","slug":"detecting-generated-image-based-on-coupled-network-with-two-step-pairwise-learning","status":"publish","type":"post","link":"https:\/\/cchsu.info\/wordpress\/2019\/02\/21\/detecting-generated-image-based-on-coupled-network-with-two-step-pairwise-learning\/","title":{"rendered":"Fake Face Image Detection Project"},"content":{"rendered":"<div id=\"pl-138\"  class=\"panel-layout\" ><div id=\"pg-138-0\"  class=\"panel-grid panel-no-style\" ><div id=\"pgc-138-0-0\"  class=\"panel-grid-cell\" ><div id=\"panel-138-0-0-0\" class=\"so-panel widget widget_siteorigin-panels-builder panel-first-child panel-last-child\" data-index=\"0\" ><div id=\"pl-w5feb526b8aaad\"  class=\"panel-layout\" ><div id=\"pg-w5feb526b8aaad-0\"  class=\"panel-grid panel-no-style\" ><div id=\"pgc-w5feb526b8aaad-0-0\"  class=\"panel-grid-cell\" ><div id=\"panel-w5feb526b8aaad-0-0-0\" class=\"so-panel widget widget_media_image panel-first-child panel-last-child\" data-index=\"0\" ><img loading=\"lazy\" decoding=\"async\" width=\"300\" height=\"165\" src=\"https:\/\/cchsu.info\/wordpress\/wp-content\/uploads\/2020\/12\/NPUSTLogo.svg_-300x165.png\" class=\"image wp-image-641  attachment-medium size-medium\" alt=\"\" style=\"max-width: 100%; height: auto;\" srcset=\"https:\/\/cchsu.info\/wordpress\/wp-content\/uploads\/2020\/12\/NPUSTLogo.svg_-300x165.png 300w, https:\/\/cchsu.info\/wordpress\/wp-content\/uploads\/2020\/12\/NPUSTLogo.svg_-1024x564.png 1024w, https:\/\/cchsu.info\/wordpress\/wp-content\/uploads\/2020\/12\/NPUSTLogo.svg_-768x423.png 768w, https:\/\/cchsu.info\/wordpress\/wp-content\/uploads\/2020\/12\/NPUSTLogo.svg_.png 1200w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/div><\/div><div id=\"pgc-w5feb526b8aaad-0-1\"  class=\"panel-grid-cell\" ><div id=\"panel-w5feb526b8aaad-0-1-0\" 
class=\"so-panel widget widget_text panel-first-child panel-last-child\" data-index=\"1\" >\t\t\t<div class=\"textwidget\"><p>ACVLAB, Department of Management Information Systems,<\/p>\n<p>National Pingtung University of Science and Technology<\/p>\n<p>Chih-Chung Hsu and Yi-Xiu Zhuang<\/p>\n<p>[<a href=\"https:\/\/github.com\/jesse1029\/Fake-Face-Images-Detection-Tensorflow\">Github<\/a>]<\/p>\n<\/div>\n\t\t<\/div><\/div><\/div><div id=\"pg-w5feb526b8aaad-1\"  class=\"panel-grid panel-no-style\" ><div id=\"pgc-w5feb526b8aaad-1-0\"  class=\"panel-grid-cell\" ><div id=\"panel-w5feb526b8aaad-1-0-0\" class=\"so-panel widget widget_sow-editor panel-first-child panel-last-child\" data-index=\"2\" ><div\n\t\t\t\n\t\t\tclass=\"so-widget-sow-editor so-widget-sow-editor-base\"\n\t\t\t\n\t\t>\n<div class=\"siteorigin-widget-tinymce textwidget\">\n\t<h3><strong>Abstract<\/strong><\/h3>\n<p>With the rapid growth of generative adversarial networks (GANs), a photo-realistic image can now be easily generated from a low-dimensional random vector. However, such generated images can be used to synthesize the faces of non-existent persons and to spread radical content, with potentially harmful effects on society. Because many techniques for producing photo-realistic facial images with different GANs are already available, collecting training images from all possible generative models is difficult; hence, a purely supervised learning-based approach cannot effectively detect a fake image generated by an excluded generative model. To overcome this shortcoming, we propose a two-step pairwise learning approach that learns the common fake features over training images generated by different generative models. First, a triplet loss is used to model the relation between fake and real images and to learn discriminative features for determining whether an image is real or fake. 
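As a concrete illustration of this first step, the triplet loss over an anchor fake image, a positive (another fake), and a negative (a real image) can be sketched as below. This is a minimal NumPy sketch, not the project's TensorFlow code; the margin value and toy 4-D embeddings are assumptions for illustration only.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss: pull the anchor toward the positive (same class,
    e.g. both fake) and push it away from the negative (real) by at
    least `margin` in squared Euclidean distance."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(d_pos - d_neg + margin, 0.0)

# Toy 4-D embeddings: the anchor lies close to the positive and far
# from the negative, so the margin constraint is already satisfied.
a = np.array([1.0, 0.0, 0.0, 0.0])
p = np.array([0.9, 0.1, 0.0, 0.0])
n = np.array([-1.0, 0.0, 0.5, 0.0])

loss = triplet_loss(a, p, n)  # 0.0: the margin is already satisfied
```

Minimizing this loss over many fake/fake/real triplets drawn from several GANs is what drives the embeddings of different GANs' outputs toward a shared, common fake feature.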
Then, we propose a novel coupled network that accurately captures both the local and global features of fake and real images. The experimental results demonstrate that the proposed method outperforms the baseline supervised learning methods for fake facial image detection.<\/p>\n<hr \/>\n<p>Deep learning can now easily generate photo-realistic photos, and their misuse can cause serious problems, such as fabricating fake photo content for FB or other social networking sites. Although flaws may still be visible on close inspection, people usually do not examine these photos carefully. This study fights fire with fire: deep learning is used to identify these generated forged images, addressing the coming problem of large numbers of forged face images circulating on social networks.<\/p>\n<\/div>\n<\/div><\/div><\/div><\/div><div id=\"pg-w5feb526b8aaad-2\"  class=\"panel-grid panel-no-style\" ><div id=\"pgc-w5feb526b8aaad-2-0\"  class=\"panel-grid-cell\" ><div id=\"panel-w5feb526b8aaad-2-0-0\" class=\"so-panel widget widget_sow-editor panel-first-child panel-last-child\" data-index=\"3\" ><div\n\t\t\t\n\t\t\tclass=\"so-widget-sow-editor so-widget-sow-editor-base\"\n\t\t\t\n\t\t>\n<div class=\"siteorigin-widget-tinymce textwidget\">\n\t<h2><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-147 size-large\" src=\"https:\/\/cchsu.info\/wordpress\/wp-content\/uploads\/2019\/02\/framework-1024x468.png\" alt=\"\" width=\"1024\" height=\"468\" \/><\/h2>\n<h2><strong>Core Techniques<\/strong><\/h2>\n<ul>\n<li>\n<h3>Pairwise Learning<\/h3>\n<\/li>\n<\/ul>\n<p>Since there are many 
GANs, it is hard to collect training images from all of them. Instead of learning the fake features of each individual GAN, we learn the common fake features shared across the collected training images.<\/p>\n<p><strong>*Use a contrastive loss or triplet loss to learn the common features of images synthesized by several GANs!<\/strong><\/p>\n<p>GANs can generate highly realistic images, but collecting the forged images of every GAN is too difficult, and GANs keep being updated, so doing so is impractical. By learning the common fake features of several GANs, we extrapolate to the forgery features that future GANs will produce, thereby improving the robustness of the fake image detector.<\/p>\n<ul>\n<li>\n<h3>Two-Step Learning<\/h3>\n<\/li>\n<\/ul>\n<p>First, we learn the common fake features via the proposed pairwise learning.<\/p>\n<p>Second, we attach a small neural network as the classifier so that both the classifier and the common fake features can be updated.<\/p>\n<p>Two-step learning: first learn the common forgery features, then attach a small network trained as the classifier. Experiments show this works better than joint training!<\/p>\n<ul>\n<li>\n<h3>Coupled Network<\/h3>\n<\/li>\n<\/ul>\n<p>We adopt a two-stream network architecture consisting of CNNs with 3x3 and 5x5 kernels to capture the fake features both locally and globally.<\/p>\n<p>A two-stream CNN (built from 3x3 and 5x5 convolutional kernels, respectively) learns both the local and global forgery features.<\/p>\n<\/div>\n<\/div><\/div><\/div><\/div><div id=\"pg-w5feb526b8aaad-3\"  class=\"panel-grid 
panel-no-style\" ><div id=\"pgc-w5feb526b8aaad-3-0\"  class=\"panel-grid-cell\" ><div id=\"panel-w5feb526b8aaad-3-0-0\" class=\"so-panel widget widget_sow-editor panel-first-child\" data-index=\"4\" ><div\n\t\t\t\n\t\t\tclass=\"so-widget-sow-editor so-widget-sow-editor-base\"\n\t\t\t\n\t\t>\n<div class=\"siteorigin-widget-tinymce textwidget\">\n\t<h2>Experimental Results<\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-140 size-large\" src=\"https:\/\/cchsu.info\/wordpress\/wp-content\/uploads\/2019\/02\/experiments-1024x267.png\" alt=\"\" width=\"1024\" height=\"267\" \/><\/p>\n<\/div>\n<\/div><\/div><div id=\"panel-w5feb526b8aaad-3-0-1\" class=\"so-panel widget widget_sow-editor panel-last-child\" data-index=\"5\" ><div\n\t\t\t\n\t\t\tclass=\"so-widget-sow-editor so-widget-sow-editor-base\"\n\t\t\t\n\t\t>\n<div class=\"siteorigin-widget-tinymce textwidget\">\n\t<p>To verify the generalization ability of the proposed method, we remove the images generated by one of the GANs from the training set. For example, if we remove the images generated by PGGAN from the training set, the trained fake face detector never learns the fake features of PGGAN. Afterward, the learned fake face detector is used to classify a test set consisting of real images and fake images generated by PGGAN. Table I shows the performance comparison between the proposed fake face detector, other baseline methods, and the methods in [7][8][15] in terms of accuracy, precision, and recall. The proposed method significantly outperforms the other state-of-the-art methods because the common fake features are well captured by our CFF and CDNN architecture. 
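The coupled two-stream idea behind the CDNN can be sketched as follows. This is only a hedged NumPy illustration, assuming single-channel inputs, random kernels, and global-average-pooled fusion; the actual architecture lives in the project's TensorFlow code on Github.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 'valid'-mode 2-D sliding-window correlation."""
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def coupled_features(image, k3, k5):
    """Two-stream feature extractor: a 3x3 stream for local detail and a
    5x5 stream for wider context; each stream is ReLU-activated and
    global-average-pooled, then the two are concatenated."""
    local_map = np.maximum(conv2d_valid(image, k3), 0.0)   # ReLU, local stream
    global_map = np.maximum(conv2d_valid(image, k5), 0.0)  # ReLU, global stream
    return np.array([local_map.mean(), global_map.mean()])

rng = np.random.default_rng(0)
img = rng.standard_normal((32, 32))             # stand-in for a face image
feat = coupled_features(img,
                        rng.standard_normal((3, 3)),
                        rng.standard_normal((5, 5)))
```

In the real network each stream is a deep stack of such layers and the fused features feed the pairwise-learned common-fake-feature embedding, but the local/global split with different kernel sizes is the same design choice.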
It is also verified that the proposed method generalizes better and is more effective than the others.<\/p>\n<p>During training, we exclude one of the six collected GANs and then test the trained model on that excluded GAN, thereby checking whether the method can generalize to other, future GANs. The experimental results show that the proposed method achieves the best performance (precision \/ recall rate).<\/p>\n<h2>Visualized Results<\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-142\" src=\"https:\/\/cchsu.info\/wordpress\/wp-content\/uploads\/2019\/02\/visresult-1024x285.png\" alt=\"\" width=\"751\" height=\"209\" \/><\/p>\n<p>Moreover, the proposed model can visualize the fake regions of a generated image by extracting the last convolutional layer and mapping its responses back to the image domain. The figure shows the visualized feature maps for fake-region localization: (a) and (c) are face images generated by PGGAN [1] and DCGAN [11] respectively. 
(b) and (d) are the localized fake regions for the fake faces in (a) and (c).<\/p>\n<p>Besides detecting whether an image itself is forged or real, the proposed model can also localize the face-forgery regions through feature visualization.<\/p>\n<h2>References<\/h2>\n<ul>\n<li>Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen, \u201cProgressive growing of GANs for improved quality, stability, and variation,\u201d <em>arXiv preprint<\/em>, arXiv:1710.10196, 2017.<\/li>\n<li>Andrew Brock, Jeff Donahue, and Karen Simonyan, \u201cLarge scale GAN training for high fidelity natural image synthesis,\u201d <em>arXiv preprint<\/em>, arXiv:1809.11096, 2018.<\/li>\n<li>Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros, \u201cUnpaired image-to-image translation using cycle-consistent adversarial networks,\u201d <em>arXiv preprint<\/em>, 2017.<\/li>\n<li>\u201cAI can now create fake porn, making revenge porn even more complicated,\u201d http:\/\/theconversation.com\/aican-now-create-fake-porn-making-revenge-porn-evenmore-complicated-92267, 2018.<\/li>\n<li>Hany Farid, \u201cImage forgery detection,\u201d <em>IEEE Signal Processing Magazine<\/em>, vol. 26, no. 2, pp. 16\u201325, 2009.<\/li>\n<li>Chih-Chung Hsu, Tzu-Yi Hung, Chia-Wen Lin, and Chiou-Ting Hsu, \u201cVideo forgery detection using correlation of noise residue,\u201d in Proc. of <em>IEEE Workshop on Multimedia Signal Processing<\/em>, pp. 170\u2013174, 2008.<\/li>\n<li>Huaxiao Mo, Bolin Chen, and Weiqi Luo, \u201cFake faces identification via convolutional neural network,\u201d in <em>Proceedings of the ACM Workshop on Information Hiding and Multimedia Security<\/em>, pp. 43\u201347, 2018.<\/li>\n<li>F. Marra, D. Gragnaniello, D. Cozzolino, and L. Verdoliva, \u201cDetection of GAN-generated fake images over social networks,\u201d in Proc. 
of <em>IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)<\/em>, pp. 384\u2013389, April 2018.<\/li>\n<li>Fran\u00e7ois Chollet, \u201cXception: Deep learning with depthwise separable convolutions,\u201d <em>arXiv preprint<\/em>, arXiv:1610.02357, 2017.<\/li>\n<li>Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio, \u201cGenerative adversarial nets,\u201d in Proc. of <em>Advances in Neural Information Processing Systems<\/em>, pp. 2672\u20132680, 2014.<\/li>\n<li>Alec Radford, Luke Metz, and Soumith Chintala, \u201cUnsupervised representation learning with deep convolutional generative adversarial networks,\u201d <em>arXiv preprint<\/em>, arXiv:1511.06434, 2015.<\/li>\n<li>Martin Arjovsky, Soumith Chintala, and L\u00e9on Bottou, \u201cWasserstein GAN,\u201d <em>arXiv preprint<\/em>, arXiv:1701.07875, 2017.<\/li>\n<li>Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron C. Courville, \u201cImproved training of Wasserstein GANs,\u201d in Proc. of <em>Advances in Neural Information Processing Systems<\/em>, pp. 5767\u20135777, 2017.<\/li>\n<li>Xudong Mao, Qing Li, Haoran Xie, Raymond Y. K. Lau, Zhen Wang, and Stephen Paul Smolley, \u201cLeast squares generative adversarial networks,\u201d in Proc. of <em>IEEE International Conference on Computer Vision (ICCV)<\/em>, pp. 2813\u20132821, 2017.<\/li>\n<li>Chia-Yen Lee, Chih-Chung Hsu, and Yi-Xiu Zhuang, \u201cLearning to detect fake face images in the wild,\u201d <em>arXiv preprint<\/em>, arXiv:1809.08754, 2018.<\/li>\n<li>S. Chopra, R. Hadsell, and Y. LeCun, \u201cLearning a similarity metric discriminatively, with application to face verification,\u201d in Proc. of the <em>IEEE Conference on Computer Vision and Pattern Recognition<\/em> <em>(CVPR)<\/em>, vol. 1, pp. 539\u2013546, 2005.<\/li>\n<li>E. Hoffer and N. Ailon, \u201cDeep metric learning using triplet network,\u201d in Proc. 
of <em>International Workshop on Similarity-Based Pattern Recognition<\/em>, pp. 84\u201392, 2015.<\/li>\n<li>Yann LeCun, Bernhard E. Boser, John S. Denker, Donnie Henderson, Richard E. Howard, Wayne E. Hubbard, and Lawrence D. Jackel, \u201cHandwritten digit recognition with a back-propagation network,\u201d in Proc. of <em>Advances in Neural Information Processing Systems<\/em>, pp. 396\u2013404, 1990.<\/li>\n<li>Z. Liu, P. Luo, X. Wang, and X. Tang, \u201cDeep learning face attributes in the wild,\u201d in Proc. of <em>International Conference on Computer Vision (ICCV)<\/em>, 2015.<\/li>\n<li>Ilya Sutskever, James Martens, George Dahl, and Geoffrey Hinton, \u201cOn the importance of initialization and momentum in deep learning,\u201d in Proc. of <em>International Conference on Machine Learning<\/em>, pp. 1139\u20131147, 2013.<\/li>\n<li>M. Oquab, L. Bottou, I. Laptev, and J. Sivic, \u201cIs object localization for free? Weakly-supervised learning with convolutional neural networks,\u201d in Proc. of the <em>IEEE Conference on Computer Vision and Pattern Recognition<\/em> (CVPR), pp. 685\u2013694, 2015.<\/li>\n<li>Kaiming He, et al., \u201cDeep residual learning for image recognition,\u201d in Proc. of <em>the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)<\/em>, 2016.<\/li>\n<li>Gao Huang, et al., \u201cDensely connected convolutional networks,\u201d in Proc. 
of <em>the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)<\/em>, 2017.<\/li>\n<\/ul>\n<\/div>\n<\/div><\/div><\/div><\/div><\/div><\/div><\/div><\/div><\/div>","protected":false},"excerpt":{"rendered":"<p>ACVLAB, Department of Management Information Systems, N [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-138","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/cchsu.info\/wordpress\/wp-json\/wp\/v2\/posts\/138","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cchsu.info\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cchsu.info\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cchsu.info\/wordpress\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/cchsu.info\/wordpress\/wp-json\/wp\/v2\/comments?post=138"}],"version-history":[{"count":2,"href":"https:\/\/cchsu.info\/wordpress\/wp-json\/wp\/v2\/posts\/138\/revisions"}],"predecessor-version":[{"id":664,"href":"https:\/\/cchsu.info\/wordpress\/wp-json\/wp\/v2\/posts\/138\/revisions\/664"}],"wp:attachment":[{"href":"https:\/\/cchsu.info\/wordpress\/wp-json\/wp\/v2\/media?parent=138"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cchsu.info\/wordpress\/wp-json\/wp\/v2\/categories?post=138"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cchsu.info\/wordpress\/wp-json\/wp\/v2\/tags?post=138"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}