Adversarial Attacks and Defenses in Deep Neural Networks

Authors

  • Sunil Anasuri, Independent Researcher, USA

DOI:

https://doi.org/10.63282/3050-9262.IJAIDSML-V3I4P109

Keywords:

Adversarial Attacks, Deep Neural Networks, FGSM, PGD, Adversarial Training, Defensive Distillation, Cybersecurity, AI Robustness, Gradient Masking, Transferability

Abstract

Deep Neural Networks (DNNs) have transformed fields such as computer vision, speech recognition, and natural language processing. Nevertheless, they are notoriously susceptible to adversarial attacks: maliciously perturbed inputs that are nearly imperceptible to a human observer yet cause DNNs to produce incorrect predictions. This weakness is of major concern in safety-critical applications such as autonomous driving, medical diagnosis, and biometric verification. This paper surveys the adversarial attack landscape and the corresponding defense mechanisms. The overview of adversarial attacks covers white-box, black-box, and transfer-based threat models, together with representative attack algorithms including the Fast Gradient Sign Method (FGSM), Projected Gradient Descent (PGD), Carlini-Wagner (CW), and DeepFool. We then examine the principal defenses: adversarial training, defensive distillation, input preprocessing, and gradient masking. A comprehensive literature review traces the historical evolution of both attack generation and mitigation strategies. The methodology section presents a standardized, reproducible framework for evaluating adversarial robustness on benchmark datasets such as MNIST, CIFAR-10, and ImageNet. We report the outcomes of comparative experiments under diverse threat models and discuss their implications. Finally, the paper outlines future directions for adversarial research and the need for adaptable, robust, and interpretable models. Our contribution synthesizes existing knowledge and provides a foundation for developing rigorous and secure DNN systems.
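
For concreteness, the following minimal sketch (in PyTorch) illustrates the two gradient-based attacks named above, FGSM and PGD. It is an illustrative implementation under simple assumptions, not the exact code used in the paper's experiments: model is any classifier returning logits, x holds inputs scaled to [0, 1], y holds the true labels, and epsilon, alpha, and steps are attack hyperparameters chosen by the user.

    # Minimal FGSM and PGD sketch in PyTorch (illustrative; assumes inputs scaled to [0, 1]).
    import torch
    import torch.nn.functional as F

    def fgsm_attack(model, x, y, epsilon):
        # Fast Gradient Sign Method: a single gradient-sign step of size epsilon.
        x_adv = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        # Step in the direction that increases the loss, then clamp to the valid pixel range.
        x_adv = x_adv + epsilon * x_adv.grad.sign()
        return torch.clamp(x_adv, 0.0, 1.0).detach()

    def pgd_attack(model, x, y, epsilon, alpha, steps):
        # Projected Gradient Descent: iterated FGSM-style steps, each followed by a
        # projection back onto the L-infinity ball of radius epsilon around the clean input.
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv = x_adv.detach().requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            x_adv = x_adv.detach() + alpha * grad.sign()
            x_adv = torch.max(torch.min(x_adv, x + epsilon), x - epsilon)  # project onto the epsilon-ball
            x_adv = torch.clamp(x_adv, 0.0, 1.0)                           # keep pixels in the valid range
        return x_adv

Adversarial training in the style of Madry et al. [6] then amounts to generating such PGD examples on the fly inside the training loop and minimizing the classification loss on them rather than on the clean inputs alone.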

References

[1] Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., & Fergus, R. (2013). Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199.

[2] Ozdag, M. (2018). Adversarial attacks and defenses against deep neural networks: a survey. Procedia Computer Science, 140, 152-161.

[3] Goodfellow, I. J., Shlens, J., & Szegedy, C. (2014). Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572.

[4] Tirumala, S. S., Ali, S., & Ramesh, C. P. (2016, August). Evolving deep neural networks: A new prospect. In 2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD) (pp. 69-74). IEEE.

[5] Papernot, N., McDaniel, P., Jha, S., Fredrikson, M., Celik, Z. B., & Swami, A. (2016, March). The limitations of deep learning in adversarial settings. In 2016 IEEE European symposium on security and privacy (EuroS&P) (pp. 372-387). IEEE.

[6] Madry, A., Makelov, A., Schmidt, L., Tsipras, D., & Vladu, A. (2017). Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083.

[7] Papernot, N., McDaniel, P., Wu, X., Jha, S., & Swami, A. (2016, May). Distillation as a defense to adversarial perturbations against deep neural networks. In 2016 IEEE symposium on security and privacy (SP) (pp. 582-597). IEEE.

[8] Zhang, Z., & Gupta, B. B. (2018). Social media security and trustworthiness: overview and new direction. Future Generation Computer Systems, 86, 914-925.

[9] Xu, W., Evans, D., & Qi, Y. (2017). Feature squeezing: Detecting adversarial examples in deep neural networks. arXiv preprint arXiv:1704.01155.

[10] Gu, S., & Rigazio, L. (2014). Towards deep neural network architectures robust to adversarial examples. arXiv preprint arXiv:1412.5068.

[11] Carlini, N., & Wagner, D. (2017, November). Adversarial examples are not easily detected: Bypassing ten detection methods. In Proceedings of the 10th ACM workshop on artificial intelligence and security (pp. 3-14).

[12] Tramèr, F., Kurakin, A., Papernot, N., Goodfellow, I., Boneh, D., & McDaniel, P. (2017). Ensemble adversarial training: Attacks and defenses. arXiv preprint arXiv:1705.07204.

[13] Kurakin, A., Goodfellow, I., & Bengio, S. (2016). Adversarial machine learning at scale. arXiv preprint arXiv:1611.01236.

[14] Moosavi-Dezfooli, S. M., Fawzi, A., & Frossard, P. (2016). Deepfool: a simple and accurate method to fool deep neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2574-2582).

[15] Miller, D. J., Xiang, Z., & Kesidis, G. (2020). Adversarial learning targeting deep neural network classification: A comprehensive review of defenses against attacks. Proceedings of the IEEE, 108(3), 402-433.

[16] Athalye, A., Carlini, N., & Wagner, D. (2018, July). Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In International conference on machine learning (pp. 274-283). PMLR.

[17] Krizhevsky, A., & Hinton, G. (2009). Learning multiple layers of features from tiny images. Technical report, University of Toronto.

[18] Bhambri, S., Muku, S., Tulasi, A., & Buduru, A. B. (2019). A survey of black-box adversarial attacks on computer vision models. arXiv preprint arXiv:1912.01667.

[19] Dodge, S., & Karam, L. (2017, July). A study and comparison of human and deep learning recognition performance under visual distortions. In 2017 26th international conference on computer communication and networks (ICCCN) (pp. 1-7). IEEE.

[20] LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.

[21] Pappula, K. K., & Anasuri, S. (2020). A Domain-Specific Language for Automating Feature-Based Part Creation in Parametric CAD. International Journal of Emerging Research in Engineering and Technology, 1(3), 35-44. https://doi.org/10.63282/3050-922X.IJERET-V1I3P105

[22] Rahul, N. (2020). Optimizing Claims Reserves and Payments with AI: Predictive Models for Financial Accuracy. International Journal of Emerging Trends in Computer Science and Information Technology, 1(3), 46-55. https://doi.org/10.63282/3050-9246.IJETCSIT-V1I3P106

[23] Enjam, G. R. (2020). Ransomware Resilience and Recovery Planning for Insurance Infrastructure. International Journal of AI, BigData, Computational and Management Studies, 1(4), 29-37. https://doi.org/10.63282/3050-9416.IJAIBDCMS-V1I4P104

[24] Pappula, K. K., Anasuri, S., & Rusum, G. P. (2021). Building Observability into Full-Stack Systems: Metrics That Matter. International Journal of Emerging Research in Engineering and Technology, 2(4), 48-58. https://doi.org/10.63282/3050-922X.IJERET-V2I4P106

[25] Rahul, N. (2021). Strengthening Fraud Prevention with AI in P&C Insurance: Enhancing Cyber Resilience. International Journal of Artificial Intelligence, Data Science, and Machine Learning, 2(1), 43-53. https://doi.org/10.63282/3050-9262.IJAIDSML-V2I1P106

[26] Enjam, G. R. (2021). Data Privacy & Encryption Practices in Cloud-Based Guidewire Deployments. International Journal of AI, BigData, Computational and Management Studies, 2(3), 64-73. https://doi.org/10.63282/3050-9416.IJAIBDCMS-V2I3P108

Published

2022-12-30

Issue

Vol. 3 No. 4 (2022)

Section

Articles

How to Cite

Anasuri S. Adversarial Attacks and Defenses in Deep Neural Networks. IJAIDSML [Internet]. 2022 Dec. 30 [cited 2025 Oct. 6];3(4):77-85. Available from: https://ijaidsml.org/index.php/ijaidsml/article/view/249