Privacy Preserving Machine Learning and Data Governance for AI Systems

Rashi Nimesh Kumar Dhenia; Raghavendra Sridhar; Ishva Jitendrakumar Kanani

doi:10.63282/3050-9262.IJAIDSML-V5I4P121

Authors

Rashi Nimesh Kumar Dhenia, Independent Researcher, USA.
Raghavendra Sridhar, Independent Researcher, USA.
Ishva Jitendrakumar Kanani, Independent Researcher, USA.

Keywords:

Privacy-Preserving Machine Learning (PPML), Cryptographic Techniques, Decentralized Training Paradigms, Data Governance

Abstract

As machine learning permeates sensitive domains such as healthcare, finance, and government, protecting individual privacy while leveraging large-scale data remains a central challenge. Privacy-Preserving Machine Learning (PPML) combines cryptographic techniques, decentralized training paradigms, and data governance policies to enable secure and compliant model development. This paper surveys the fundamental PPML methods (differential privacy, federated learning, and homomorphic encryption) and examines the key data governance frameworks that underpin ethical AI adoption. We analyze technical trade-offs, including the privacy-utility balance, scalability, and adversarial resilience. Finally, we discuss ongoing research directions and policy implications, emphasizing interdisciplinary collaboration for trustworthy AI deployment.
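To make the privacy-utility balance mentioned above concrete, the following is a minimal sketch of one differential privacy building block, the Laplace mechanism applied to a counting query. It is an illustration under stated assumptions, not a method taken from the paper: the function names `laplace_noise` and `private_count`, the example data, and the chosen values of epsilon are all hypothetical, and the floating-point sampling used here is not hardened for production use.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample zero-mean Laplace noise with the given scale.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace(0, scale) distribution.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Release a noisy count of the records that satisfy `predicate`.

    A counting query has sensitivity 1: adding or removing one record
    changes the true count by at most 1, so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    ages = [23, 35, 41, 52, 67, 29, 48, 71]  # hypothetical sensitive attribute
    for eps in (0.1, 1.0, 10.0):
        noisy = private_count(ages, lambda age: age > 40, eps, rng)
        print(f"epsilon={eps}: noisy count = {noisy:.2f}")
```

Smaller values of epsilon give stronger privacy but noisier answers, while larger values give answers closer to the true count (5 in this hypothetical dataset) at the cost of weaker privacy.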
