Training AI Models on Sensitive Data - The Federated Learning Approach
DOI:
https://doi.org/10.63282/3050-9262.IJAIDSML-V1I2P104
Keywords:
Federated Learning, Sensitive Data, Artificial Intelligence, Privacy, Decentralized Machine Learning, Data Security, Regulatory Compliance, Data Privacy, Machine Learning, AI Models, Privacy Preservation, Secure Data Sharing, Compliance Standards, Data Governance, Privacy-Enhancing Technologies
Abstract
As AI becomes increasingly common across many fields, training AI models on sensitive information creates both opportunities and concerns. Traditional approaches to training AI models rely on centralized systems, in which large volumes of data are gathered and processed on a single server. This approach is workable, but it raises significant privacy and security issues, especially for private or sensitive information. Federated Learning (FL) addresses these problems by allowing AI models to be trained on data held in several locations without sending the sensitive data to a central location. This decentralized approach preserves privacy by keeping data close to where it originated. Rather than collecting raw data, Federated Learning aggregates model updates from many different sites. Because the data stays where it is, the risk of data breaches is reduced and compliance with strict data protection regulations such as GDPR becomes easier. This paper covers the fundamentals of Federated Learning, including its architecture, its key components, and the importance of secure aggregation methods for keeping participants' identities confidential. It also highlights the growing number of domains where federated learning may be applied, such as healthcare, banking, and mobile devices, where data privacy is critical. The paper discusses the advantages of FL, such as stronger privacy, reduced bandwidth use, and better model performance through collaborative learning. It also discusses the drawbacks, such as communication problems, model synchronization, and the difficulty of deploying FL at large scale.
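The aggregation step described in the abstract, in which model updates rather than raw data are combined, can be illustrated with a minimal sketch of federated averaging. This is an illustration under stated assumptions, not the paper's implementation: the function names (`local_update`, `federated_average`, `run_rounds`), the two-parameter linear model, the learning rate, and the toy data are all hypothetical. Secure aggregation is not shown, so the server here sees each client's weights in the clear.

```python
from typing import List, Tuple

Weights = List[float]              # [intercept, slope] of the model y = w0 + w1 * x
Dataset = List[Tuple[float, float]]  # private (x, y) pairs held by one client


def local_update(global_w: Weights, data: Dataset,
                 lr: float = 0.1, epochs: int = 5) -> Weights:
    """Run a few gradient-descent steps on one client's private data.

    Only the resulting weights are returned; the raw data never leaves
    this function, which stands in for the client device.
    """
    w0, w1 = global_w
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error (up to a constant factor).
        g0 = sum((w0 + w1 * x - y) for x, y in data) / n
        g1 = sum((w0 + w1 * x - y) * x for x, y in data) / n
        w0 -= lr * g0
        w1 -= lr * g1
    return [w0, w1]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Server side: average client weights, weighted by local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(clients: List[Dataset], rounds: int = 300,
               init: Weights = (0.0, 0.0)) -> Weights:
    """Repeat: broadcast global weights, train locally, aggregate."""
    global_w = list(init)
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_w = federated_average(updates, sizes)
    return global_w


if __name__ == "__main__":
    # Hypothetical toy data: two clients whose samples follow y = 1 + 2x.
    clients = [
        [(0.0, 1.0), (1.0, 3.0)],
        [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
    ]
    print(run_rounds(clients))
```

Weighting by sample count lets clients with more data contribute proportionally more to the global model. In a deployment that uses secure aggregation, the server would receive only the combined sum of updates rather than each client's individual weights.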