Personalized federated learning with model interpolation among client clusters and its application in smart home

World Wide Web
https://doi.org/10.1007/s11280-022-01132-0

Zhikai Yang¹ · Yaping Liu¹ · Shuo Zhang¹ · Keshen Zhou²

Received: 17 October 2022 / Revised: 27 November 2022 / Accepted: 8 December 2022
© The Author(s) 2023

Abstract

The proliferation of high-performance personal devices and the widespread deployment of machine learning (ML) applications have had two consequences: the volume of private data from individuals and groups has exploded over the past few years, and the traditional central servers used to train ML models have hit communication and performance bottlenecks in the face of massive amounts of data. This reality also makes it possible to keep data local for ML training and to fuse models on a broader scale. As a new branch of ML application, Federated Learning (FL) aims to solve the problem of multi-party joint learning while protecting personal data privacy. However, because devices are heterogeneous in network connection, network bandwidth, computing resources, and so on, it is unrealistic to train, update and aggregate models on all devices in parallel, and personal data is often not independent and identically distributed (Non-IID) for multiple reasons. This poses a challenge to the speed and convergence of FL. In this paper, we propose the pFedCAM algorithm, which aims to improve the robustness of the FL system to device heterogeneity and Non-IID data while achieving some degree of federated model personalization. pFedCAM is based on the idea of clustering and model interpolation: it classifies heterogeneous clients, performs the FedAvg algorithm in parallel, and then combines the results into personalized federated global models by inter-cluster model interpolation.
Experiments show that, in the case of Non-IID data, pFedCAM improves accuracy by 10.3% on Fashion-MNIST and 11.3% on CIFAR-10 compared to the benchmark. Finally, we applied pFedCAM in HomeProtect, a smart home privacy protection framework we designed, and achieved good practical results on a flame recognition task.

Keywords Federated Learning · Personalization · Clustering · Model interpolation

* Yaping Liu
* Shuo Zhang
Zhikai Yang
Keshen Zhou

¹ Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou, China
² School of Computer Science, The University of Sydney, Sydney, Australia

1 Introduction

In the past few years, a large number of devices with different computing capabilities have come to market, such as smartphones and other mobile devices, Internet of Things (IoT) devices, and smart cars. Through extensive and long-term use, these devices have generated a large amount of data. This data is very attractive for data-driven machine learning (ML) and can contribute to the training of ML models. However, the traditional, centralized way of doing ML is to upload personal data to a central server for model training, which compromises the privacy of individuals. In 2018 the EU introduced the General Data Protection Regulation (GDPR), a privacy protection regulation that sets out the rules companies should follow when collecting, processing and using users' data. With the gradual implementation of privacy protection policies in various countries and people's growing awareness of privacy protection, the approach of collecting data, uploading it to servers and training on it no longer applies.

Google provides us with an effective distributed ML paradigm. In 2016, Google [1] proposed the concept of Federated Learning (FL) and successfully applied it to Google keyboard [2], providing a powerful tool for breaking down the barrier of data silos.
With FL, instead of uploading data, the data owner uploads the ML models obtained using local computing resources to the server, which aggregates the models. Because it protects privacy-sensitive data, FL is widely used in privacy-preserving ML, for example in financial lending and medical diagnosis [3, 4]. Built on existing blockchain technologies and applications, such as data auditing [5] and energy dispatching [6], FL will have broader application prospects and its privacy protection features will be strengthened.

However, unlike distributed ML in a server environment, FL is built on a more complex device environment and faces some fundamental challenges. The devices of individuals or groups participating in the FL system have different computing resources, network bandwidth and network connectivity, and their availability is not stable at all times because of non-hardware factors such as usage habits. It is therefore very difficult to design synchronous or semi-synchronous protocols as in traditional distributed ML. For these reasons, a common strategy is to select some clients, but not all, to participate in training, so that the FL system does not wait for a long time because some devices are offline or have unstable network conditions. McMahan et al. [1] proposed the FedAvg algorithm, which randomly selects a certain proportion of clients to upload model weights at the end of each round of local training, after which the server averages the weights. FedCS [7] selects appropriate clients by measuring client resources, and accommodates as many clients as possible in the aggregation without entering a long wait.

In addition to the device heterogeneity challenge, FL also faces the statistical heterogeneity challenge. Most existing FL algorithms do not consider the statistical challenges posed by heterogeneous local datasets in a global sense.
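The client sampling and weight averaging attributed to FedAvg above can be illustrated with a minimal sketch. It is not the authors' implementation: the flattened-list model, the toy `local_train` update and all function names are assumptions made for illustration, and weighting each client by its local sample count follows the usual FedAvg formulation rather than anything stated in this text.

```python
import random
from typing import Dict, List

Weights = List[float]  # a flattened model; an illustrative simplification


def sample_clients(client_ids: List[int], fraction: float, rng: random.Random) -> List[int]:
    """Randomly pick a fraction of the clients (at least one) for this round."""
    k = max(1, int(round(fraction * len(client_ids))))
    return rng.sample(client_ids, k)


def local_train(weights: Weights, data: List[float], lr: float = 0.5, steps: int = 5) -> Weights:
    """Hypothetical local update: gradient steps that pull every weight
    toward the mean of the client's scalar data (a stand-in for real training)."""
    target = sum(data) / len(data)
    w = list(weights)
    for _ in range(steps):
        w = [wi - lr * (wi - target) for wi in w]
    return w


def average_weights(updates: Dict[int, Weights], num_samples: Dict[int, int]) -> Weights:
    """Server step: average client weights, weighted by local sample count."""
    total = sum(num_samples[c] for c in updates)
    dim = len(next(iter(updates.values())))
    avg = [0.0] * dim
    for cid, w in updates.items():
        share = num_samples[cid] / total
        for i in range(dim):
            avg[i] += share * w[i]
    return avg


def fedavg_round(global_w: Weights, client_data: Dict[int, List[float]],
                 fraction: float, rng: random.Random) -> Weights:
    """One communication round: sample clients, train locally, aggregate."""
    selected = sample_clients(sorted(client_data), fraction, rng)
    updates = {c: local_train(global_w, client_data[c]) for c in selected}
    sizes = {c: len(client_data[c]) for c in selected}
    return average_weights(updates, sizes)
```

Calling `fedavg_round` repeatedly, each time passing in the weights returned by the previous call, mimics successive communication rounds. Because only the sampled clients contribute to a round, clients whose data differ strongly from each other pull the average in different directions from round to round.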
Due to the heterogeneity of devices and differing user usage patterns, individual data may exhibit attribute skew or label skew; that is, data from different clients may not come from the same global distribution. A model trained on a subset of clients may therefore not reflect the overall data distribution, which introduces unavoidable bias into the update of the global model. Device heterogeneity and statistical heterogeneity cause the problem of non-independent and identically distributed (Non-IID) data in FL. Several studies [8–10] have shown that with Non-IID data, model convergence speed and accuracy decrease significantly, with a resulting increase in the number of communication rounds in FL. Consider the FedAvg algorithm under Non-IID data: since clients use Non-IID data in local training, the variation between models trained by different clients is too large, which slows down the convergence of the global model and significantly reduces model accuracy in the model aggregation phase. The generation of device heterogeneity and Non-IID data problems on the (...truncated)