Abstract
Artificial intelligence (AI) systems have the potential to significantly enhance many aspects of human life, including healthcare, employment, energy management, finance, and creative endeavors. However, the rise of larger, computationally intensive AI models has also raised concerns about their negative societal impacts, such as increasing economic concentration, higher energy emissions, and threats to data privacy. In this paper, we examine the societal implications of federated learning, a promising machine learning approach often promoted as a way to mitigate the economic concentration, environmental harm, and privacy risks associated with AI. We review the main types of federated learning systems and their ongoing research challenges. While federated learning holds significant promise, it is currently at an early stage of development and faces critical technical challenges, including data leakage risks and high communication costs, which must be addressed before widespread adoption can occur.
Artificial intelligence and its societal concerns
The development of deep learning (LeCun et al., 2015) has driven incredible advancements in artificial intelligence (AI) systems. Prominent examples include AlphaGo (Silver et al., 2016), which defeated the Go world champion; AlphaFold (Jumper et al., 2021), which solved the protein folding problem in biology; and ChatGPT (OpenAI, 2022), which became the fastest application to reach 100 million users (Hu, 2023). These more advanced AI systems are in turn expected to drive improvements in medicine, hiring, energy, finance, writing, and nearly every aspect of life. Yet AI progress has also sparked concerns about possible adverse effects on economic concentration, energy emissions, and data privacy. Popular image generators like StableDiffusion (Rombach et al., 2022) can create economic concentration by displacing millions of visual artists with much cheaper machine learning (ML) models owned by large tech companies. Advanced AI models can generate harmful emissions by consuming large amounts of energy during the computationally intensive training process (Luccioni et al., 2022). Finally, these models incentivize companies to infringe on users’ privacy because the training process requires very large datasets collected from users, often without their explicit consent or knowledge (Paullada et al., 2021). OpenAI's prominent large language model (LLM), GPT-3, for example, was trained on a combination of public and private text data taken from books, Wikipedia, and a crawl of the internet (Brown et al., 2020).
Policymakers around the world have sought to address these negative consequences of widespread AI adoption (Aho and Duffield, 2020; Pernot-Leplay, 2020), but many of their proposed solutions are either technically difficult or impossible to implement (Kuru and de Miguel Beriain, 2022). Because of these limitations, AI researchers have sought new methods that can balance building powerful models without creating significant societal harm. Federated learning is one such promising method for addressing economic concentration, increased energy emissions, and privacy violations (McMahan et al., 2017; Qinbin et al., 2023).
Whereas traditional ML methods require centralizing data into a single training set, federated learning is partially or fully decentralized. This decentralization helps prevent a small number of organizations from controlling all the data and computational resources needed to train models. Federated learning has also shown great promise in reducing overall emissions by improving smart monitoring devices that track sustainability metrics such as energy emissions (Victor et al., 2022; Zhang et al., 2022). Finally, federated learning enables a group of clients to collaboratively train a single ML model without sharing training data with one another, better preserving the privacy of each client's users.
In this paper, we begin by providing an overview of the different types of federated learning systems. Next, we describe the current challenges in the field and introduce ongoing research into potential solutions. Our main contribution lies in analyzing how federated learning systems can mitigate the adverse effects of deep learning systems. We conclude that federated learning systems must improve data protection during client–server communications and reduce the overall communication overhead to effectively address issues such as economic concentration, increased energy emissions, and data privacy violations.
Overview of federated learning
In federated learning, a group of clients (devices, servers, or organizations) works together to train a single ML model without exchanging raw data (Qinbin et al., 2023). This process allows individual clients to decide what information they share with others, giving them greater control over users' data privacy. In a federated learning system with a central server, the training process typically repeats four steps (Yang et al., 2019):
1. Participating clients calculate the information needed to update an ML model, such as gradients.
2. The clients apply privacy protection techniques, such as differential privacy and homomorphic encryption (HE), to this information and send it to a central server.
3. The central server aggregates the information to update a global ML model.
4. The server redistributes the updated model to the clients.
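To make these steps concrete, the sketch below runs a few rounds of federated averaging (FedAvg) (McMahan et al., 2017) with a central server. It is a minimal illustration: the linear model, squared-error loss, and randomly generated client data are hypothetical stand-ins, and the privacy protection of step 2 is omitted.

```python
import numpy as np

def client_update(global_weights, local_data, lr=0.01):
    """Step 1: a client computes the information needed to update the model.

    Here each client runs one gradient step on a linear model with a
    squared-error loss; both are illustrative stand-ins for a real model.
    """
    X, y = local_data
    grad = 2 * X.T @ (X @ global_weights - y) / len(y)
    return global_weights - lr * grad

def server_aggregate(client_weights, client_sizes):
    """Step 3: the server averages client models, weighted by data size."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Three hypothetical clients with randomly generated local datasets.
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(3)]
global_weights = np.zeros(5)

for _ in range(10):
    # Steps 1-2: each client computes an update locally and sends it
    # (the privacy protections of step 2 are omitted in this sketch).
    updates = [client_update(global_weights, data) for data in clients]
    # Steps 3-4: the server aggregates and redistributes the global model.
    global_weights = server_aggregate(updates, [len(y) for _, y in clients])
```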
For applications that require complete decentralization, clients may follow this same process, replacing the central server with a peer-to-peer consensus mechanism for aggregating client information into a global ML model (Abdulrahman et al., 2021). Compared to traditional ML techniques, federated learning is far more decentralized, giving clients much greater ownership over privacy protection.
Types of federated learning
Depending on the distribution of data among clients, federated learning can be separated into two broad categories (Yang et al., 2019): horizontal and vertical federated learning.
Horizontal federated learning applies when clients hold different samples with the same features (Li et al., 2020a). For example, hospitals, no matter where they are, will likely record similar features about their patients: age, gender, height, weight, medical history, etc. Yet hospitals in different regions will see different patients, meaning they have different data samples. Although pooling data might be attractive for hospitals hoping to leverage additional data to build more accurate ML models, sharing such sensitive medical data is often legally impermissible (US Department of Health & Human Services, 1996). Horizontal federated learning allows these hospitals to train a single model that learns from the combined set of patients across all the hospitals, without requiring any individual hospital to share sensitive patient data with others.
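As a rough illustration, a horizontal partition might look as follows; the features and values are hypothetical.

```python
import numpy as np

# Hypothetical horizontal partition: both hospitals record the same
# features (age, weight, systolic blood pressure) for different patients.
feature_names = ["age", "weight_kg", "systolic_bp"]
hospital_a = np.array([[34, 70, 120],
                       [61, 82, 140]])  # hospital A's patients
hospital_b = np.array([[45, 90, 135],
                       [29, 64, 110]])  # hospital B's patients

# Each hospital trains the shared model on its own rows and sends only
# model updates to the server; patient records never leave the site.
```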
Vertical federated learning, by contrast, is used when clients have the same data samples but different features (Qun et al., 2023). Consider predicting whether someone is at risk for type-2 diabetes as an example. Past research has shown that high blood pressure, obesity, lack of exercise, and high caloric intake are all factors that may contribute to type-2 diabetes. An accurate model would likely incorporate these features, yet it is unlikely that a single client will have access to all of them. A hospital may have access to a person's blood pressure and obesity status, while a smartphone health app may have access to step count and caloric intake. These clients have access to different features about the same person. Vertical federated learning allows clients to pool their data about an individual to train a single model that uses more features, without sharing their proprietary data with one another.
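A vertical partition, by contrast, might look as follows. The data are again hypothetical, and the plain join at the end is only the centralized analogue of what vertical federated learning achieves privately (typically via private set intersection).

```python
import pandas as pd

# Hypothetical vertical partition: the hospital and the health app hold
# different features about the same people.
hospital = pd.DataFrame({
    "patient_id": [1, 2, 3],
    "blood_pressure": [120, 140, 135],
    "obese": [False, True, False],
})
health_app = pd.DataFrame({
    "patient_id": [1, 2, 3],
    "daily_steps": [9000, 3200, 4100],
    "calories": [2100, 2900, 2600],
})

# Vertical federated learning trains on the combined feature set without
# either party revealing its raw columns; the plain join below is only
# the centralized analogue of that privately aligned dataset.
aligned = hospital.merge(health_app, on="patient_id")
```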
Challenges with federated learning
Although an increasingly popular technique (Qinbin et al., 2023), federated learning faces five challenges that make practical implementation difficult (Li et al., 2020b): privacy concerns, communication overhead, system heterogeneity, statistical heterogeneity, and bias in models. In this section, we describe these challenges and introduce related research.
Privacy concerns. While clients do not transmit raw data, research has shown that personal information can be extracted from what they do share, namely model gradients and global model weights (Melis et al., 2018). In response, researchers have emphasized the need for secure computation and differential privacy techniques (Mothukuri et al., 2021).
Secure computation reveals only the end result of a computation while protecting the original inputs and intermediate steps. The two main secure computation techniques—secure multi-party computation (SMPC) and HE—require substantial communication, making them ill-suited to federated learning systems with large models or many clients (Truong et al., 2021). In addition, neither technique is secure when a malicious actor can eavesdrop on client–server communications (Asad et al., 2020; Kairouz et al., 2021). Differential privacy (Dwork et al., 2006) is another prominent privacy-preserving technique. It adds noise to a dataset to mask the attributes of specific individuals, promoting user privacy by making it harder to infer individual-level characteristics (Gosselin et al., 2022). Differential privacy has been adapted to federated learning with some success (Geyer et al., 2018) but may harm overall model accuracy. Even with these state-of-the-art methods implemented properly, an adversary can still exfiltrate user data from a federated learning system if they can either (1) view client–server communications or (2) compromise the central server itself.
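As a minimal sketch of the differential privacy idea in this setting (in the spirit of, though much simpler than, Geyer et al., 2018), a client might clip and noise its model update before sharing it. The clipping norm and noise scale below are illustrative, not calibrated to a formal privacy budget.

```python
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_std=0.1, rng=None):
    """Clip a client's model update and add Gaussian noise before sharing.

    Bounding the update's norm limits any one client's influence, and the
    noise masks individual contributions; more noise gives stronger
    privacy at the cost of model accuracy.
    """
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    return clipped + rng.normal(0.0, noise_std, size=update.shape)
```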
A malicious actor could compromise a central server either by posing as one of the clients contributing model updates, in what are known as "backdoor attacks" (Bagdasaryan et al., 2020), or through traditional cyber-attacks. In the first case, researchers have found that adversaries can recover data from a federated learning system so long as they control a single client sharing model updates with the central server (Wang et al., 2020; Xie et al., 2020). Although promising work exists, none of the methods for detecting or defending against these attacks is foolproof (Rieger et al., 2022; Wu et al., 2021; Xie et al., 2020). As such, a single malicious client can make the entire system insecure, creating a very large attack surface for potential adversaries. In the second case, standard cybersecurity defenses can mitigate these attacks, but a malicious actor might own the central server itself, creating a federated learning system expressly to steal data from contributing clients.
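A simplified numerical sketch shows why a single malicious client is so dangerous under plain averaging; this illustrates the model-replacement idea behind backdoor attacks (Bagdasaryan et al., 2020) in a deliberately stripped-down form.

```python
import numpy as np

# If the server averages n equally weighted client models, one malicious
# client can scale its poisoned model so the average lands on its target.
n_clients = 10
honest = [np.full(5, 0.1) for _ in range(n_clients - 1)]  # benign models
target = np.full(5, 9.9)                                  # attacker's goal
malicious = n_clients * target - sum(honest)              # scaled update

new_global = sum(honest + [malicious]) / n_clients
assert np.allclose(new_global, target)  # the attacker sets the global model
```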
Communication overhead. Because federated learning requires clients to repeatedly share information, it has much greater communication requirements than traditional ML. This communication overhead can make the training process slower and more expensive. Wu et al. (2022) propose a more efficient federated learning algorithm called FedKD to reduce the communication cost. Instead of using a single large model, their method uses a small mentee model and a larger mentor model to perform adaptive knowledge distillation. In FedKD, only the small mentee model is passed between the central server and the client. FedKD further reduces communication costs by using a dynamic gradient approximation method based on singular value decomposition (SVD) to compress the small mentee model's gradient updates. Other researchers have proposed reducing communication overhead through optimized updating, which reduces the number of communication rounds (Luping et al., 2019; McMahan et al., 2017; Smith et al., 2018); improved compression, which shrinks the size of each communication (Caldas et al., 2019; Sattler et al., 2019; Yi et al., 2021); and decentralized training, which spreads the communication overhead across the network (Chen et al., 2023; Dai et al., 2022; Liu et al., 2019).
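As an illustration of the compression idea, the sketch below applies truncated SVD to a gradient matrix with a fixed rank. FedKD's actual method chooses the rank dynamically, so this is only a simplified approximation of that approach.

```python
import numpy as np

def compress_gradient(grad, rank):
    """Send only the top-`rank` SVD factors of a gradient matrix."""
    U, s, Vt = np.linalg.svd(grad, full_matrices=False)
    return U[:, :rank], s[:rank], Vt[:rank, :]

def decompress_gradient(U, s, Vt):
    """Reconstruct a low-rank approximation of the original gradient."""
    return U @ np.diag(s) @ Vt

grad = np.random.default_rng(0).normal(size=(256, 64))
factors = compress_gradient(grad, rank=8)
approx = decompress_gradient(*factors)

# The rank-8 factors contain (256 + 1 + 64) * 8 values instead of
# 256 * 64, shrinking this update by roughly a factor of six.
```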
System heterogeneity. The clients contributing to the federated learning process are heterogeneous: their hardware varies in important characteristics such as memory, computational resources, and network bandwidth. Treating devices uniformly allows slower or non-responsive devices to delay the entire training process, while the straightforward solution of removing them can systematically underrepresent those clients. Liu et al. (2022) introduce a method called InclusiveFL, which assigns models to clients in a way that accounts for their different capabilities: larger models are assigned to more powerful clients and smaller ones to weaker clients. These models are then combined into a single model large enough to avoid accuracy drop-offs while still representing all clients. Researchers are also exploring better sampling techniques (Nishio and Yonetani, 2019) and fault tolerance (Li et al., 2020c) as ways to lessen the impact of system heterogeneity on model performance.
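The capability-aware assignment idea can be sketched as follows; the hardware tiers and model depths are hypothetical and do not reflect InclusiveFL's actual policy.

```python
# Hypothetical capability-aware model assignment: weaker hardware gets a
# shallower model so it can still participate in training.
def assign_model_depth(client_memory_mb):
    if client_memory_mb >= 4096:
        return 12  # full-depth model for powerful clients
    if client_memory_mb >= 1024:
        return 6   # mid-sized model
    return 2       # shallow model for constrained devices

clients = {"workstation": 16384, "laptop": 2048, "old_phone": 512}
depths = {name: assign_model_depth(mem) for name, mem in clients.items()}
# {'workstation': 12, 'laptop': 6, 'old_phone': 2}
```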
Statistical heterogeneity. Data distributions may vary between clients. A hospital in a sunny place such as Arizona may have higher skin cancer prevalence rates than one in a cloudy place such as Seattle, so the two hospitals' data distributions would likely differ. For such non-IID data (data that is not independent and identically distributed), overall model accuracy may decrease, and it may also be harder to measure model convergence during training. Ongoing research aims to remedy these problems by training and combining several separate models instead of a single model (Smith et al., 2017).
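Researchers often simulate such non-IID conditions when evaluating federated learning systems. One common recipe, sketched below under the assumption of a labeled classification dataset, partitions samples using a Dirichlet distribution; this is a standard evaluation tool from the broader literature rather than a method from the works cited above.

```python
import numpy as np

def dirichlet_partition(labels, n_clients, alpha=0.5, seed=0):
    """Split sample indices across clients with per-class proportions drawn
    from a Dirichlet distribution; smaller alpha yields more skewed,
    less IID client datasets.
    """
    rng = np.random.default_rng(seed)
    clients = [[] for _ in range(n_clients)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        props = rng.dirichlet(alpha * np.ones(n_clients))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in zip(clients, np.split(idx, cuts)):
            client.extend(part.tolist())
    return clients

labels = np.random.default_rng(1).integers(0, 3, size=300)  # 3 classes
partitions = dirichlet_partition(labels, n_clients=4, alpha=0.2)
```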
Bias in models. Federated learning systems may exhibit societal biases. Sensitive attributes, such as race or gender, may contribute to model predictions in an undesirable way (Abay et al., 2022; Shi et al., 2023). Existing ML fairness interventions, which often require access to the full set of training features, may not work for federated learning systems because clients do not share training data with the central server. Qi et al. (2022) propose FairVFL to reduce bias in federated learning models. FairVFL partitions the feature set into features that should (fairness-insensitive) and should not (fairness-sensitive) be used to make predictions. After partitioning the data, FairVFL learns a unified representation that is independent of the fairness-sensitive features (such as gender). The algorithm then performs adversarial learning twice: first to remove any remaining bias and second to remove any individual information. Researchers have also proposed solutions that mitigate bias by modifying the client selection process (Huang et al., 2020; Yang et al., 2020a), reweighting gradient values (Mohri et al., 2019; Wang et al., 2021), and performing algorithmic optimization (Cui et al., 2021; Qi et al., 2022).
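The adversarial learning step can be sketched as follows. This is a minimal single-party illustration of learning a representation that is useful for the task but uninformative about a sensitive attribute; it omits FairVFL's vertical partitioning and its second adversarial stage, and all dimensions and data are hypothetical.

```python
import torch
from torch import nn

# Learn a representation that predicts the task label while an adversary
# trying to recover the sensitive attribute is actively fooled.
encoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU())  # fairness-insensitive inputs
task_head = nn.Linear(32, 2)                           # predicts the label
adversary = nn.Linear(32, 2)                           # predicts e.g. gender

opt_main = torch.optim.Adam([*encoder.parameters(), *task_head.parameters()])
opt_adv = torch.optim.Adam(adversary.parameters())
ce = nn.CrossEntropyLoss()

x = torch.randn(64, 16)         # hypothetical training batch
y = torch.randint(0, 2, (64,))  # task labels
s = torch.randint(0, 2, (64,))  # sensitive attribute

for step in range(200):
    # 1) Train the adversary to recover the sensitive attribute.
    opt_adv.zero_grad()
    ce(adversary(encoder(x).detach()), s).backward()
    opt_adv.step()

    # 2) Train encoder and task head to do the task well while
    #    maximizing the adversary's loss (removing sensitive signal).
    opt_main.zero_grad()
    rep = encoder(x)
    (ce(task_head(rep), y) - ce(adversary(rep), s)).backward()
    opt_main.step()
```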
Societal impacts of federated learning
In this section, we describe how federated learning systems might be used to mitigate AI systems' adverse effects: economic concentration, increased energy emissions, and data privacy violations. These are among the most pressing concerns associated with modern AI development. Economic concentration arises when a small number of tech giants consolidate the data and resources needed for AI, leading to inequitable access to the technology. Increased energy emissions stem from the high computational demands of training large AI models, which contribute to environmental degradation. Data privacy violations follow from AI's vast data requirements, which often lead to unethical data collection and breaches of user privacy. Addressing these issues is crucial for ensuring that AI technologies contribute positively to society.
Equitable decentralized business models
Because modern deep learning systems require extensive data and computational resources, only the largest, most well-capitalized companies can afford to train advanced AI systems. This makes it harder for smaller companies, which may lack access to large datasets or computational resources, to compete, concentrating the benefits of AI systems in large organizations. Federated learning can help alleviate this trend by enabling decentralized training while maintaining data privacy. However, the models themselves may still belong to major companies, which limits the potential to fully decentralize economic power. While federated learning allows clients to maintain greater control over their data, this does not automatically redistribute power between large and small companies. Instead, it helps reduce the imbalance between companies and individual users by giving users more control over how their data is used. Yang et al. (2019) give one example of how federated learning can be combined with profit-sharing rules to give smaller organizations greater leverage and ownership over the ML development life cycle. As a result, new decentralized business models that better incentivize clients to contribute data and resources may emerge: instead of profits accruing to a single data or model owner, they can be distributed more equitably among contributors.
Monitoring improvements versus higher emissions
Federated learning also has the potential to advance environmental sustainability efforts by improving "internet-of-things" devices (Hu et al., 2018; Victor et al., 2022; Zhang et al., 2022). Improved monitoring can help organizations better evaluate energy-saving initiatives (Varlamis et al., 2022). Yet, despite improved monitoring devices, federated learning's high communication overhead may ultimately consume more energy than it saves. Qiu et al. (2021) found that, depending on the exact system design, federated learning systems can emit up to two orders of magnitude more carbon than centralized ones. Research into closing that gap is ongoing but still in its nascent stages (Guler and Yener, 2021; Yang et al., 2020b). In its current form, the incremental energy requirements of federated learning systems may simply be prohibitively costly. Federated learning still needs to improve communication efficiency before it can help reduce the energy emissions of advanced AI systems.
Double-edged impact on data privacy
Federated learning has the potential to greatly improve data privacy and security (Ge et al., 2020). It can improve privacy by limiting how much data clients must share to train an ML model. Unlike traditional ML methods that aggregate data into a single dataset managed by a central party, federated learning ensures no party can access data contributed by another. This practice of data localization also gives clients very granular control over how they share their data. For example, they might opt out of sharing data for specific tasks, models, or steps during training, whether at the device level for individuals or the dataset level for organizations. Data localization also improves security by reducing single-point-of-failure risks: since each client can only access its own data, the fallout from a data breach is limited to the compromised machines. Perfectly implemented, a federated learning system guarantees that participating in the model training process exposes no more data than not participating. In fact, it may be one of the only techniques that can meet strict General Data Protection Regulation (GDPR) privacy requirements (Truong et al., 2021).
These improvements, however, only occur if the system is implemented perfectly. Compared to centralized ML methods, federated learning offers potential adversaries more opportunities to exfiltrate data: the training process involves more rounds of communication and gives more third-party clients access to the system, both of which create far more entry points for malicious actors. The current state of secure computation, differential privacy, and backdoor attack defenses suggests federated learning systems would be ill-equipped to defend so many points of failure.
In the best case, federated learning redistributes privacy risk: whereas a traditional ML system consolidates client data into a single point of failure, a federated learning system keeps that data decentralized. Although this creates more potential vulnerabilities, each breach would leak a much smaller amount of user data. Given the current number of vulnerabilities in federated learning privacy protection techniques, however, achieving this best case is unlikely. More likely, preserving data localization in federated systems will prove difficult, meaning data breaches will leak data from clients beyond the compromised machines. Although federated learning has the potential to redefine how we protect user data, just as with environmental protection, it still needs more time to mature. Even then, federated learning represents a double-edged sword that redistributes privacy risks in a way that may greatly benefit or harm individual users' privacy.
Conclusion
This paper introduced federated learning as a potential remedy for the negative societal impacts of advancements in large AI models, such as economic concentration, environmental damage, and privacy violations. Although federated learning, as a decentralized ML method, shows significant promise in addressing these issues, it must first overcome several technical challenges before achieving widespread adoption. These challenges include privacy concerns related to data leakage, communication overhead, system and statistical heterogeneity, and biases within models. We reviewed past research efforts to address these challenges and found that federated learning has the potential to foster more equitable business models through decentralized incentive structures, enhance sustainability through accurate environmental monitoring, and protect user privacy via data localization. However, as a nascent technology, federated learning still requires considerable technical advancement before large-scale implementation is feasible. With continued research and development, federated learning could play a key role in balancing the goals of creating powerful AI systems while ensuring socially responsible outcomes.
Acknowledgment
Thanks to Fangzhao Wu for his help understanding past research related to federated learning.
Contributorship
Conceptualization: Justin Curl and Xing Xie; Writing the original draft: Justin Curl; Review and editing: Justin Curl and Xing Xie.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
