Note: the FL code in the original repository is based on this repository. It has since been heavily modified for the paper, but the overall framework still applies.
Datasets: CIFAR-10, FEMNIST
Model: ResNet-18
DP mechanism: Gaussian (Simple Composition)
DP parameters:
DP clipping: In machine learning tasks, computing the sensitivity usually requires clipping the gradient to bound its L1 or L2 norm.
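The clipping step can be sketched as follows. This is a minimal NumPy illustration of L2-norm clipping, not the repository's actual module:

```python
import numpy as np

def clip_by_l2_norm(grad, max_norm):
    """Scale grad so that its L2 norm is at most max_norm.

    This bounds the sensitivity of the gradient query by max_norm,
    which is what the DP noise calibration relies on."""
    norm = np.linalg.norm(grad)
    return grad * min(1.0, max_norm / (norm + 1e-12))
```

A gradient whose norm is already below `max_norm` is left unchanged; a larger gradient is scaled down onto the clipping ball.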
python == 3.12
In addition, the original repository used opacus == 1.1.1 for per-sample gradient clipping; this has been removed in the modified version. Opacus used too much memory, so a simple custom module now clips the aggregated gradient instead.
Once the environment is set up, simply run the main program main_resnet.py.
Based on Simple Composition in DP. That is, if a client has a total privacy budget ε and participates in T rounds of training, then under simple composition the per-round budgets add up linearly, so each round can consume at most ε/T.
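As a concrete sketch of the accounting (the function names here are illustrative, not from the repository), an equal split under simple composition plus the classic Gaussian-mechanism noise calibration (valid for ε ≤ 1) looks like:

```python
import math

def per_round_budget(eps_total, delta_total, num_rounds):
    """Simple composition: (eps, delta) guarantees add up linearly,
    so an equal split gives each round 1/num_rounds of the budget."""
    return eps_total / num_rounds, delta_total / num_rounds

def gaussian_sigma(eps, delta, sensitivity):
    """Classic Gaussian-mechanism calibration (requires eps <= 1):
    sigma = sensitivity * sqrt(2 ln(1.25/delta)) / eps."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / eps

# e.g. a total budget of (10, 1e-5) spread over 100 rounds
eps_r, delta_r = per_round_budget(10.0, 1e-5, 100)
sigma = gaussian_sigma(eps_r, delta_r, sensitivity=1.0)
```

Note that a smaller per-round ε forces a larger σ, which is why splitting a fixed budget over many rounds degrades utility.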
- Survey
- Rodríguez-Barroso, Nuria, et al. "Federated Learning and Differential Privacy: Software tools analysis, the Sherpa.ai FL framework and methodological guidelines for preserving data privacy." Information Fusion 64 (2020): 270-292.
- Gaussian mechanism
- Wei, Kang, et al. "Federated learning with differential privacy: Algorithms and performance analysis." IEEE Transactions on Information Forensics and Security 15 (2020): 3454-3469.
- Y. Zhou et al., "Optimizing the Numbers of Queries and Replies in Convex Federated Learning with Differential Privacy," in IEEE Transactions on Dependable and Secure Computing, 2023.
- K. Wei et al., "User-Level Privacy-Preserving Federated Learning: Analysis and Performance Optimization," in IEEE Transactions on Mobile Computing, vol. 21, no. 9, pp. 3388-3401, 2022.
- Geyer, Robin C., Tassilo Klein, and Moin Nabi. "Differentially private federated learning: A client level perspective." arXiv preprint arXiv:1712.07557 (2017).
- Seif, Mohamed, Ravi Tandon, and Ming Li. "Wireless federated learning with local differential privacy." 2020 IEEE International Symposium on Information Theory (ISIT). IEEE, 2020.
- Mohammadi, Nima, et al. "Differential privacy meets federated learning under communication constraints." IEEE Internet of Things Journal (2021).
- Truex, Stacey, et al. "A hybrid approach to privacy-preserving federated learning." Proceedings of the 12th ACM workshop on artificial intelligence and security. 2019.
- Naseri, Mohammad, Jamie Hayes, and Emiliano De Cristofaro. "Toward robustness and privacy in federated learning: Experimenting with local and central differential privacy." arXiv e-prints (2020): arXiv-2009.
- Malekzadeh, Mohammad, et al. "Dopamine: Differentially private federated learning on medical data." arXiv preprint arXiv:2101.11693 (2021).
- Laplace mechanism
- Wu, Nan, et al. "The value of collaboration in convex machine learning with differential privacy." 2020 IEEE Symposium on Security and Privacy (SP). IEEE, 2020.
- Y. Zhou et al., "Optimizing the Numbers of Queries and Replies in Convex Federated Learning with Differential Privacy," in IEEE Transactions on Dependable and Secure Computing, 2023.
- L. Cui, J. Ma, Y. Zhou and S. Yu, "Boosting Accuracy of Differentially Private Federated Learning in Industrial IoT With Sparse Responses," in IEEE Transactions on Industrial Informatics, 2023.
- Liu, Xiaoyuan, et al. "Adaptive privacy-preserving federated learning." Peer-to-Peer Networking and Applications 13.6 (2020): 2356-2366.
- Zhao, Yang, et al. "Local differential privacy-based federated learning for internet of things." IEEE Internet of Things Journal 8.11 (2020): 8836-8853.
- Fu, Yao, et al. "On the practicality of differential privacy in federated learning by tuning iteration times." arXiv preprint arXiv:2101.04163 (2021).
- Other mechanisms
- Zhao, Yang, et al. "Local differential privacy-based federated learning for internet of things." IEEE Internet of Things Journal 8.11 (2020): 8836-8853.
- Truex, Stacey, et al. "LDP-Fed: Federated learning with local differential privacy." Proceedings of the Third ACM International Workshop on Edge Systems, Analytics and Networking. 2020.
- Yang, Jungang, et al. "Matrix Gaussian Mechanisms for Differentially-Private Learning." IEEE Transactions on Mobile Computing (2021).
- Sun, Lichao, Jianwei Qian, and Xun Chen. "Ldp-fl: Practical private aggregation in federated learning with local differential privacy." arXiv preprint arXiv:2007.15789 (2020).
- Liu, Ruixuan, et al. "Fedsel: Federated sgd under local differential privacy with top-k dimension selection." International Conference on Database Systems for Advanced Applications. Springer, Cham, 2020.
The new version uses Opacus for per-sample gradient clipping, bounding the norm of the gradient computed from each individual sample.
This code sets the number of local training epochs to 1 and the batch size to the size of each client's local dataset. Because Opacus keeps the gradient of every sample during training, GPU memory usage is very high. This can be mitigated by specifying the --serial and --serial_bs arguments.
These two arguments physically impose a virtual batch size. Training takes longer as a result, but logically neither the training itself nor the DP noise addition is affected; the main point of doing it this way is to avoid violating the theory behind the DP noise addition.
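The effect of such a virtual batch size can be sketched as follows. This is a hypothetical NumPy illustration, not the repository's implementation; `dp_grad_serial` and its parameters are made up for exposition. The logical batch is processed in micro-batches of `serial_bs` samples, per-sample clipped gradients are accumulated, and Gaussian noise is added once at the end, so the result is numerically identical to processing the full batch at once:

```python
import numpy as np

def dp_grad_serial(samples, grad_fn, clip_C, sigma, serial_bs):
    """Accumulate per-sample clipped gradients micro-batch by micro-batch.

    Only serial_bs per-sample gradients need to be materialized at a
    time, which caps memory use; the clipped sum (and hence the DP
    guarantee) is the same as for one big batch."""
    total = None
    for start in range(0, len(samples), serial_bs):
        for x in samples[start:start + serial_bs]:
            g = grad_fn(x)
            g = g * min(1.0, clip_C / (np.linalg.norm(g) + 1e-12))
            total = g if total is None else total + g
    # Noise is added once to the clipped sum, exactly as for a full batch.
    noise = np.random.normal(0.0, sigma * clip_C, size=total.shape)
    return (total + noise) / len(samples)
```

Because the noise is drawn once over the whole logical batch, changing `serial_bs` trades time for memory without changing the privacy analysis.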
The Dev branch is still a work in progress. It implements newer DPFL algorithms including MA (Moments Accountant), f-DP, and the Shuffle model; suggestions from interested users are very welcome!
Please consider citing the following papers:
[1] W. Yang et al., "Gain Without Pain: Offsetting DP-Injected Noises Stealthily in Cross-Device Federated Learning," in IEEE Internet of Things Journal, vol. 9, no. 22, pp. 22147-22157, 15 Nov. 2022, doi: 10.1109/JIOT.2021.3102030.
[2] M. Hu et al., "AutoFL: A Bayesian Game Approach for Autonomous Client Participation in Federated Edge Learning," in IEEE Transactions on Mobile Computing, doi: 10.1109/TMC.2022.3227014.
[3] Y. Zhou et al., "Optimizing the Numbers of Queries and Replies in Convex Federated Learning with Differential Privacy," in IEEE Transactions on Dependable and Secure Computing, doi: 10.1109/TDSC.2023.3234599.