39 articles found
An Initial Center Optimization Method for the K-means Algorithm under Non-IID Conditions (cited by 7)
1
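Authors: 潘品臣, 姜合, 吕奕锟. 《小型微型计算机系统》 (CSCD; Peking University Core), 2019, No. 6, pp. 1254-1259, 6 pages in total.
Traditional clustering research assumes that the objects and attributes of a dataset are independent and identically distributed. Real-world data, however, are often non-IID: attributes interact with one another to a greater or lesser degree. The traditional K-means algorithm selects its initial cluster centers at random, is sensitive to that choice, easily falls into local optima, and has low accuracy. The Min_max method improves on this weakness, but both the original and the improved K-means algorithms ignore the interactions among attributes. This paper therefore uses the Pearson correlation coefficient to compute the interactions among attributes and maps them onto the original dataset. It also optimizes the Min_max method using the dual-domain idea. Experimental results show that the method achieves higher accuracy, better clustering quality, and relatively fewer iterations.
Keywords: non-IID; K-means algorithm; initial cluster centers; Pearson correlation coefficient; dual-domain idea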
FedAdaSS: Federated Learning with Adaptive Parameter Server Selection Based on Elastic Cloud Resources
2
Authors: Yuwei Xu, Baokang Zhao, Huan Zhou, Jinshu Su. 《Computer Modeling in Engineering & Sciences》 (SCIE; EI), 2024, No. 10, pp. 609-629, 21 pages in total.
The rapid expansion of artificial intelligence (AI) applications has raised significant concerns about user privacy, prompting the development of privacy-preserving machine learning (ML) paradigms such as federated learning (FL). FL enables the distributed training of ML models, keeping data on local devices and thus addressing the privacy concerns of users. However, challenges arise from the heterogeneous nature of mobile client devices, partial engagement of training, and non-independent identically distributed (non-IID) data distribution, leading to performance degradation and optimization objective bias in FL training. With the development of 5G/6G networks and the integration of cloud computing and edge computing resources, globally distributed cloud computing resources can be effectively utilized to optimize the FL process. Selecting the parameter server through a selection mechanism does not increase the monetary cost and reduces the network latency overhead; it also balances the objectives of communication optimization and low engagement mitigation, which cannot be achieved simultaneously in the single-server framework of existing works. In this paper, we propose the FedAdaSS algorithm, an adaptive parameter server selection mechanism designed to optimize the training efficiency in each round of FL training by selecting the most appropriate server as the parameter server. Our approach leverages the flexibility of cloud resource computing power, and allows organizers to strategically select servers for data broadcasting and aggregation, thus improving training performance while maintaining cost efficiency. The FedAdaSS algorithm estimates the utility of client systems and servers and incorporates an adaptive random reshuffling strategy that selects the optimal server in each round of the training process. Theoretical analysis confirms the convergence of FedAdaSS under strong convexity and L-smooth assumptions, and comparative experiments within the FLSim framework demonstrate a reduction in training round-to-accuracy by ...
Keywords: machine learning systems; federated learning; server selection; artificial intelligence of things; non-IID data
A Privacy-Preserving Federated Learning Optimization Method for Decentralized Scenarios
3
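Authors: 侯泽超, 董建刚. 《计算机应用研究》 (CSCD; Peking University Core), 2024, No. 8, pp. 2419-2426, 8 pages in total.
Federated learning offers a new solution for collaborative learning across data silos. However, the non-independent and identically distributed (Non-IID) nature of federated nodes' local data, together with the centralized framework's lack of participant supervision, accountability, and privacy protection mechanisms, limits its large-scale application. To address these problems, a blockchain-based trusted slice aggregation strategy (BBTSA) and a federated attribution algorithm (FedAom) are proposed. FedAom introduces the idea of attribution: it obtains attributions with the integrated gradients method to locate the parameters that influence the model's decision behavior, considers parameter sensitivity at graded levels, and preserves and reinforces the key knowledge learned by the global model during local updates, making effective use of shared data and thereby alleviating the Non-IID problem. BBTSA builds a decentralized federated learning environment on a blockchain. Without a centralized third party, federated nodes exchange noise rather than weight or gradient parameters among participants and obfuscate parameters by slicing them over a cooperative tree structure, which protects node privacy. Validation on two datasets under different distribution conditions shows that FedAom significantly improves stability and convergence speed over baseline methods under most conditions, while BBTSA hides clients' private parameters and ensures full monitoring of the training process and privacy security without affecting accuracy.
Keywords: federated learning; blockchain; privacy protection; non-IID; integrated gradients; attribution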
An IoT Intrusion Detection Method for Heterogeneous Environments
4
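Authors: 刘静, 慕泽林, 赖英旭. 《通信学报》 (EI; CSCD; Peking University Core), 2024, No. 4, pp. 114-127, 14 pages in total.
To address the low training efficiency and poor model performance of IoT devices under resource constraints and non-independent and identically distributed (Non-IID) data, a personalized pruning federated learning framework is proposed for IoT intrusion detection. First, a structured pruning strategy based on channel importance scores is proposed; it generates sub-models by balancing model accuracy against complexity and distributes them to resource-constrained clients. Second, a heterogeneous model aggregation algorithm is proposed that takes a weighted average over channels using similarity-based weighting coefficients, effectively reducing the negative impact of Non-IID data on model aggregation. Finally, experiments on the network intrusion dataset BoT-IoT show that, compared with existing methods, the proposed method significantly reduces the time overhead of resource-constrained clients, increasing processing speed by 20.82%, and improves intrusion detection accuracy by 0.86% in Non-IID scenarios.
Keywords: federated learning; intrusion detection; model pruning; non-IID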
Federated learning on non-IID and long-tailed data via dual-decoupling
5
Authors: Zhaohui WANG, Hongjiao LI, Jinguo LI, Renhao HU, Baojin WANG. 《Frontiers of Information Technology & Electronic Engineering》 (SCIE; EI; CSCD), 2024, No. 5, pp. 728-741, 14 pages in total.
Federated learning (FL), a cutting-edge distributed machine learning training paradigm, aims to generate a global model by collaborating on the training of client models without revealing local private data. The co-occurrence of non-independent and identically distributed (non-IID) and long-tailed distribution in FL is one challenge that substantially degrades aggregate performance. In this paper, we present a corresponding solution called federated dual-decoupling via model and logit calibration (FedDDC) for non-IID and long-tailed distributions. The model is characterized by three aspects. First, we decouple the global model into the feature extractor and the classifier to fine-tune the components affected by the joint problem. For the biased feature extractor, we propose a client confidence re-weighting scheme to assist calibration, which assigns optimal weights to each client. For the biased classifier, we apply the classifier re-balancing method for fine-tuning. Then, we calibrate and integrate the client confidence re-weighted logits with the re-balanced logits to obtain the unbiased logits. Finally, we use decoupled knowledge distillation for the first time in the joint problem to enhance the accuracy of the global model by extracting the knowledge of the unbiased model. Numerous experiments demonstrate that on non-IID and long-tailed data in FL, our approach outperforms state-of-the-art methods.
Keywords: federated learning; non-IID; long-tailed data; decoupling learning; knowledge distillation
Ada-FFL: Adaptive computing fairness federated learning
6
Authors: Yue Cong, Jing Qiu, Kun Zhang, Zhongyang Fang, Chengliang Gao, Shen Su, Zhihong Tian. 《CAAI Transactions on Intelligence Technology》 (SCIE; EI), 2024, No. 3, pp. 573-584, 12 pages in total.
As the scale of federated learning expands, solving the Non-IID data problem of federated learning has become a key challenge of interest. Most existing solutions aim to improve the overall performance of all clients; however, the overall performance improvement often sacrifices the performance of certain clients, such as clients with less data. Ignoring fairness may greatly reduce the willingness of some clients to participate in federated learning. In order to solve the above problem, the authors propose Ada-FFL, an adaptive fairness federated aggregation learning algorithm, which can dynamically adjust the fairness coefficient according to the update of the local models, ensuring the convergence performance of the global model and the fairness between federated learning clients. By integrating coarse-grained and fine-grained equity solutions, the authors evaluate the deviation of local models by considering both global equity and individual equity; the weight ratio is then dynamically allocated for each client based on the evaluated deviation value, which ensures that the update differences of local models are fully considered in each round of training. Finally, by combining a regularisation term to limit the local model update to be closer to the global model, the sensitivity of the model to input perturbations can be reduced, and the generalisation ability of the global model can be improved. Through numerous experiments on several federal data sets, the authors show that their method has more advantages in convergence effect and fairness than the existing baselines.
Keywords: adaptive fairness aggregation; fairness; federated learning; non-IID
Efficient Federated Learning: A Norm-Weighted Aggregation Algorithm
7
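Authors: 陈攀, 张恒汝, 闵帆. 《计算机应用研究》 (CSCD; Peking University Core), 2024, No. 3, pp. 694-699, 6 pages in total.
In federated learning, non-independent and identically distributed (non-IID) data across clients slow the convergence of the global model and significantly increase communication cost. Existing methods determine the aggregation weights of local models by collecting clients' label distribution information in order to speed up convergence, but this may leak client privacy. To address the slower convergence caused by non-IID data without leaking client privacy, the FedNA aggregation algorithm is proposed. It achieves this goal in two ways. First, FedNA assigns aggregation weights according to the L1 norm of the class-weight updates of the local models, so as to preserve the contributions of the local models. Second, FedNA sets the class-weight updates that correspond to a client's missing classes to 0, to mitigate the effect of missing classes on aggregation. Experiments simulate four different data distributions on two datasets. The results show that, compared with FedAvg, FedNA reduces the number of iterations needed to reach a stable state by up to 890, cutting communication overhead by 44.5%. FedNA accelerates the convergence of the global model and reduces communication cost while protecting client privacy, and it suits scenarios that require user privacy protection and are sensitive to communication efficiency.
Keywords: federated learning; communication cost; privacy protection; non-IID; aggregation; weight update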
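The two steps named in the abstract above (zeroing class-weight updates for missing classes, then weighting clients by the L1 norm of their class-weight updates) can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `norm_weighted_aggregate`, the array layout (one row per class), and the choice to normalize weights as each norm divided by the sum of norms are all assumptions made for illustration.

```python
import numpy as np

def norm_weighted_aggregate(updates, present_classes):
    """Aggregate per-client class-weight updates (hypothetical helper).

    updates: list of (num_classes, dim) arrays, one per client.
    present_classes: list of sets with the class ids each client holds.
    """
    masked = []
    for upd, present in zip(updates, present_classes):
        upd = np.array(upd, dtype=float)  # copy so the input is untouched
        mask = np.array([c in present for c in range(upd.shape[0])])
        upd[~mask] = 0.0  # zero the rows of classes the client is missing
        masked.append(upd)
    # L1 norm of each client's masked update
    norms = np.array([np.abs(u).sum() for u in masked])
    total = norms.sum()
    if total == 0.0:
        raise ValueError("all updates are zero; weights are undefined")
    weights = norms / total  # assumed normalization: proportional to L1 norm
    aggregated = sum(w * u for w, u in zip(weights, masked))
    return weights, aggregated
```

With two clients, where the second holds only class 0, the second client's row for class 1 is ignored and its remaining update determines its weight.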
Clustered Federated Learning Based on Logistic-Optimized Robustness
8
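Authors: 施玉倩, 巫朝霞. 《软件工程》, 2024, No. 6, pp. 15-20, 6 pages in total.
To address the drop in model accuracy caused by data heterogeneity in federated learning, a Logistic-based More Robust Clustered Federated Learning (LMRCFL) method is proposed. It groups clients with similar data distributions into the same cluster without accessing their private data, so that a model can be trained for each client cluster. A regularization term is introduced into the objective function to update the local loss function, which alleviates the client drift caused by Non-IID (non-independent and identically distributed) data and improves model accuracy by reducing model divergence. In comparisons with other federated learning algorithms on the CIFAR-10, Fashion-MNIST, and SVHN datasets, LMRCFL improves accuracy under Non-IID data distributions by 8.13 to 33.20 percentage points and is robust.
Keywords: federated learning; data heterogeneity; clustering; non-IID; regularization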
Byzantine-Robust Federated Learning for Non-IID Data (cited by 2)
9
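Authors: 马鑫迪, 李清华, 姜奇, 马卓, 高胜, 田有亮, 马建峰. 《通信学报》 (EI; CSCD; Peking University Core), 2023, No. 6, pp. 138-153, 16 pages in total.
This paper studies malicious attacks by Byzantine nodes in federated learning with non-IID data and proposes a privacy-preserving robust gradient aggregation algorithm. The algorithm designs a reference gradient to identify "poor-quality" shared gradients during model training, and uses reputation evaluation to reduce the effect of heterogeneous data distributions on the identification of Byzantine nodes. It also combines homomorphic encryption with random-noise obfuscation to protect user privacy during model training and Byzantine node identification. Finally, simulation tests on real datasets show that the proposed algorithm can accurately and efficiently identify Byzantine attack nodes while protecting user privacy, with good convergence and robustness.
Keywords: federated learning; Byzantine attack; non-IID; privacy protection; homomorphic encryption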
Blockchain-Enabled Federated Learning for Privacy-Preserving Non-IID Data Sharing in Industrial Internet
10
Authors: Qiuyan Wang, Haibing Dong, Yongfei Huang, Zenglei Liu, Yundong Gou. 《Computers, Materials & Continua》 (SCIE; EI), 2024, No. 8, pp. 1967-1983, 17 pages in total.
Sharing data while protecting privacy in the industrial Internet is a significant challenge. Traditional machine learning methods require a combination of all data for training; however, this approach can be limited by data availability and privacy concerns. Federated learning (FL) has gained considerable attention because it allows for decentralized training on multiple local datasets. However, the training data collected by data providers are often non-independent and identically distributed (non-IID), resulting in poor FL performance. This paper proposes a privacy-preserving approach for sharing non-IID data in the industrial Internet using an FL approach based on blockchain technology. To overcome the problem of non-IID data leading to poor training accuracy, we propose dynamically updating the local model based on the divergence of the global and local models. This approach can significantly improve the accuracy of FL training when there is relatively large dispersion. In addition, we design a dynamic gradient clipping algorithm to alleviate the influence of noise on the model accuracy to reduce potential privacy leakage caused by sharing model parameters. Finally, we evaluate the performance of the proposed scheme using commonly used open-source image datasets. The simulation results demonstrate that our method can significantly enhance the accuracy while protecting privacy and maintaining efficiency, thereby providing a new solution to data-sharing and privacy-protection challenges in the industrial Internet.
Keywords: federated learning; data sharing; non-IID data; differential privacy; blockchain
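Node Grouping and Time-Sharing Scheduling Strategy for Asynchronous Federated Learning in Heterogeneous Edge Computing Environments (cited by 1)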
11
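Authors: 马千飘, 贾庆民, 刘建春, 徐宏力, 谢人超, 黄韬. 《通信学报》 (EI; CSCD; Peking University Core), 2023, No. 11, pp. 79-93, 15 pages in total.
To overcome three key challenges of federated learning in heterogeneous edge computing environments, namely edge heterogeneity, non-IID data, and communication resource constraints, a grouping asynchronous federated learning (FedGA) mechanism is proposed. Edge nodes are divided into multiple groups; each group updates the global model by aggregating with it asynchronously, and the nodes within a group communicate with the parameter server in a time-sharing manner. Theoretical analysis establishes a quantitative relationship between the convergence bound of FedGA and the data distribution among groups. For communication among the nodes within a group, a time-sharing scheduling strategy called the magic mirror method (MMM) is proposed to optimize the completion time of a single round of model updating. Based on the theoretical analysis of FedGA and on MMM, an effective grouping algorithm is designed to minimize the completion time of overall training. Experimental results show that FedGA and MMM reduce model training time by 30.1% to 87.4% compared with existing state-of-the-art methods.
Keywords: edge computing; federated learning; non-IID data; heterogeneity; convergence analysis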
A Federated Learning Data Augmentation Scheme for Non-IID Data (cited by 1)
12
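Authors: 汤凌韬, 王迪, 刘盛云. 《通信学报》 (EI; CSCD; Peking University Core), 2023, No. 1, pp. 164-176, 13 pages in total.
To address the unsatisfactory model accuracy caused by non-independent and identically distributed (non-IID) data across federated learning nodes, a privacy-preserving data augmentation scheme is proposed. First, a data augmentation framework for federated learning is proposed in which participating nodes generate virtual samples locally and share them among nodes, effectively mitigating the model drift caused by differences in data distribution during training. Second, a privacy-preserving sample generation algorithm is designed based on generative adversarial networks and differential privacy; it produces usable virtual samples while preserving the privacy of the original data. Finally, a privacy-preserving label selection algorithm is proposed to ensure that the labels of the virtual samples also satisfy differential privacy. Simulation results show that, under a variety of non-IID data partitioning strategies, the proposed scheme effectively improves model accuracy and accelerates model convergence; compared with baseline methods, it achieves an accuracy gain of more than 25% in extreme non-IID scenarios.
Keywords: federated learning; non-IID; generative adversarial network; differential privacy; data augmentation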
Non-IID Recommender Systems: A Review and Framework of Recommendation Paradigm Shifting (cited by 1)
13
Author: Longbing Cao. 《工程(英文)》, 2016, No. 2, pp. 212-224, 229-243, 28 pages in total.
A Client Selection Method Based on Loss Function Optimization for Federated Learning
14
Authors: Yan Zeng, Siyuan Teng, Tian Xiang, Jilin Zhang, Yuankai Mu, Yongjian Ren, Jian Wan. 《Computer Modeling in Engineering & Sciences》 (SCIE; EI), 2023, No. 10, pp. 1047-1064, 18 pages in total.
Federated learning is a distributed machine learning method that can solve the increasingly serious problem of data islands and user data privacy, as it allows training data to be kept locally and not shared with other users. It trains a global model by aggregating locally-computed models of clients rather than their raw data. However, the divergence of local models caused by data heterogeneity of different clients may lead to slow convergence of the global model. For this problem, we focus on client selection in federated learning, which can affect the convergence performance of the global model through the selected local models. We propose FedChoice, a client selection method based on loss function optimization, to select appropriate local models to improve the convergence of the global model. It first sets a selection probability for each client according to the value of its loss function; a client with high loss is given a higher selection probability, which makes it more likely to participate in training. Then, it introduces a local control vector and a global control vector to predict the local gradient direction and global gradient direction, respectively, and calculates the gradient correction vector to correct the gradient direction to reduce the cumulative deviation of the local gradient caused by the Non-IID data. We conduct experiments to verify the validity of FedChoice on the CIFAR-10, CINIC-10, MNIST, EMNIST, and FEMNIST datasets, and the results show that the convergence of FedChoice is significantly improved, compared with FedAvg, FedProx, and FedNova.
Keywords: federated learning; model aggregation; non-IID
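The first step described in the abstract above, giving higher-loss clients a higher chance of being selected, can be sketched as follows. The abstract does not state the exact mapping from loss to probability, so the proportional rule used here, the function name `loss_based_selection`, and the fixed `seed` parameter are assumptions for illustration only; the control-vector gradient correction is not covered.

```python
import numpy as np

def loss_based_selection(losses, num_selected, seed=0):
    """Sample clients with probability proportional to their loss (assumed rule)."""
    losses = np.asarray(losses, dtype=float)
    if np.any(losses <= 0):
        raise ValueError("this sketch assumes strictly positive losses")
    probs = losses / losses.sum()  # higher loss -> higher selection probability
    rng = np.random.default_rng(seed)
    # Draw distinct client indices according to the selection probabilities
    chosen = rng.choice(len(losses), size=num_selected, replace=False, p=probs)
    return probs, chosen
```

Sampling without replacement keeps each round's participants distinct; a deployment would typically vary the seed per round.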
Accelerating local SGD for non-IID data using variance reduction
15
Authors: Xianfeng LIANG, Shuheng SHEN, Enhong CHEN, Jinchang LIU, Qi LIU, Yifei CHENG, Zhen PAN. 《Frontiers of Computer Science》 (SCIE; EI; CSCD), 2023, No. 2, pp. 73-89, 17 pages in total.
Distributed stochastic gradient descent and its variants, which apply multiple workers in parallel, have been widely adopted in the training of machine learning models. Among them, local-based algorithms, including Local SGD and FedAvg, have gained much attention due to their superior properties, such as low communication cost and privacy preservation. Nevertheless, when the data distribution on workers is non-identical, local-based algorithms would encounter a significant degradation in the convergence rate. In this paper, we propose Variance Reduced Local SGD (VRL-SGD) to deal with the heterogeneous data. Without extra communication cost, VRL-SGD can reduce the gradient variance among workers caused by the heterogeneous data, and thus it prevents local-based algorithms from slow convergence rate. Moreover, we present VRL-SGD-W with an effective warm-up mechanism for the scenarios where the data among workers are quite diverse. Benefiting from eliminating the impact of such heterogeneous data, we theoretically prove that VRL-SGD achieves a linear iteration speedup with lower communication complexity even if workers access non-identical datasets. We conduct experiments on three machine learning tasks. The experimental results demonstrate that VRL-SGD performs impressively better than Local SGD for the heterogeneous data and VRL-SGD-W is much more robust under high data variance among workers.
Keywords: distributed optimization; variance reduction; local SGD; federated learning; non-IID data
Facing small and biased data dilemma in drug discovery with enhanced federated learning approaches (cited by 2)
16
Authors: Zhaoping Xiong, Ziqiang Cheng, Xinyuan Lin, Chi Xu, Xiaohong Liu, Dingyan Wang, Xiaomin Luo, Yong Zhang, Hualiang Jiang, Nan Qiao, Mingyue Zheng. 《Science China(Life Sciences)》 (SCIE; CAS; CSCD), 2022, No. 3, pp. 529-539, 11 pages in total.
Artificial intelligence (AI) models usually require large amounts of high-quality training data, which is in striking contrast to the situation of small and biased data faced by current drug discovery pipelines. The concept of federated learning has been proposed to utilize distributed data from different sources without leaking sensitive information of the data. This emerging decentralized machine learning paradigm is expected to dramatically improve the success rate of AI-powered drug discovery. Here, we simulated the federated learning process with different property and activity datasets from different sources, among which overlapping molecules with high or low biases exist in the recorded values. Beyond the benefit of gaining more data, we also demonstrated that federated training has a regularization effect superior to centralized training on the pooled datasets with high biases. Moreover, different network architectures for clients and aggregation algorithms for coordinators have been compared on the performance of federated learning, where personalized federated learning shows promising results. Our work demonstrates the applicability of federated learning in predicting drug-related properties and highlights its promising role in addressing the small and biased data dilemma in drug discovery.
Keywords: federated learning; drug discovery; FedAMP; non-IID data
Study on CA-CFAR Algorithm Based on Normalization Processing of Background Noise for HI of Optical Fiber (cited by 1)
17
Authors: Yanping WANG, Dandan QU, Chao ZHAO, Dan YANG. 《Photonic Sensors》 (SCIE; EI; CAS; CSCD), 2018, No. 4, pp. 341-350, 10 pages in total.
Optical fiber pre-warning system (OFPS) is often used to monitor the occurrence of disasters such as the leakage of oil and natural gas pipelines. It analyzes the collected vibration signals to judge whether any harmful intrusion (HI) event has occurred. At present, the research in this field is mainly focused on the constant false alarm rate (CFAR) methods and derivative algorithms to detect intrusion signals. However, the performance of CFAR is often limited by the distribution of the actual collected signals. Statistical analysis of the acquired signals shows that the background noise is usually non-independent and identically distributed (Non-IID). In view of the actual signal distribution characteristics, this paper presents a CFAR detection method based on normalization processing of the background noise. A high-pass filter is designed for the actual Non-IID background noise data to obtain its characteristics. Then, the background noise is converted to an independent and identically distributed (IID) form by using these characteristics. Next, the normalized data are processed with the efficient cell-averaging constant false alarm rate (CA-CFAR) method for detection. Finally, the experimental results show that the intrusion signals can be effectively detected, and the effectiveness of the algorithm is verified.
Keywords: OFPS; HI; CA-CFAR; normalization; non-IID
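For readers unfamiliar with the detector named in the abstract above, the following is a minimal sketch of textbook one-dimensional cell-averaging CFAR, not the paper's method. It assumes the input is a power sequence whose noise is already IID and exponentially distributed (so the paper's high-pass filtering and normalization step is not reproduced), and the function name `ca_cfar` and its default parameters are hypothetical.

```python
import numpy as np

def ca_cfar(power, num_train=8, num_guard=2, pfa=1e-3):
    """Flag cells whose power exceeds a scaled average of nearby training cells."""
    power = np.asarray(power, dtype=float)
    n = len(power)
    total_train = 2 * num_train
    # Standard CA-CFAR scaling factor for exponentially distributed noise power
    alpha = total_train * (pfa ** (-1.0 / total_train) - 1.0)
    half = num_train + num_guard
    detections = np.zeros(n, dtype=bool)
    for i in range(half, n - half):
        # Training cells on each side, skipping the guard cells next to cell i
        lead = power[i - half : i - num_guard]
        lag = power[i + num_guard + 1 : i + half + 1]
        noise_level = (lead.sum() + lag.sum()) / total_train
        detections[i] = power[i] > alpha * noise_level
    return detections
```

Cells closer than `num_train + num_guard` to either end are never tested in this sketch, since they lack a full training window.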
EMFedAvg: A Federated Averaging Algorithm Based on EMD
18
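Authors: 周旭华, 丛悦, 李鉴明, 仇计清. 《广州大学学报(自然科学版)》 (CAS), 2020, No. 4, pp. 11-20, 10 pages in total.
Information technology brings convenience to people's lives but can also leak personal privacy. Federated learning is a machine learning technique that can protect data privacy. Unlike existing machine learning methods, federated learning keeps data local to each participant and usually faces the problem of non-independent and identically distributed (non-IID) data, so existing machine learning methods perform much worse on the non-IID problem in federated learning. To address this problem, the paper improves on the federated averaging algorithm: it partitions the MNIST dataset in a non-IID manner and distributes it to the participants, computes the Earth Mover's Distance (EMD) of each participant's data, and, taking the interquartile range as the upper bound, actively removes participants whose EMD is too large in order to safeguard the overall performance of the federation. Experimental results show that the method improves accuracy by about 5% over the federated averaging algorithm, reduces the communication overhead of federated training, and improves overall efficiency. Introducing the EMD also provides a metric for measuring each participant's contribution.
Keywords: federated learning; non-IID; EMD; federated averaging algorithm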
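The filtering step in the abstract above (score each participant by a distance between its data distribution and a reference, then drop participants beyond an interquartile-range bound) might look like the sketch below. Several details are assumptions, since the abstract does not specify them: the distance is taken as the L1 gap between a client's label distribution and a global one, used here as a simple stand-in for EMD over labels, and the bound is taken as the common upper fence Q3 + 1.5 × IQR. The names `label_distance` and `keep_within_fence` are hypothetical.

```python
import numpy as np

def label_distance(label_counts, global_dist):
    """L1 gap between a client's label distribution and a global one (EMD stand-in)."""
    counts = np.asarray(label_counts, dtype=float)
    client_dist = counts / counts.sum()
    return float(np.abs(client_dist - np.asarray(global_dist, dtype=float)).sum())

def keep_within_fence(distances, k=1.5):
    """Return indices of clients whose distance is at most Q3 + k * IQR (assumed bound)."""
    d = np.asarray(distances, dtype=float)
    q1, q3 = np.percentile(d, [25, 75])
    fence = q3 + k * (q3 - q1)
    return [i for i, v in enumerate(d) if v <= fence]
```

The surviving indices would then be the only participants whose models enter the usual federated averaging step.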
ADC-DL:Communication-Efficient Distributed Learning with Hierarchical Clustering and Adaptive Dataset Condensation
19
Authors: Zhipeng Gao, Yan Yang, Chen Zhao, Zijia Mo. 《China Communications》 (SCIE; CSCD), 2022, No. 12, pp. 73-85, 13 pages in total.
The rapid growth of modern mobile devices leads to a large amount of distributed data, which is extremely valuable for learning models. Unfortunately, model training by collecting all these original data on a centralized cloud server is not applicable due to data privacy and communication cost concerns, hindering artificial intelligence from empowering mobile devices. Moreover, these data are not independent and identically distributed (Non-IID) because of their different contexts, which will deteriorate the performance of the model. To address these issues, we propose a novel Distributed Learning algorithm based on hierarchical clustering and Adaptive Dataset Condensation, named ADC-DL, which learns a shared model by collecting the synthetic samples generated on each device. To tackle the heterogeneity of data distribution, we propose an entropy-TOPSIS comprehensive tiering model for hierarchical clustering, which distinguishes clients in terms of their data characteristics. Subsequently, synthetic dummy samples are generated based on the hierarchical structure utilizing adaptive dataset condensation. The procedure of dataset condensation can be adjusted adaptively according to the tier of the client. Extensive experiments demonstrate that ADC-DL outperforms existing algorithms in prediction accuracy and communication cost.
Keywords: distributed learning; non-IID data partition; hierarchical clustering; adaptive dataset condensation
A Survey of Federated Learning on Non-IID Data
20
Authors: HAN Xuming, GAO Minghan, WANG Limin, HE Zaobo, WANG Yanze. 《ZTE Communications》, 2022, No. 3, pp. 17-26, 10 pages in total.
Federated learning (FL) is a machine learning paradigm for data silos and privacy protection, which aims to organize multiple clients for training global machine learning models without exposing data to all parties. However, when dealing with non-independently identically distributed (non-IID) client data, FL cannot obtain more satisfactory results than centrally trained machine learning and even fails to match the accuracy of the local model obtained by client training alone. To analyze and address the above issues, we survey the state-of-the-art methods in the literature related to FL on non-IID data. On this basis, a motivation-based taxonomy, which classifies these methods into two categories, including heterogeneity reducing strategies and adaptability enhancing strategies, is proposed. Moreover, the core ideas and main challenges of these methods are analyzed. Finally, we envision several promising research directions that have not been thoroughly studied, in hope of promoting research in related fields to a certain extent.
Keywords: data heterogeneity; federated learning; non-IID data