
Analysis and prediction of GitHub company influence based on machine learning
Abstract A company's influence bears not only on its industry competitiveness but also on its public reputation and future development. However, there is no unified standard for evaluating it. GitHub is a representative open-source platform for hosting software development code repositories. Existing research typically measures a company's influence by the total number of stars received by the projects it publishes on GitHub, but this measure struggles to capture the potential of small, micro, and newly founded companies. This paper introduces the h-index, a measure of scientists' influence, models the company network using GitHub as the information source, and extracts features from this network to build a classifier that predicts a company's future influence level. On this basis, the SHAP model explanation technique is applied to identify the important features that determine company influence. Experimental results show that the XGBoost-based model achieves an accuracy of 0.92 and an average AUC of 0.93 on a real-world GitHub dataset, and can therefore predict company influence accurately and reliably.
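The abstract does not spell out how the h-index is adapted to companies. As a minimal sketch, assuming the standard definition is applied to the star counts of a company's GitHub repositories (a company has index h if h of its repositories have at least h stars each), the contrast with a raw star total might look like the following. The function name `h_index` and both star lists are hypothetical and are not taken from the paper.

```python
from typing import Iterable


def h_index(star_counts: Iterable[int]) -> int:
    """Return the largest h such that at least h items have a count >= h."""
    ranked = sorted(star_counts, reverse=True)
    h = 0
    for rank, stars in enumerate(ranked, start=1):
        if stars >= rank:
            h = rank  # the top `rank` repositories all have >= rank stars
        else:
            break
    return h


# Hypothetical per-repository star counts (illustrative only).
one_hit_company = [5000, 12, 3, 1]          # large total from a single project
steady_company = [40, 35, 30, 28, 25]       # small total, spread evenly

print(sum(one_hit_company), h_index(one_hit_company))
print(sum(steady_company), h_index(steady_company))
```

Under this assumed definition, a company whose stars come from one popular project ranks far higher by total stars but lower by h-index than a company with several moderately starred projects, which is the kind of case the abstract says a star total handles poorly.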
Authors WANG Mingyu; GONG Qingyuan; QU Jingjing; WANG Xin (School of Computer Science, Fudan University, Shanghai 200438, China; Research Institute of Intelligent Complex Systems, Fudan University, Shanghai 200438, China; Shanghai Artificial Intelligent Laboratory, Shanghai 201210, China)
Source Chinese Journal of Intelligent Science and Technology (智能科学与技术学报), CSCD, 2023, No. 3, pp. 330-342 (13 pages)
Funding National Natural Science Foundation of China (No. 62102094).
Keywords online developer community; social network; machine learning; SHAP