Abstract
This paper takes the CSI 300 (沪深300) as the investment universe and selects 24 second-level indicators from four categories (volatility indicators, return indicators, classical technical indicators, and trading indicators) as evaluation factors. Four classification models are built: recursive feature elimination combined with Stacking ensemble learning (RFE_Stacking), and three traditional machine learning algorithms, namely random forest, support vector machine, and logistic regression. Each model predicts which stocks in the universe rank in the top 20% by weekly return, providing investors with a quantitative investment strategy. An empirical study compares the classification performance of the four models. The results show that RFE_Stacking performs best, with an AUC of 0.6447, an accuracy of 60.21%, a precision of 59.87%, a recall of 62.65%, and an F1-score of 61.23%. The RFE_Stacking model can therefore effectively select high-return stocks for investors and is a feasible machine-learning-based quantitative investment strategy.
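The pipeline described in the abstract (label the top 20% of stocks by weekly return, reduce the factor set with recursive feature elimination, then classify with a Stacking ensemble) can be illustrated with a minimal scikit-learn sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic data, the column names (`week`, `ret`, `f0`...`f23`), the choice of 10 retained factors, the base learners (random forest and SVM) with a logistic-regression meta-learner, and the chronological 30/10-week split.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the real panel (assumption): 40 weeks x 50 stocks,
# 24 factors, with weekly returns weakly driven by two of the factors.
rng = np.random.default_rng(0)
n_weeks, n_stocks, n_factors = 40, 50, 24
factor_cols = [f"f{i}" for i in range(n_factors)]
values = rng.normal(size=(n_weeks * n_stocks, n_factors))
panel = pd.DataFrame(values, columns=factor_cols)
panel["week"] = np.repeat(np.arange(n_weeks), n_stocks)
panel["ret"] = (0.5 * values[:, 0] - 0.3 * values[:, 1]
                + rng.normal(size=n_weeks * n_stocks))

# Label = 1 if the stock ranks in the top 20% by return within its week.
rank_pct = panel.groupby("week")["ret"].rank(pct=True, ascending=False)
panel["label"] = (rank_pct <= 0.20).astype(int)

# Chronological split (assumption): first 30 weeks train, last 10 weeks test.
train = panel[panel["week"] < 30]
test = panel[panel["week"] >= 30]
X_train, y_train = train[factor_cols], train["label"]
X_test, y_test = test[factor_cols], test["label"]

# Stacking ensemble: base learners feed a logistic-regression meta-learner.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("svm", SVC()),  # stacking falls back to decision_function here
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)

# RFE keeps 10 of the 24 factors (hypothetical count) before stacking.
model = make_pipeline(
    StandardScaler(),
    RFE(LogisticRegression(max_iter=1000), n_features_to_select=10),
    stack,
)
model.fit(X_train, y_train)

proba = model.predict_proba(X_test)[:, 1]
pred = model.predict(X_test)
auc = roc_auc_score(y_test, proba)
print("AUC:      ", round(auc, 4))
print("Accuracy: ", round(accuracy_score(y_test, pred), 4))
print("Precision:", round(precision_score(y_test, pred, zero_division=0), 4))
print("Recall:   ", round(recall_score(y_test, pred), 4))
print("F1:       ", round(f1_score(y_test, pred), 4))
```

The numbers this sketch prints come from synthetic data and say nothing about the paper's results. As a consistency check on the reported figures, the stated precision and recall do reproduce the stated F1-score: 2 × 0.5987 × 0.6265 / (0.5987 + 0.6265) ≈ 0.6123.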
Authors
HUANG Qiu-li (黄秋丽)
HUANG Zhu-xing (黄柱兴)
YANG Yan (杨燕)
School of Mathematics and Statistics, Nanning Normal University, Nanning 530100, China
Source
Journal of Nanning Normal University: Natural Science Edition (《南宁师范大学学报(自然科学版)》)
2021, No. 3, pp. 37-43 (7 pages)