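To address the narrow identification range and low identification rate of current abnormal-data identification for electric energy metering devices, a method based on the Logistic algorithm is proposed. Identification features of abnormal data are extracted, and a multi-objective approach is adopted to widen the identification range. A singular value decomposition (SVD) plus multi-objective fast identification matrix is designed, a Logistic-based identification model for abnormal data from electric energy metering devices is constructed, and multi-interval boundary correction is applied to carry out the identification. Test results show that the identification rate of the proposed method exceeds 6 Mb/s in all cases, indicating that the method identifies abnormal data more effectively, keeps the error controllable, and has good practical application value.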
Many factors influence personal credit. We introduce the Lasso technique to personal credit evaluation and establish Lasso-logistic, Lasso-SVM and Group lasso-logistic models, in which variable selection and parameter estimation are carried out simultaneously. Experiments on a personal credit data set from a lending platform show that, compared with the full-variable Logistic model and the stepwise Logistic model, the Group lasso-logistic model has the strongest variable selection ability, followed by Lasso-logistic and then Lasso-SVM. All three models based on Lasso variable selection filter variables better than stepwise selection. The Group lasso-logistic model can also eliminate or retain related dummy variables as a group, which makes the model easier to interpret. In terms of prediction accuracy, Lasso-SVM has the highest accuracy for default users on the training set, while Group lasso-logistic has the best classification accuracy for default users on the test set. On both the training set and the test set, the Lasso-logistic model has the best classification accuracy for non-default users. Models based on Lasso variable selection can also better screen out the key factors influencing personal credit risk.
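How an L1 penalty performs variable selection and parameter estimation in one step can be illustrated with a minimal sketch. It fits an L1-penalised logistic regression by proximal gradient descent with soft-thresholding. The abstract does not say how the models were fitted, so the solver, the function names, the hyperparameters (`lam`, `lr`, `n_iter`) and the toy data below are all assumptions made for illustration, not the authors' procedure.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(value, amount):
    # Proximal operator of the L1 norm: shrink toward zero, clip at zero.
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0

def fit_lasso_logistic(X, y, lam=0.1, lr=0.1, n_iter=500):
    """Return (intercept, weights) for mean log-loss + lam * ||w||_1."""
    n, d = len(X), len(X[0])
    b, w = 0.0, [0.0] * d
    for _ in range(n_iter):
        # Residuals p_i - y_i under the current parameters.
        residuals = [
            sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
            for row, yi in zip(X, y)
        ]
        grad_b = sum(residuals) / n
        grad_w = [sum(r * row[j] for r, row in zip(residuals, X)) / n
                  for j in range(d)]
        b -= lr * grad_b  # the intercept is not penalised
        # Gradient step followed by soft-thresholding (the Lasso step).
        w = [soft_threshold(wj - lr * gj, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return b, w

# Hypothetical toy data: feature 0 determines the label,
# feature 1 is noise balanced across both classes.
X = [[-2, 1], [-2, -1], [-1, 1], [-1, -1],
     [1, 1], [1, -1], [2, 1], [2, -1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
intercept, weights = fit_lasso_logistic(X, y)
```

In this toy setting the weight of the uninformative feature stays at exactly zero while the informative feature keeps a positive weight, which is the selection behaviour the abstract attributes to Lasso-based models. A group lasso would instead shrink whole blocks of coefficients (for example, all dummy variables of one categorical factor) together.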
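Objective: To evaluate how well the logistic regression and random forest algorithms predict glycemic control in patients with type 2 diabetes after 3 months, and to explore the factors influencing glycemic control. Methods: Baseline survey and follow-up information was collected from patients with type 2 diabetes in Shunyi and Tongzhou districts. Whether glycated hemoglobin exceeded 6.5% after 3 months was used as the binary outcome variable. Prediction models were built with the random forest and logistic regression algorithms and compared using the area under the receiver operating characteristic curve (AUC), sensitivity and other metrics. Results: Seven factors influenced glycemic control: baseline fasting blood glucose (P<0.001), disease duration (P<0.001), smoking (P=0.026), sedentary time (P=0.006), body mass index (overweight P=0.002, obese P=0.011), wristband use (P=0.028) and diabetic diet (P=0.002). The logistic regression model had an AUC of 0.738, a sensitivity of 72.9%, a specificity of 68.1% and an accuracy of 71.2%; the random forest model had an AUC of 0.756, a sensitivity of 74.5%, a specificity of 69.5% and an accuracy of 72.8%. Conclusion: The random forest algorithm predicts better than the logistic regression model and can be applied to predicting glycemic control, assisting in the management of patients with diabetes.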
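The four metrics used to compare the two models can be computed as in the sketch below. The function names, the 0.5 decision threshold and the example labels and scores are hypothetical and are not taken from the study; AUC is computed here as the fraction of (positive, negative) pairs ranked correctly, with ties counted as one half.

```python
def confusion_counts(y_true, y_pred):
    # Count true/false positives and negatives for binary labels 0/1.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def sensitivity_specificity_accuracy(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)  # share of positives correctly flagged
    specificity = tn / (tn + fp)  # share of negatives correctly cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy

def auc(y_true, scores):
    # Pairwise (Mann-Whitney) form of the area under the ROC curve.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = outcome present) and predicted probabilities.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec, acc = sensitivity_specificity_accuracy(y_true, y_pred)
area = auc(y_true, scores)
```

Sensitivity, specificity and accuracy depend on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is why the abstract reports them side by side.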
This research introduces a novel approach to improving the prediction of consumer purchase behavior on e-commerce platforms. The study introduces the fundamental concepts of the logistic regression algorithm and analyzes user data obtained from an e-commerce platform. The original data were preprocessed, and a consumer purchase prediction model was developed using logistic regression. For comparison, a classic random forest model was used, enhanced with K-fold cross-validation. Classification performance was evaluated using accuracy, precision, recall and the F1 score, and a visual examination was used to assess the significance of the findings. The findings suggest that using the logistic regression algorithm to forecast customer purchase behavior on e-commerce platforms can improve the efficacy of the approach and yield more accurate predictions. The study serves as a resource for improving the precision of such forecasts and has practical implications for optimizing the operational efficiency of e-commerce platforms.
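The evaluation protocol mentioned above (K-fold cross-validation scored with precision, recall and F1) can be sketched as follows. The helper names and the example labels are hypothetical; the folds here are contiguous and unshuffled for simplicity, whereas a real experiment would normally shuffle or stratify the data first.

```python
def kfold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over range(n)."""
    # The first n % k folds get one extra sample so all samples are used.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

def precision_recall_f1(y_true, y_pred):
    # Metrics for the positive class (label 1).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical example: 5 samples split into 2 folds.
folds = list(kfold_indices(5, 2))

# Hypothetical true labels and predictions (1 = purchase).
precision, recall, f1 = precision_recall_f1([1, 0, 1, 1, 0], [1, 1, 1, 0, 0])
```

In a full run, a model would be trained on each `train` index list, scored on the matching `test` list, and the per-fold metrics averaged.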