Abstract: Causal inference is a powerful modeling tool for explanatory analysis, which might enable current machine learning to become explainable. How to marry causal inference with machine learning to develop explainable artificial intelligence (XAI) algorithms is one of the key steps toward artificial intelligence 2.0. With the aim of bringing knowledge of causal inference to scholars of machine learning and artificial intelligence, we invited researchers working on causal inference to write this survey from different aspects of causal inference. This survey includes the following sections: “Estimating average treatment effect: A brief review and beyond” from Dr. Kun Kuang, “Attribution problems in counterfactual inference” from Prof. Lian Li, “The Yule–Simpson paradox and the surrogate paradox” from Prof. Zhi Geng, “Causal potential theory” from Prof. Lei Xu, “Discovering causal information from observational data” from Prof. Kun Zhang, “Formal argumentation in causal reasoning and explanation” from Profs. Beishui Liao and Huaxin Huang, “Causal inference with complex experiments” from Prof. Peng Ding, “Instrumental variables and negative controls for observational studies” from Prof. Wang Miao, and “Causal inference with interference” from Dr. Zhichao Jiang.
Abstract: Propensity score (PS) adjustment can control confounding effects and reduce bias when estimating treatment effects in non-randomized trials or observational studies. PS methods are increasingly used to estimate causal effects, including when the sample size is small compared to the number of confounders. With numerous confounders, quasi-complete separation can easily occur in the logistic regression used for estimating the PS, but this has not been addressed. We focused on a Bayesian PS method to address the limitations of quasi-complete separation faced by small trials. Bayesian methods are useful because they estimate the PS and causal effects simultaneously while considering the uncertainty of the PS by modelling it as a latent variable. In this study, we conducted simulations to evaluate the performance of Bayesian simultaneous PS estimation by considering the specification of prior distributions for model comparison. We propose a method to improve predictive performance with discrete outcomes in small trials. We found that the specification of prior distributions assigned to logistic regression coefficients was more important in the second step than in the first step, even when there was quasi-complete separation in the first step. Assigning Cauchy(0, 2.5) priors to the coefficients improved both the predictive performance for estimating causal effects and the balancing properties of the confounders.
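The role of the Cauchy(0, 2.5) prior under separation can be illustrated with a minimal sketch. This is not the Bayesian simultaneous estimation procedure evaluated in the study: it assumes a single coefficient with no intercept, a tiny completely separated toy dataset, and hypothetical function names. It only shows that the logistic log-likelihood keeps increasing as the coefficient grows, while adding the Cauchy log-prior makes very large coefficients less plausible.

```python
import math

def log_sigmoid(z):
    # Numerically stable log(1 / (1 + exp(-z))).
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def log_likelihood(beta, xs, ts):
    # Logistic regression log-likelihood with one coefficient and no intercept
    # (a simplifying assumption for illustration).
    total = 0.0
    for x, t in zip(xs, ts):
        eta = beta * x
        total += log_sigmoid(eta) if t == 1 else log_sigmoid(-eta)
    return total

def cauchy_log_prior(beta, loc=0.0, scale=2.5):
    # Log-density of a Cauchy(loc, scale) distribution evaluated at beta.
    z = (beta - loc) / scale
    return -math.log(math.pi * scale * (1.0 + z * z))

def log_posterior(beta, xs, ts):
    # Unnormalized log-posterior: log-likelihood plus log-prior.
    return log_likelihood(beta, xs, ts) + cauchy_log_prior(beta)

# Toy data (hypothetical): treatment is perfectly predicted by the sign of x.
xs = [-2.0, -1.0, 1.0, 2.0]
ts = [0, 0, 1, 1]

for beta in (1.0, 5.0, 100.0):
    print(beta, log_likelihood(beta, xs, ts), log_posterior(beta, xs, ts))
```

On this toy data the likelihood alone has no finite maximizer, which is the practical problem separation causes for maximum-likelihood PS estimation; the heavy-tailed prior still permits large coefficients but penalizes them.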
Funding: Supported by the National Natural Science Foundation of China (51777015) and the Research Foundation of Education Bureau of Hunan Province (20A021).
Abstract: To address the wind power prediction problem, a wind power probability prediction method based on quantile regression with a dilated causal convolutional neural network is proposed. With the developed model, the Adam stochastic gradient descent technique is used to solve for the parameters of the dilated causal convolutional neural network under different quantile conditions and to obtain the probability density distribution of wind power at various times within the following 200 hours. The presented method can obtain more useful information than conventional point and interval predictions. Moreover, a prediction of the complete future probability distribution of wind power can be realized. According to forecasts on actual wind power data from the PJM network in the United States, the proposed probability density prediction approach not only obtains more accurate point prediction results but also yields complete probability density curve predictions for wind power. Compared with two other quantile regression methods, the developed technique achieves higher accuracy and a narrower prediction interval at the same confidence level.
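The two building blocks named in this abstract can be sketched in a few lines. The code below is an illustration under stated assumptions, not the model from the paper: it uses plain Python lists, a single-channel convolution with zero padding on the left, and hypothetical function names. `pinball_loss` is the standard quantile regression loss, and `dilated_causal_conv` computes each output only from the current and earlier inputs, spaced `dilation` steps apart.

```python
def pinball_loss(y_true, y_pred, q):
    # Mean quantile (pinball) loss for quantile level q in (0, 1).
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        total += max(q * diff, (q - 1.0) * diff)
    return total / len(y_true)

def dilated_causal_conv(x, weights, dilation):
    # y[t] = sum_k weights[k] * x[t - k * dilation], with x treated as zero
    # before the start of the series, so y[t] never depends on future values.
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out

# Under-prediction is penalized more heavily at a high quantile level.
print(pinball_loss([10.0], [8.0], 0.9))
print(pinball_loss([10.0], [8.0], 0.1))

# A two-tap filter with dilation 2 combines x[t] and x[t - 2].
print(dilated_causal_conv([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 1.0], 2))
```

Training one set of network parameters per quantile level with this loss gives a set of predicted quantiles at each time step, from which a density estimate can then be derived.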
Funding: Supported by the National Natural Science Foundation of China (72071187, 11671374, 71731010, 71921001) and the Fundamental Research Funds for the Central Universities (WK3470000017, WK2040000027).
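Abstract: Counterfactual prediction and selection bias are major challenges in causal effect estimation. To effectively represent the complex confounding distribution of latent covariates while improving the generalization of counterfactual prediction, a reweighted adversarial variational autoencoder network (RVAENet) model is proposed for industrial causal effect estimation applications. To debias the confounding distribution, the model borrows the idea of domain adaptation and uses an adversarial learning mechanism to balance the distribution of the representations learned from the latent variables obtained by a variational autoencoder (VAE). On this basis, sample propensity weights are learned and used to reweight the samples, further reducing the distribution difference between the treatment group and the control group. Experimental results show that, in two scenarios of a real-world industrial dataset, the area under the uplift curve (AUUC) of the proposed model is 15.02% and 16.02% higher, respectively, than that of TEDVAE (Treatment Effect with Disentangled VAE); on public datasets, the proposed model generally achieves the best results for average treatment effect (ATE) and precision in estimation of heterogeneous effect (PEHE).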
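The ATE and PEHE metrics mentioned above are commonly computed as in the following sketch. It assumes the usual definitions (absolute error of the estimated average effect, and root mean squared error of the individual effect estimates) and that true individual effects are known, which is typically the case only for simulated or semi-synthetic benchmarks. The function names and toy numbers are hypothetical, and the abstract does not state which exact variants were used.

```python
import math

def ate_error(tau_true, tau_pred):
    # Absolute difference between the estimated and true average treatment effects.
    mean_true = sum(tau_true) / len(tau_true)
    mean_pred = sum(tau_pred) / len(tau_pred)
    return abs(mean_pred - mean_true)

def sqrt_pehe(tau_true, tau_pred):
    # Root mean squared error of the individual treatment effect estimates.
    squared = [(p - t) ** 2 for t, p in zip(tau_true, tau_pred)]
    return math.sqrt(sum(squared) / len(squared))

# Toy values (hypothetical): one individual effect is badly overestimated.
tau_true = [1.0, 2.0, 3.0]
tau_pred = [1.0, 2.0, 6.0]
print(ate_error(tau_true, tau_pred))
print(sqrt_pehe(tau_true, tau_pred))
```

Lower is better for both quantities. ATE error can be small even when individual estimates are poor, because errors of opposite sign cancel in the average, which is why PEHE is usually reported alongside it.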