Abstract
Inferential models are widely used in the chemical industry to infer key process variables that are challenging or expensive to measure from other, more easily measured variables. The aim of this paper is threefold: to present a theoretical review of some well-known linear inferential modeling techniques, to enhance the predictive ability of the regularized canonical correlation analysis (RCCA) method, and to compare the performance of these techniques and highlight some of the practical issues that can affect their predictive abilities. The inferential modeling techniques considered in this study include full-rank modeling techniques, such as ordinary least squares (OLS) regression and ridge regression (RR), and latent variable regression (LVR) techniques, such as principal component regression (PCR), partial least squares (PLS) regression, and RCCA. The theoretical analysis shows that the loading vectors used in LVR modeling can be computed by solving eigenvalue problems. For the RCCA method, we also show that optimizing the regularization parameter can improve prediction accuracy over the other modeling techniques. To illustrate the performance of all inferential modeling techniques, a comparative analysis was performed using two simulated examples, one with synthetic data and the other with simulated distillation column data. All techniques are optimized and compared by computing the cross-validation mean square error on unseen testing data. The results of this comparative analysis show that scaling the data helps improve the performance of all modeling techniques, and that the LVR techniques outperform the full-rank ones. One reason for this advantage is that the LVR techniques improve the conditioning of the model by discarding the latent variables (or principal components) with small eigenvalues, which also reduces the effect of noise on the model prediction.
The results also show that PCR and PLS have compara