Funding: Supported by the Chinese 111 Project B14019, the US National Science Foundation under Grant Nos. DMS-1305474 and DMS-1612873, and the US National Institutes of Health Award UL1TR001412.
Abstract: The generalized linear model is an indispensable tool for analyzing non-Gaussian response data, with both canonical and non-canonical link functions widely used. When missing values are present, many existing methods in the literature depend heavily on an unverifiable assumption about the missing data mechanism, and they fail when that assumption is violated. This paper proposes a missing data mechanism that is as generally applicable as possible: it covers both ignorable and nonignorable missing data, as well as missing values in either the response or the covariates. Under this general missing data mechanism, the authors adopt an approximate conditional likelihood method to estimate the unknown parameters. The authors rigorously establish the regularity conditions under which the unknown parameters are identifiable under the approximate conditional likelihood approach. For identifiable parameters, the authors prove the asymptotic normality of the estimators obtained by maximizing the approximate conditional likelihood. Simulation studies are conducted to evaluate the finite-sample performance of the proposed estimators and of estimators from some existing methods. Finally, the authors present a biomarker analysis from a prostate cancer study to illustrate the proposed method.
Funding: Funded by the Deanship of Scientific Research (DSR) at King Abdulaziz University (KAU), Jeddah, Saudi Arabia, under grant No. (PH:13-130-1442).
Abstract: The accuracy of a statistical learning model depends on the learning technique used, which in turn depends on the dataset’s values. In most research studies, the existence of missing values (MVs) is a critical problem. In addition, a dataset with MVs cannot be used for further analysis or with any data-driven tool, especially when the percentage of MVs is high. In this paper, the authors propose a novel algorithm for dealing with MVs based on feature selection (FS) using a similarity classifier with a fuzzy entropy measure. The proposed algorithm imputes MVs in cumulative order. The candidate feature to be processed is selected using a similarity classifier with Parkash’s fuzzy entropy measure. The predictive model used to predict MVs within the candidate feature is Bayesian Ridge Regression (BRR). Furthermore, each imputed feature is incorporated into the BRR equation to impute the MVs in the next chosen incomplete feature. The proposed algorithm was compared against some practical state-of-the-art imputation methods in an experiment on four medical datasets gathered from several database repositories, with MVs generated from the three missingness mechanisms. The evaluation metrics of mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R² score) were used to measure performance. The results showed that performance varies depending on the size of the dataset, the amount of MVs, and the missingness mechanism type. Moreover, compared to other methods, the proposed method gives better accuracy and less error in most cases.
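The three evaluation metrics named in this abstract have standard definitions, which can be sketched in plain Python as follows. The function names and the toy values are illustrative assumptions and are not taken from the paper.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error between true and imputed values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error between true and imputed values."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical toy data: true values that were masked, and their imputations.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]

print(mae(y_true, y_pred), rmse(y_true, y_pred), r2_score(y_true, y_pred))
```

Lower MAE and RMSE and a higher R² score indicate imputations closer to the masked true values.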
Abstract: We accurately reconstruct three-dimensional (3-D) refractive index (RI) distributions from highly ill-posed two-dimensional (2-D) measurements using a deep neural network (DNN). Reconstructions obtained by the Wolf transform inversion method are strongly distorted because the measurements, acquired with the limited numerical apertures (NAs) of the optical system, are ill-posed. Despite the recent success of DNNs in solving ill-posed inverse problems, their application to 3-D optical imaging is particularly challenging due to the lack of ground truth. We overcome this limitation by generating digital phantoms that serve as samples for the discrete dipole approximation (DDA), which generates multiple 2-D projection maps for a limited range of illumination angles. The presented samples are red blood cells (RBCs), which are strongly affected by the ill-posed problem due to their morphology. The network trained on synthetic measurements from the digital phantoms successfully eliminates the introduced distortions. Most importantly, we obtain high-fidelity reconstructions from experimentally recorded projections of a real RBC sample using the network that was trained on digitally generated RBC phantoms. Finally, we confirm the reconstruction accuracy by using the DDA to calculate the 2-D projections of the 3-D reconstructions and comparing them to the experimentally recorded projections.
Funding: Financially supported by the National 863 Program (Grant No. 2006AA09A102-09) and the National Science and Technology Major Projects (Grant No. 2008ZX05025-001-001).
Abstract: Irregular seismic data cause problems for multi-trace processing algorithms and degrade processing quality. We introduce the Projection onto Convex Sets (POCS) based image restoration method into the seismic data reconstruction field to interpolate irregularly missing traces. For entire dead traces, we transfer the POCS iterative reconstruction process from the time domain to the frequency domain to save computational cost, because forward and inverse Fourier transforms along the time axis are then not needed in each iteration. In each iteration, the choice of the threshold parameter is important for reconstruction efficiency. In this paper, we design two types of threshold models to reconstruct irregularly missing seismic data. The experimental results show that, for the same reconstruction result, an exponential threshold greatly reduces the number of iterations and improves reconstruction efficiency compared to a linear threshold. We also analyze the anti-noise and anti-alias ability of the POCS reconstruction method. Finally, theoretical model tests and real data examples indicate that the proposed method is efficient and applicable.
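The contrast between the two threshold models can be sketched in plain Python. This is a simplified 1-D illustration under stated assumptions, not the authors' implementation. The schedule formulas, the function names, the naive DFT, and the toy single-tone signal are all assumptions: the abstract does not state the exact threshold formulas, and real seismic data would use 2-D or 3-D FFTs.

```python
import cmath
import math

def linear_thresholds(t_max, t_min, n_iter):
    """Thresholds decreasing linearly from t_max to t_min (assumed form)."""
    if n_iter == 1:
        return [t_max]
    step = (t_max - t_min) / (n_iter - 1)
    return [t_max - step * k for k in range(n_iter)]

def exponential_thresholds(t_max, t_min, n_iter):
    """Thresholds decaying geometrically from t_max to t_min > 0 (assumed form)."""
    if n_iter == 1:
        return [t_max]
    rate = math.log(t_min / t_max) / (n_iter - 1)
    return [t_max * math.exp(rate * k) for k in range(n_iter)]

def dft(x):
    """Naive O(n^2) forward discrete Fourier transform."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * k * j / n) for j in range(n))
            for k in range(n)]

def idft(spec):
    """Naive O(n^2) inverse discrete Fourier transform."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * j / n) for k in range(n)) / n
            for j in range(n)]

def pocs_interpolate(observed, mask, thresholds):
    """POCS-style loop: threshold the spectrum, then re-insert known samples."""
    x = list(observed)
    for tau in thresholds:
        spec = dft(x)
        kept = [c if abs(c) >= tau else 0j for c in spec]  # hard threshold
        est = [v.real for v in idft(kept)]
        # Keep observed samples; use the estimate only where data are missing.
        x = [o if m else e for o, m, e in zip(observed, mask, est)]
    return x

# Hypothetical toy example: one sinusoid with irregularly missing samples.
n = 32
signal = [math.cos(2 * math.pi * 3 * j / n) for j in range(n)]
missing = {2, 5, 11, 12, 19, 23, 28}
mask = [0 if j in missing else 1 for j in range(n)]
observed = [s * m for s, m in zip(signal, mask)]

t_max = max(abs(c) for c in dft(observed))
taus = exponential_thresholds(t_max, 0.01 * t_max, 20)
recon = pocs_interpolate(observed, mask, taus)
```

With the same endpoints and iteration count, the geometric schedule never lies above the linear one, so it reaches small thresholds in fewer iterations. That matches the direction of the efficiency gain the abstract reports, although this sketch does not reproduce the paper's results.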
Funding: Supported by the National Natural Science Foundation of China (61172116).
Abstract: The main function of an electronic support measure system is to detect threatening signals in order to take countermeasures against them. To accomplish this objective, each interleaved pulse must be associated with its emitter. This process is termed sorting or de-interleaving. A novel point symmetry based radar sorting (PSBRS) algorithm is presented. In order to deal with all kinds of radar signals, the symmetry measure distance is used to cluster pulses instead of the conventional Euclidean distance. The reference points of the symmetrical clusters are initialized by the alternative fuzzy c-means (AFCM) algorithm to reduce the effects of noise and false sorting. In addition, a density filtering (DF) algorithm is proposed to discard noise pulses or clutter. The performance of the algorithm is evaluated under the effects of noise and missing pulses. It is observed that the PSBRS algorithm can cope with a large number of noise pulses and is completely independent of missing pulses. Finally, PSBRS is compared with some benchmark algorithms, and the simulation results demonstrate the feasibility and efficiency of the algorithm.
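To make the idea of a symmetry-based distance concrete, here is a minimal sketch of one point-symmetry distance commonly used in the clustering literature. The abstract does not give the formula, so it is an assumption that PSBRS uses this form. The function name and the 2-D toy points (standing in for pulse feature vectors) are hypothetical.

```python
import math

def point_symmetry_distance(j, center, points):
    """Point-symmetry distance of points[j] with respect to `center`.

    For every other point p, measure how far p is from the mirror image
    of points[j] about the center, normalized by the two distances to
    the center, and return the smallest such value (0 = perfect mirror).
    """
    dx = [a - c for a, c in zip(points[j], center)]
    norm_dx = math.sqrt(sum(v * v for v in dx))
    best = math.inf
    for i, p in enumerate(points):
        if i == j:
            continue  # a point is not compared with itself
        dp = [a - c for a, c in zip(p, center)]
        norm_dp = math.sqrt(sum(v * v for v in dp))
        denom = norm_dx + norm_dp
        if denom == 0.0:
            continue  # both points coincide with the center
        num = math.sqrt(sum((u + v) ** 2 for u, v in zip(dx, dp)))
        best = min(best, num / denom)
    return best

# Hypothetical toy cluster: two mirrored points and one point without a mirror.
points = [(1.0, 0.0), (-1.0, 0.0), (0.0, 2.0)]
center = (0.0, 0.0)

d_mirrored = point_symmetry_distance(0, center, points)
d_unpaired = point_symmetry_distance(2, center, points)
```

A point with an exact mirror partner about the center gets distance 0 regardless of how far it lies from the center, which is what distinguishes this measure from the Euclidean distance.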
Funding: Supported by the Harvard Global Institute and the Ash Center at the Harvard Kennedy School of Government, and by the State Key Laboratory on Smart Grid Protection and Operation Control of NARI Group Corporation (No. 20171613).
Abstract: Investment in renewables has been growing rapidly since the beginning of the new century, and the momentum is expected to be sustained in order to mitigate the impact of anthropogenic climate change. The transition towards higher renewable penetration in the power industry will not only confront technical challenges but also face socio-economic obstacles. The connections between environmental and energy systems also become tighter under elevated penetration of renewables. This paper provides an overview of some important technical, environmental, and socio-economic challenges at elevated renewable penetration. An integrated analytical framework for the interlinked technical, environmental, and socio-economic systems is presented at the end.