Support vector machines (SVMs) were originally designed for binary classification. How to extend them effectively to multiclass classification is still an ongoing research topic. A multiclass classifier, named DAGSVM^light, is constructed by combining the SVM^light algorithm with the directed acyclic graph SVM (DAGSVM) method. A new method is proposed to select the working set, which is identical to the working set selected by the SVM^light approach. Experimental results indicate that DAGSVM^light is competitive with DAGSMO and is more suitable for practical use. Owing to its good performance, it may be an especially useful tool for large-scale multiclass classification problems and may lead to more widespread use of SVMs in the engineering community.
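The directed acyclic graph evaluation mentioned in this abstract can be illustrated with a small sketch. It shows only the generic DAG decision procedure, in which each pairwise test eliminates one candidate class; it is not the DAGSVM^light training algorithm or its working-set selection. The function `prefers_first` and the toy prototype rule are hypothetical stand-ins for trained pairwise SVMs.

```python
from typing import Callable, Hashable, Sequence


def dag_predict(classes: Sequence[Hashable],
                prefers_first: Callable[[Hashable, Hashable], bool]) -> Hashable:
    """Evaluate a decision DAG built from pairwise classifiers.

    prefers_first(a, b) is a hypothetical stand-in for a trained binary
    a-vs-b SVM: it returns True when the sample looks more like class a.
    """
    candidates = list(classes)
    if not candidates:
        raise ValueError("need at least one class")
    # Each pairwise test rules out one class, so k classes need k - 1 tests.
    while len(candidates) > 1:
        first, last = candidates[0], candidates[-1]
        if prefers_first(first, last):
            candidates.pop()       # 'last' is ruled out
        else:
            candidates.pop(0)      # 'first' is ruled out
    return candidates[0]


# Toy pairwise rule (assumption): prefer the class whose prototype is nearer.
prototypes = {"a": 0.0, "b": 5.0, "c": 10.0}


def make_rule(x: float) -> Callable[[str, str], bool]:
    return lambda p, q: abs(x - prototypes[p]) <= abs(x - prototypes[q])


label_mid = dag_predict(list(prototypes), make_rule(4.2))
label_high = dag_predict(list(prototypes), make_rule(9.0))
label_low = dag_predict(list(prototypes), make_rule(-1.0))
```

With k classes the sample passes through k - 1 pairwise decisions, which is why this evaluation scheme is attractive for large multiclass problems.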
Head pose estimation has been considered an important and challenging task in computer vision. In this paper we propose a novel method to estimate head pose based on a deep convolutional neural network (DCNN) for 2D face images. We design a simple and effective method to roughly crop the face from the input image while maintaining the individual-relative ratio of facial features. The method can be used for various poses. Two convolutional neural networks are then set up to train the head pose classifier and are compared with each other. The simpler one has six layers. It performs well on seven yaw poses but is somewhat unsatisfactory when two pitch poses are mixed in. The other has eight layers and more pixels in the input layer. It performs better with more poses and more training samples. Before training the network, two strategies, shift and zoom, are applied to prepare the training samples. Finally, the feature extraction filters are optimized together with the weights of the classification component through training, to minimize the classification error. Our method has been evaluated on the CAS-PEAL-R1, CMU PIE, and CUBIC FacePix databases. It performs better than state-of-the-art methods for head pose estimation.
While digital ocular fundus images are widely used in ophthalmology practice, their interpretation is still in the hands of ophthalmologists, which is quite costly. We explored a robust deep learning system that detects three major ocular diseases: diabetic retinopathy (DR), glaucoma (GLC), and age-related macular degeneration (AMD). The proposed method is composed of two steps. First, an initial quality evaluation is introduced into the classification system to filter out poor-quality images and enhance performance, a technique that has not been explored previously. Second, transfer learning is used with various convolutional neural network (CNN) models that automatically learn a thousand features from the digital retinal image and diagnose eye diseases based on those features. The performance of many models is compared to find the optimal model for fundus classification. Among the different CNN models, DenseNet-201 outperforms the others, with an area under the receiver operating characteristic curve of 0.99. Furthermore, the corresponding specificities for healthy, DR, GLC, and AMD patients are found to be 89.52%, 96.69%, 89.58%, and 100%, respectively. These results demonstrate that the proposed method can reduce time consumption by automatically diagnosing multiple eye diseases using computer-aided assistance tools.
Abstract: Imaging logging has become a popular means of well logging because it can visually represent the lithologic and structural characteristics of strata. The manual interpretation of imaging logging is affected by the limitations of the naked eye and by experiential factors. As a result, manual interpretation accuracy is low. Therefore, it is highly useful to develop effective automatic interpretation of imaging logging by machine learning. Resistivity imaging logging is the most widely used imaging logging technology. In this paper, we propose an automatic extraction procedure for the geological features in resistivity imaging logging images. This procedure is based on machine learning and achieves good results in practical applications. Acknowledging that the existence of valueless data significantly affects the recognition effect, we propose three strategies for identifying valueless data based on binary classification. We compare the three strategies both on an experimental dataset and in a production environment, and find that the merging method performs best. It effectively identifies the valueless data in the well logging images, thus significantly improving the automatic recognition of geological features in resistivity logging images.
Funding: supported by the National Natural Science Foundation of China (61433004, 61473069); IAPI Fundamental Research Funds (2013ZCX14); the Development Project of Key Laboratory of Liaoning Province; and the Enterprise Postdoctoral Fund Projects of Liaoning Province.
Abstract: Since the efficiency of photovoltaic (PV) power is closely related to the weather, many PV enterprises install weather instruments to monitor the working state of the PV power system. With the development of soft measurement technology, the instrumental method seems obsolete and involves high cost. This paper proposes a novel method for predicting the type of weather based on PV power data and partial meteorological data. With this method, the weather types are deduced by data analysis instead of by weather instruments. Better fault detection is obtained by using support vector machines (SVM) and comparing the predicted and the actual weather. The weather prediction model is established by a direct SVM for training multiclass predictors. Although SVM is suitable for classification, the classification results depend on the type of kernel, the kernel parameters, and the soft margin coefficient, which are difficult to choose. In this paper, these parameters are optimized by the particle swarm optimization (PSO) algorithm so that good prediction results can be achieved. Prediction results show that this method is feasible and effective.
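The parameter search described in this abstract can be sketched as a plain particle swarm loop. This is a minimal illustration under stated assumptions, not the paper's implementation: the objective `toy_validation_error` is a hypothetical smooth stand-in for the cross-validated SVM error as a function of (log C, log gamma), and the swarm settings (inertia 0.7, acceleration 1.5) are common defaults rather than values from the source.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso_minimize(objective: Callable[[Sequence[float]], float],
                 bounds: Sequence[Tuple[float, float]],
                 n_particles: int = 20, n_iters: int = 50,
                 inertia: float = 0.7, c_personal: float = 1.5,
                 c_global: float = 1.5, seed: int = 0):
    """Minimise `objective` inside box `bounds` with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                     # best position per particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]    # best position of the swarm
    history: List[float] = [gbest_val]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (pbest[i][d] - pos[i][d])
                             + c_global * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


def toy_validation_error(p: Sequence[float]) -> float:
    # Hypothetical error surface; a real run would train and validate an SVM here.
    log_c, log_gamma = p
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


search_box = [(-5.0, 5.0), (-5.0, 5.0)]
best, best_val, history = pso_minimize(toy_validation_error, search_box)
```

In practice the objective would be replaced by a cross-validation score for each candidate parameter set, which makes every evaluation expensive and is the reason a small swarm and few iterations are typical.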
Funding: supported by GSU Molecular Basis of Disease Graduate Fellow, 2011-2012.
Abstract: Imbalanced data is a common and serious problem in many biomedical classification tasks. It biases the training of classifiers and results in lower prediction accuracy for minority classes. This problem has attracted a lot of research interest in the past decade. Unfortunately, most research efforts concentrate only on 2-class problems. In this paper, we study a new method of formulating a multiclass Support Vector Machine (SVM) problem for imbalanced biomedical data to improve classification performance. The proposed method applies a cost-sensitive approach and a ramp loss function to the Crammer and Singer multiclass SVM formulation. Experimental results on multiple biomedical datasets show that the proposed solution can effectively address the problem when the datasets are noisy and highly imbalanced.
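The combination of a cost-sensitive weight and a ramp loss named in this abstract can be sketched per sample as follows. The exact formulation in the paper is not given here, so this is an assumption-laden illustration: the Crammer and Singer multiclass hinge is clipped at a hypothetical `cap` (which bounds the influence of noisy outliers) and scaled by a hypothetical per-class cost (which raises the penalty for minority-class errors).

```python
from typing import Mapping, Sequence


def cs_hinge(scores: Sequence[float], y: int) -> float:
    """Crammer-Singer multiclass hinge: max(0, 1 + max_{j != y} s_j - s_y)."""
    if len(scores) < 2:
        raise ValueError("need at least two classes")
    rival = max(s for j, s in enumerate(scores) if j != y)
    return max(0.0, 1.0 + rival - scores[y])


def weighted_ramp_loss(scores: Sequence[float], y: int,
                       class_cost: Mapping[int, float],
                       cap: float = 2.0) -> float:
    """Hinge clipped at `cap` (ramp), then scaled by the cost of the true class.

    `cap` and `class_cost` are illustrative assumptions, not values from the paper.
    """
    return class_cost[y] * min(cap, cs_hinge(scores, y))


# Class 0 is treated as the minority class, so its errors cost more.
costs = {0: 5.0, 1: 1.0}
loss_correct = weighted_ramp_loss([2.0, 0.5], 0, costs)   # confident and right
loss_margin = weighted_ramp_loss([1.0, 0.5], 0, costs)    # right, inside margin
loss_outlier = weighted_ramp_loss([0.0, 3.0], 0, costs)   # badly wrong, clipped
```

Because the loss stops growing once it reaches the cap, a single mislabeled sample cannot dominate the objective, which is the usual motivation for ramp loss on noisy data.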
Abstract: It is common for datasets to contain both categorical and continuous variables. However, many feature screening methods designed for high-dimensional classification assume that the variables are continuous, which limits their applicability in this more complex scenario. To address this issue, we propose a model-free feature screening approach for ultra-high-dimensional multi-classification that can handle both categorical and continuous variables. The proposed method uses the Maximal Information Coefficient to assess the predictive power of the variables. Under certain regularity conditions, we prove that the screening procedure possesses the sure screening and ranking consistency properties. To validate the approach, we conduct simulation studies and provide real data analysis examples that demonstrate its finite-sample performance. In summary, the proposed method offers a solution for effectively screening features in ultra-high-dimensional datasets with a mixture of categorical and continuous covariates.
Abstract: In this study, a salting-out assisted liquid-liquid extraction method combined with high performance liquid chromatography with diode array detection (SALLE-HPLC-DAD) was developed and validated for the simultaneous analysis of carbaryl, atrazine, propazine, chlorothalonil, dimethametryn and terbutryn in environmental water samples. Parameters affecting the extraction efficiency, such as type and volume of extraction solvent, sample volume, salt type and amount, centrifugation speed and time, and sample pH, were optimized. Under the optimum extraction conditions the method was linear over the ranges 10 - 100 μg/L (carbaryl), 8 - 100 μg/L (atrazine), 7 - 100 μg/L (propazine) and 9 - 100 μg/L (chlorothalonil, terbutryn and dimethametryn), with correlation coefficients (R²) between 0.99 and 0.999. Limits of detection and quantification ranged from 2.0 to 2.8 μg/L and from 6.7 to 9.5 μg/L, respectively. The extraction recoveries obtained for ground, lake and river waters ranged from 75.5% to 106.6%, with intra-day and inter-day relative standard deviations lower than 3.4% for all the target analytes. None of the target analytes was detected in these samples. The proposed SALLE-HPLC-DAD method is therefore simple, rapid, cheap and environmentally friendly for the determination of the aforementioned herbicide, insecticide and fungicide residues in environmental water samples.
Abstract: Traffic congestion is one of the major problems facing transportation decision makers in urban areas. The problem affects the social, economic and development aspects of urban areas, so its solution is not straightforward. It requires a great deal of effort, expertise, time and cost that are sometimes not available. Most existing transportation planning software, especially the most advanced packages, requires personnel with extensive practical transportation planning experience and a high level of education and training. In this paper we propose a comprehensive framework for an Intelligent Decision Support System (IDSS) for traffic congestion management that uses state-of-the-art transportation network equilibrium modeling and provides an easy-to-use GIS-based interaction environment. The developed IDSS reduces the dependence on the expertise and level of education of transportation planners, transportation engineers, or any transportation decision makers.
Funding: supported by the National Key R&D Program of China (2022YFB4701502); the "Leading Goose" R&D Program of Zhejiang (2023C01177); the Key Research Project of Zhejiang Lab (2021NB0AL03); and the Key R&D Project on Agriculture and Social Development in Hangzhou City (Asian Games) (20230701 A05).
Abstract: Digital display instrument identification is a crucial approach for automating the collection of digital display data. In this study, we propose a digital display area detection algorithm, CTPNpro, to address the problem of recognizing multiclass digital display instruments. We developed a multiclass digital display instrument recognition algorithm by combining it with a character recognition network constructed from a convolutional neural network and a bidirectional variable-length long short-term memory (LSTM). First, the CTPNpro digital display region detection network was designed on the basis of the CTPN architecture by introducing feature fusion and a residual structure. Next, the digital display instrument identification network was constructed from a convolutional neural network with a two-way LSTM and connectionist temporal classification (CTC) of indefinite length. Finally, an automatic calibration system for digital display instruments was built, and a multiclass digital display instrument dataset was constructed by sampling in the system. We compared the performance of the CTPNpro algorithm with other methods on this dataset to validate the effectiveness and robustness of the proposed algorithm.
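The role of connectionist temporal classification in reading a variable-length display can be illustrated with greedy (best-path) decoding: take the top class per frame, collapse consecutive repeats, then drop the blank symbol. This is a generic sketch, not the paper's network; the alphabet, the blank index and the one-hot frame scores are assumptions made for the example.

```python
from typing import List, Sequence


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      alphabet: str, blank: int = 0) -> str:
    """Best-path CTC decoding: argmax per frame, merge repeats, remove blanks."""
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    decoded: List[str] = []
    prev = None
    for idx in best_path:
        # A symbol is emitted only when it differs from the previous frame
        # and is not the blank; a blank between repeats keeps both copies.
        if idx != prev and idx != blank:
            decoded.append(alphabet[idx])
        prev = idx
    return "".join(decoded)


def one_hot(index: int, size: int) -> List[float]:
    return [1.0 if i == index else 0.0 for i in range(size)]


# Hypothetical alphabet for a numeric display; position 0 is the CTC blank.
ALPHABET = "-0123456789."
frame_path = [2, 2, 0, 2, 11, 0, 6, 6]   # frames for "1", "1", blank, "1", ".", ...
frames = [one_hot(i, len(ALPHABET)) for i in frame_path]
reading = ctc_greedy_decode(frames, ALPHABET)
```

The blank between the two runs of "1" is what allows a repeated digit to survive the merge step, which matters for readings such as "11.5".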
Abstract: Quantum computing is a promising new approach to tackling complex real-world computational problems by harnessing the principles of quantum mechanics. The inherent parallelism and exponential computational power of quantum systems hold the potential to outpace classical counterparts in solving complex optimization problems, which are pervasive in machine learning. The Quantum Support Vector Machine (QSVM) is a quantum machine learning algorithm inspired by the classical Support Vector Machine (SVM) that exploits quantum parallelism to efficiently classify data points in high-dimensional feature spaces. We provide a comprehensive overview of the underlying principles of QSVM, elucidating how different quantum feature maps and quantum kernels enable the manipulation of quantum states to perform classification tasks. Through a comparative analysis, we reveal the quantum advantage achieved by these algorithms in terms of speedup and solution quality. As a case study, we explored the potential of quantum paradigms in the context of a real-world problem: classifying pancreatic cancer biomarker data. The Support Vector Classifier (SVC) algorithm was employed for the classical approach, while the QSVM algorithm was executed on a quantum simulator provided by the Qiskit quantum computing framework. The classical approach and the quantum-based techniques reported similar accuracy. This uniformity suggests that these methods effectively captured similar underlying patterns in the dataset. Remarkably, the quantum implementations exhibited substantially reduced execution times, demonstrating the potential of quantum approaches for enhancing classification efficiency. This affirms the growing significance of quantum computing as a transformative tool for augmenting machine learning paradigms and also underscores the potency of quantum execution for computational acceleration.
Funding: partially funded by the National Natural Science Foundation of China (No. 61272447); the National Entrepreneurship & Innovation Demonstration Base of China (No. C700011); and the Key Research & Development Project of Sichuan Province of China (No. 2018G20100).
Abstract: Botnets based on the Domain Generation Algorithm (DGA) mechanism pose great challenges to the main current detection methods because of their strong concealment and robustness. However, the complexity of the DGA family and the imbalance of samples continue to impede research on DGA detection. In existing work, the sample size of each DGA family is regarded as the most important determinant of the resampling proportion; thus, differences in the characteristics of various samples are ignored, and the optimal resampling effect is not achieved. In this paper, a Long Short-Term Memory-based Property and Quantity Dependent Optimization (LSTM.PQDO) method is proposed. This method takes advantage of LSTM to automatically mine the comprehensive features of DGA domain names. It iterates the resampling proportion with the optimal solution based on a comprehensive consideration of the original number and characteristics of the samples, heuristically searching for a better solution around the initial solution in the right direction; thus, dynamic optimization of the resampling proportion is realized. The experimental results show that the LSTM.PQDO method achieves better performance than existing models in overcoming the difficulties of unbalanced datasets; moreover, it can serve as a reference for sample resampling tasks in similar scenarios.
Abstract: Feature extraction is the most critical step in the classification of multispectral images. The classification accuracy is mainly influenced by the feature sets that are selected to classify the image. In the past, handcrafted feature sets were used, which are not adaptive to different image domains. To overcome this, an evolutionary learning method is developed to automatically learn the spatial-spectral features for classification. A modified Firefly Algorithm (FA), which achieves maximum classification accuracy with a reduced feature set size, is proposed for feature selection. To extract the most efficient features from the data set, we used the 3-D discrete wavelet transform, which decomposes the multispectral image in all three dimensions. To select spatial and spectral features, we studied three window-based approaches, namely overlapping window (OW-3DFS), non-overlapping window (NW-3DFS), and adaptive window cube (AW-3DFS), together with a pixel-based technique. A fivefold Multiclass Support Vector Machine (MSVM) is used for classification. Experiments conducted on the Madurai LISS IV multispectral image show that the adaptive window approach increases the classification accuracy.
Abstract: It is quite common for both categorical and continuous covariates to appear in the data. However, most feature screening methods for ultrahigh-dimensional classification assume that the covariates are continuous, so applicable feature screening methods are very limited. To handle this non-trivial situation, we propose a model-free feature screening method for ultrahigh-dimensional multi-classification with both categorical and continuous covariates. The proposed method uses Gini impurity to evaluate the prediction power of covariates. Under certain regularity conditions, it is proved that the proposed screening procedure possesses the sure screening and ranking consistency properties. We demonstrate the finite sample performance of the proposed procedure by simulation studies and illustrate it with a real data analysis.
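The idea of scoring covariates by Gini impurity can be sketched as follows. This is a minimal illustration and not the paper's actual statistic: it assumes the marginal utility of a covariate is the reduction in the Gini impurity of the class label after conditioning on that covariate, and that continuous covariates are first cut into rank-based slices. The function names and the `n_slices` parameter are hypothetical.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity 1 - sum_k p_k^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def slice_continuous(values, n_slices=4):
    """Assumed discretization: map each value to a rank-based slice index."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    slices = [0] * len(values)
    for rank, i in enumerate(order):
        slices[i] = rank * n_slices // len(values)
    return slices

def gini_utility(x, y, continuous=False, n_slices=4):
    """Reduction in the Gini impurity of y after conditioning on covariate x."""
    keys = slice_continuous(x, n_slices) if continuous else x
    groups = {}
    for key, label in zip(keys, y):
        groups.setdefault(key, []).append(label)
    n = len(y)
    conditional = sum(len(g) / n * gini_impurity(g) for g in groups.values())
    return gini_impurity(y) - conditional

def screen(covariates, y, keep):
    """Rank covariates by utility and return the names of the top `keep`.

    covariates maps a name to a pair (values, is_continuous).
    """
    scores = {name: gini_utility(values, y, cont)
              for name, (values, cont) in covariates.items()}
    return sorted(scores, key=scores.get, reverse=True)[:keep]
```

Because the utility is computed one covariate at a time and needs no model for the response, the same ranking routine handles categorical and sliced continuous covariates alike.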
Abstract: Support vector machines (SVMs) were initially designed for binary classification. How to effectively extend them to multiclass classification is still an ongoing research topic. A multiclass classifier, named DAGSVM^light, is constructed by combining the SVM^light algorithm with the directed acyclic graph SVM (DAGSVM) method. A new method is proposed to select the working set, which is identical to the working set selected by the SVM^light approach. Experimental results indicate that DAGSVM^light is competitive with DAGSMO and is more suitable for practical use. It may be an especially useful tool for large-scale multiclass classification problems and may lead to more widespread use of SVMs in the engineering community owing to its good performance.
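The DAGSVM decision step mentioned above can be sketched as follows: with K classes, a list of candidate classes is kept, the pairwise classifier for the first and last candidates is evaluated, and the losing class is dropped, so a prediction needs K - 1 pairwise evaluations. This is a minimal sketch of the general DAG evaluation only. It does not train SVMs or reproduce the SVM^light working-set selection, and the nearest-centroid deciders built by the hypothetical helper `make_centroid_deciders` are stand-ins for trained binary SVMs.

```python
def dag_predict(x, classes, pairwise):
    """Classify x by walking the decision DAG.

    pairwise[(a, b)] is a callable returning the preferred class (a or b),
    defined for every pair where a precedes b in `classes`.
    """
    candidates = list(classes)
    while len(candidates) > 1:
        a, b = candidates[0], candidates[-1]
        if pairwise[(a, b)](x) == a:
            candidates.pop()      # b is eliminated
        else:
            candidates.pop(0)     # a is eliminated
    return candidates[0]

def make_centroid_deciders(centroids):
    """Stand-in binary deciders (assumption): pick the nearer 1-D centroid."""
    names = list(centroids)
    deciders = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            # Bind a and b as defaults so each lambda keeps its own pair.
            deciders[(a, b)] = (
                lambda x, a=a, b=b:
                a if abs(x - centroids[a]) <= abs(x - centroids[b]) else b
            )
    return deciders
```

Since the candidate list keeps its original order, only the K(K - 1)/2 pairs with the first class listed before the second are ever looked up.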
Funding: Project supported by the National Key Scientific Instrument and Equipment Development Project of China (No. 2013YQ49087903), the National Natural Science Foundation of China (No. 61402307), and the Educational Commission of Sichuan Province, China (No. 15ZA0007).
Abstract: Head pose estimation has been considered an important and challenging task in computer vision. In this paper we propose a novel method to estimate head pose based on a deep convolutional neural network (DCNN) for 2D face images. We design an effective and simple method to roughly crop the face from the input image, maintaining the individual-relative facial features ratio. The method can be used in various poses. Then two convolutional neural networks are set up to train the head pose classifier and are compared with each other. The simpler one has six layers. It performs well on seven yaw poses but is somewhat unsatisfactory when two pitch poses are mixed in. The other has eight layers and more pixels in the input layer. It performs better with more poses and more training samples. Before training the network, two strategies, shift and zoom, are applied to prepare training samples. Finally, the feature extraction filters are optimized together with the weights of the classification component through training, to minimize the classification error. Our method has been evaluated on the CAS-PEAL-R1, CMU PIE, and CUBIC FacePix databases. It performs better than state-of-the-art methods for head pose estimation.
Funding: This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2021R1A2C1010362) and the Soonchunhyang University Research Fund.
Abstract: While digital ocular fundus images are widely used in ophthalmology practice, their interpretation is still in the hands of ophthalmologists, which is quite costly. We explored a robust deep learning system that detects three major ocular diseases: diabetic retinopathy (DR), glaucoma (GLC), and age-related macular degeneration (AMD). The proposed method is composed of two steps. First, an initial quality evaluation is added to the classification system to filter out poor-quality images and enhance its performance, a technique that has not been explored previously. Second, transfer learning is used with various convolutional neural network (CNN) models that automatically learn a thousand features from the digital retinal image and diagnose eye diseases based on those features. The performance of many models is compared to find the optimal model for fundus classification. Among the different CNN models, DenseNet-201 outperforms the others, with an area under the receiver operating characteristic curve of 0.99. Furthermore, the corresponding specificities for healthy, DR, GLC, and AMD patients are found to be 89.52%, 96.69%, 89.58%, and 100%, respectively. These results demonstrate that the proposed method can reduce time consumption by automatically diagnosing multiple eye diseases using computer-aided assistance tools.
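Per-class specificity, as reported above for the healthy, DR, GLC, and AMD classes, is usually computed one-vs-rest from a confusion matrix as TN / (TN + FP). The following is a minimal sketch of that calculation, assuming rows hold true classes and columns hold predicted classes. The function name is hypothetical, and it is not taken from the paper's evaluation code.

```python
def per_class_specificity(confusion):
    """One-vs-rest specificity TN / (TN + FP) for each class.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    result = []
    for c in range(k):
        actual_c = sum(confusion[c])                     # all samples of class c
        predicted_c = sum(confusion[r][c] for r in range(k))
        false_pos = predicted_c - confusion[c][c]        # other classes predicted as c
        negatives = total - actual_c                     # samples not of class c
        true_neg = negatives - false_pos
        result.append(true_neg / negatives if negatives else float("nan"))
    return result
```

A specificity of 100% for a class, as reported for AMD, corresponds to zero false positives for that class, meaning no image from another class was labeled as AMD.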