Abstract: A lightweight malware detection and family classification system for the Internet of Things (IoT) was designed to address the difficulty of deploying defense models on IoT devices, which have limited computing and storage resources. Complex models are first trained on gray-scale images of IoT software, and gradient-weighted class activation mapping (Grad-CAM) is then used to identify the key code that influences the models' decisions. This allows the gray-scale images to be reconstructed and used to train a lightweight model, LMDNet, for malware detection. In addition, multi-teacher knowledge distillation is used to train KD-LMDNet, which classifies malware families. The results indicate that the model's identification speed exceeds that of traditional methods by 23.68%, and its family classification accuracy on the Malimg dataset reaches 99.07%. With a model size of only 0.45M, it appears well suited to the IoT environment. The presented approach can thus address the challenges of malware detection and family classification on IoT devices.
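The abstract does not spell out how the multi-teacher distillation loss is formed, so the following is only a minimal sketch under stated assumptions: the teachers are weighted equally, their temperature-softened outputs are averaged into one soft target, and the student loss blends a KL term with the usual cross-entropy. The function names and the `temperature` and `alpha` parameters are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def multi_teacher_kd_loss(student_logits, teacher_logits_list, label,
                          temperature=4.0, alpha=0.5):
    """Hypothetical loss: alpha * T^2 * KL(avg teacher || student) + (1 - alpha) * CE."""
    # Soft target: equal-weight average of the teachers' softened distributions
    # (an assumption; the paper may weight its teachers differently).
    teacher_probs = [softmax(t, temperature) for t in teacher_logits_list]
    n_teachers = len(teacher_probs)
    soft_target = [sum(p[c] for p in teacher_probs) / n_teachers
                   for c in range(len(student_logits))]

    # KL divergence between the averaged teacher and the softened student.
    student_soft = softmax(student_logits, temperature)
    kl = sum(q * math.log(q / p)
             for q, p in zip(soft_target, student_soft) if q > 0)

    # Hard-label cross-entropy, computed at temperature 1.
    ce = -math.log(softmax(student_logits)[label])

    return alpha * temperature ** 2 * kl + (1 - alpha) * ce
```

When the student already reproduces every teacher, the KL term vanishes and only the weighted cross-entropy remains, which is a quick sanity check on any implementation of this kind.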
Funding: Supported in part by the National Key Research Program of China (2016YFB0900100) and the Key Project of Shanghai Science and Technology Committee (18DZ1100303).
Abstract: To extract strong correlations between different energy loads and to improve both the interpretability and the accuracy of load forecasting for a regional integrated energy system (RIES), an explainable load forecasting framework is proposed. It comprises the RIES load forecasting model and its interpretation. A coupled feature extraction strategy is adopted to construct coupled features between loads, which serve as the model's input variables. The model is based on multi-task learning (MTL), with a long short-term memory (LSTM) network as the shared layer. Based on SHapley Additive exPlanations (SHAP), the framework combines global and local interpretations to improve the interpretability of RIES load forecasting. In addition, an input variable selection strategy based on global SHAP values is proposed to select the model's input features. A case study verifies the effectiveness of the proposed model, the constructed coupled features, and the input variable selection strategy. The results show that the explainable framework improves the interpretability of the prediction model in an intuitive way.
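Selecting input variables by global SHAP value can be illustrated with a small, self-contained sketch. It does not use the SHAP library or the paper's LSTM model: it computes exact Shapley values by subset enumeration for a toy forecaster, averages their absolute values over samples to obtain a global importance, and keeps the top-ranked features. The toy model, its coefficients, the feature names, and the all-zero baseline are assumptions made purely for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x; absent features take baseline values."""
    n = len(x)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight of a coalition of this size that excludes feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                values[i] += weight * (predict(with_i) - predict(without_i))
    return values


def select_by_global_importance(predict, samples, baseline, top_k):
    """Rank features by mean |Shapley value| over samples and keep top_k indices."""
    n = len(baseline)
    totals = [0.0] * n
    for x in samples:
        for i, v in enumerate(shapley_values(predict, x, baseline)):
            totals[i] += abs(v)
    importance = [t / len(samples) for t in totals]
    ranked = sorted(range(n), key=lambda i: importance[i], reverse=True)
    return ranked[:top_k], importance


def toy_forecast(features):
    """Toy stand-in for a load forecaster (hypothetical coefficients)."""
    electric_lag, heat_lag, noise = features
    return 3.0 * electric_lag + 1.0 * heat_lag + 0.0 * noise + 0.5 * electric_lag * heat_lag
```

Exact enumeration grows exponentially with the number of features, so it only suits toy cases like this one; practical work relies on approximations such as those provided by the SHAP library.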
Funding: This work was supported by grants from the National Natural Science Foundation to Zhi-Ying Wu (81125009, Beijing).
Abstract: Amyotrophic lateral sclerosis (ALS) is a devastating neurodegenerative disease characterized by progressive loss and degeneration of upper motor neurons (UMN) and lower motor neurons (LMN). The clinical presentations of ALS are heterogeneous, and no single test or procedure establishes the diagnosis. Most cases are diagnosed on the basis of symptoms, physical signs, progression, EMG, and tests that exclude overlapping conditions. Familial ALS accounts for about 5–10% of cases, whereas the vast majority of patients are sporadic. To date, more than 20 causative genes have been identified in hereditary ALS. Detecting the pathogenic mutations or risk variants in each individual with ALS is challenging. However, patients carrying certain specific mutations or variants may exhibit subtly distinct clinical features. Unraveling the respective genotype-phenotype correlations has important implications for genetic explanation. In this review, we delineate the clinical features of ALS, outline the major ALS-related genes, and summarize the possible genotype-phenotype correlations of ALS.
Funding: Support from the Deanship for Research & Innovation, Ministry of Education in Saudi Arabia, under Project Number IFP22UQU4281768DSR122.
Abstract: Coffee Berry Disease (CBD), caused by Colletotrichum kahawae, spreads through spores carried by wind, rain, and insects; it affects coffee plantations, can cause yield losses of 80%, and lowers bean quality. Because the spores are dispersed in so many ways, the disease is hard to control. Colombian researchers used a deep learning system to identify CBD in coffee cherries at three growth stages and classified photographs of infected and uninfected cherries with 93% accuracy using a random forest method. However, if the dataset is too small and noisy, an algorithm may fail to learn the patterns in the data and to generate accurate predictions. To overcome this challenge, early detection of Colletotrichum kahawae disease in coffee cherries requires automated processing, prompt recognition, and accurate classification. The proposed methodology selects CBD image datasets through four different stages for training and testing. XGBoost is used to train a model on datasets of coffee berries, with each image labeled as healthy or diseased. Once the model is trained, the SHAP algorithm is used to determine which features were essential to its predictions. These features included the cherry's colour, whether it had spots or other damage, and the size of the lesions. Visualizing the relationship between berry colour and the presence of disease is important for classification. To evaluate the model's performance and mitigate overfitting, 10-fold cross-validation is employed: the dataset is partitioned into ten subsets, and the model is trained on nine of them and evaluated on the remaining one, with each subset held out in turn. Compared with other contemporary methods, the proposed model achieved an accuracy of 98.56%.
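As a minimal sketch of 10-fold cross-validation in its standard form, the code below holds out each subset once for evaluation while the model is fitted on the other nine. It uses only the standard library, and `fit` and `score` are hypothetical caller-supplied callables standing in for the XGBoost training and accuracy computation, which the abstract does not detail.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [idx for f, fold in enumerate(folds) if f != held_out for idx in fold]
        yield train, test


def cross_validate(fit, score, features, labels, k=10):
    """Average held-out score over k folds; fit and score are supplied by the caller."""
    scores = []
    for train, test in k_fold_indices(len(labels), k):
        model = fit([features[i] for i in train], [labels[i] for i in train])
        scores.append(score(model,
                            [features[i] for i in test],
                            [labels[i] for i in test]))
    return sum(scores) / len(scores)
```

Because every sample is evaluated exactly once by a model that never saw it during fitting, the averaged score is a less optimistic estimate than the training accuracy.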
Funding: The National Natural Science Foundation of China (Nos. 42377164, 41972280 and 42272326); the National Natural Science Outstanding Youth Foundation of China (No. 52222905); the Natural Science Foundation of Jiangxi Province, China (Nos. 20232BAB204091 and 20232BAB204077).
Abstract: The landslide inventory is an indispensable output variable in landslide susceptibility prediction (LSP) modelling. However, neither the influence of an incomplete landslide inventory on LSP nor the way the resulting errors propagate through the model has been explored. Taking Xunwu County, China, as an example, the existing landslide inventory is first obtained and assumed to contain all landslide samples under ideal conditions; different conditions of missing samples are then simulated by random sampling. These include conditions in which samples across the whole study area are missing at random in proportions of 10%, 20%, 30%, 40% and 50%, as well as a condition in which the missing samples are aggregated in the south of Xunwu County. Then, five machine learning models, including Random Forest (RF) and Support Vector Machine (SVM), are used to perform LSP. Finally, the LSP results are evaluated to analyze the LSP uncertainties under the various conditions. In addition, this study applies several interpretability methods for machine learning models to explore how the decision basis of the RF model changes under these conditions. The results show that (1) landslide samples missing at random in certain proportions (10%–50%) may affect the LSP results in local areas; (2) aggregated missing samples may cause significant biases in LSP, particularly in the areas where the samples are missing; and (3) when 50% of the landslide samples are missing, whether randomly or in aggregation, the decision basis of the RF model changes mainly in two respects: first, the importance ranking of the environmental factors differs slightly; second, for LSP modelling in the same test grid unit, the weights of individual model factors may vary drastically.
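The two kinds of missing-sample conditions described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: landslide samples are represented as hypothetical `(x, y)` coordinate tuples, random missingness removes a fixed proportion uniformly at random, and aggregated missingness removes every sample inside a caller-defined region. The abstract does not specify how the southern region or the sampling procedure was actually defined.

```python
import random


def drop_randomly(samples, proportion, seed=0):
    """Remove the given proportion of samples uniformly at random."""
    rng = random.Random(seed)
    n_drop = round(len(samples) * proportion)
    dropped = set(rng.sample(range(len(samples)), n_drop))
    return [s for i, s in enumerate(samples) if i not in dropped]


def drop_aggregated(samples, is_in_region):
    """Remove every sample that falls inside one region (spatially aggregated loss)."""
    return [s for s in samples if not is_in_region(s)]


# Hypothetical inventory: 100 points on a 10 x 10 grid of (x, y) coordinates.
inventory = [(x, y) for x in range(10) for y in range(10)]

# Random missingness at the proportions named in the abstract.
random_cases = {p: drop_randomly(inventory, p) for p in (0.1, 0.2, 0.3, 0.4, 0.5)}

# Aggregated missingness: drop everything with y < 3 (an assumed "south").
south_missing = drop_aggregated(inventory, lambda point: point[1] < 3)
```

Each reduced inventory would then be used to train the susceptibility models, so that their outputs can be compared with those obtained from the full inventory.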