Abstract: Spatio-temporal models are valuable tools for disease mapping and for understanding the geographical distribution and temporal dynamics of diseases. Such models have proven empirically to be very complex, and this complexity has led many to oversimplify by modelling the spatial and temporal dependencies independently. Unlike common practice, this study formulated a new spatio-temporal model in a Bayesian hierarchical framework that accounts for spatial and temporal dependencies jointly. The spatial and temporal dependencies were dynamically modelled via the Matérn exponential covariance function. The temporal aspect was captured by the parameters of the exponential with a first-order autoregressive structure. Inferences about the parameters were obtained via Markov Chain Monte Carlo (MCMC) techniques, and the spatio-temporal maps were obtained by mapping stable posterior means for each specific location and time from the best model that includes the significant risk factors. The model was fitted to both simulated data and Kenya meningitis incidence data from 2013 to 2019, along with two covariates: Gross County Product (GCP) and average rainfall. The study found that both average rainfall and GCP had a significant positive association with meningitis occurrence. Regarding geographical distribution, the spatio-temporal maps showed that meningitis is not evenly distributed across the country: some counties reported a high number of cases compared with other counties.
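As a rough illustration of the ingredients named in this abstract, the sketch below combines an exponential spatial covariance (the Matérn family with smoothness 0.5) with a first-order autoregressive correlation in time. It is a simplified separable construction, not the authors' model, in which the autoregressive structure is placed on the covariance parameters. The function names and the parameters sigma2, phi and rho are assumptions made for this example.

```python
import math


def exponential_cov(distance, sigma2=1.0, phi=1.0):
    """Exponential covariance: sigma2 * exp(-distance / phi)."""
    return sigma2 * math.exp(-distance / phi)


def ar1_corr(lag, rho=0.5):
    """Correlation of a stationary AR(1) process at an integer time lag."""
    return rho ** abs(lag)


def space_time_cov(distance, lag, sigma2=1.0, phi=1.0, rho=0.5):
    """Separable space-time covariance: spatial term times temporal term.

    Assumption for illustration only; the paper's joint model is richer.
    """
    return exponential_cov(distance, sigma2, phi) * ar1_corr(lag, rho)


if __name__ == "__main__":
    # Covariance between two counties 1 distance unit apart, 2 years apart.
    print(space_time_cov(distance=1.0, lag=2))
```

In this simplified form, covariance decays with both distance and time lag, and the two effects multiply rather than interact.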
Funding: Supported by the National Science Foundation of China (No. 62176055) and the China University S&T Innovation Plan Guided by the Ministry of Education.
Abstract: In multi-dimensional classification (MDC), the semantics of objects are characterized by multiple class spaces from different dimensions. Most MDC approaches try to explicitly model the dependencies among class spaces in the output space. In contrast, the recently proposed feature augmentation strategy, which manipulates the feature space, has also been shown to be an effective solution for MDC. However, existing feature augmentation approaches focus only on designing holistic augmented features to be appended to the original features, while better generalization performance could be achieved by exploiting multiple kinds of augmented features. In this paper, we propose the selective feature augmentation strategy, which focuses on synergizing multiple kinds of augmented features. Specifically, by assuming that only part of the augmented features is pertinent and useful for each dimension's model induction, we derive a classification model that can fully utilize the original features while conducting feature selection on the augmented features. To validate the effectiveness of the proposed strategy, we generate three kinds of simple augmented features based on standard kNN, weighted kNN, and maximum margin techniques, respectively. Comparative studies show that the proposed strategy achieves superior performance against both state-of-the-art MDC approaches and its degenerate versions that use only one kind of augmented features.
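One plausible reading of the "standard kNN" augmented features is sketched below: for each class dimension, count how often each class label occurs among the k nearest training neighbours and append those counts to the original feature vector. This is an assumption for illustration; the abstract does not specify the exact construction, and the function name and data layout are hypothetical.

```python
import math
from collections import Counter


def knn_augmented_features(X_train, Y_train, x, k=3):
    """Return x extended with kNN label counts for every class dimension.

    X_train: list of feature vectors.
    Y_train: list of label tuples, one label per class dimension.
    x: the feature vector to augment.
    """
    # Indices of the k training points closest to x (Euclidean distance).
    order = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x))
    neighbours = order[:k]

    augmented = list(x)
    n_dims = len(Y_train[0])
    for d in range(n_dims):
        # Class labels seen in dimension d, in a fixed sorted order.
        classes = sorted({y[d] for y in Y_train})
        counts = Counter(Y_train[i][d] for i in neighbours)
        augmented.extend(counts[c] for c in classes)
    return augmented


if __name__ == "__main__":
    X = [[0.0], [0.1], [0.2], [5.0]]
    Y = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
    print(knn_augmented_features(X, Y, [0.05], k=3))
```

A selective strategy in the spirit of the abstract would then apply feature selection to the appended counts only, leaving the original features untouched.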
Funding: National Natural Science Foundation of China, No. 42130712.
Abstract: The semiconductor industry typifies the international division of labor and exhibits significant structural differences in global trade in key product segments. The evolution of cross-border trade flows and dependency relationships, as well as the trade organization patterns of manufactured products and of the equipment and materials used for manufacturing, are investigated by constructing a global semiconductor trade relationship matrix and using the Gini coefficient and a trade dependency index. It was found that: (1) global semiconductor trade is highly spatially unbalanced, with materials and equipment trade in particular highly concentrated in a few countries on both the supply and demand sides; (2) China has replaced the US as the largest global semiconductor trade player and has shaped the regionalized system of manufactured goods and materials trade with East and Southeast Asian economies, but its equipment trade is highly dependent on Europe and the US; (3) the semiconductor production model has promoted the regionalization of East and Southeast Asia in the trade of manufactured products and materials, while developed economies such as the US, the EU, Japan, and South Korea have maintained their monopolistic advantage in the trade of semiconductor equipment by building exclusive innovation networks and establishing trade barriers. The monopolistic nature of the semiconductor equipment trade and the regionalization of manufactured goods and materials trade characterize global semiconductor trade and are likely to be further strengthened in the future.
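For readers unfamiliar with the Gini coefficient used here as a concentration measure, the following minimal sketch computes it for a list of non-negative values, such as each country's export value in one product segment. It uses the standard rank-weighted formula; how the paper applies it to the trade matrix is not stated in the abstract, and the sample figures are made up.

```python
def gini(values):
    """Gini coefficient of non-negative values.

    Returns 0.0 for a perfectly even distribution and approaches 1.0 as
    the total becomes concentrated in a single entry.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form, equivalent to the mean absolute difference
    # definition divided by twice the mean.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


if __name__ == "__main__":
    # Hypothetical export values for four countries.
    print(gini([1, 1, 1, 1]))  # evenly spread trade
    print(gini([0, 0, 0, 1]))  # trade concentrated in one country
```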
Abstract: Effort estimation plays a crucial role in software development projects, aiding in resource allocation, project planning, and risk management. Traditional estimation techniques often struggle to provide accurate estimates due to the complex nature of software projects. In recent years, machine learning approaches have shown promise in improving the accuracy of effort estimation models. This study proposes a hybrid model that combines Long Short-Term Memory (LSTM) and Random Forest (RF) algorithms to enhance software effort estimation. The proposed hybrid model takes advantage of the strengths of both LSTM and RF algorithms. To evaluate the performance of the hybrid model, an extensive set of software development projects is used as the experimental dataset. The experimental results demonstrate that the proposed hybrid model outperforms traditional estimation techniques in terms of accuracy and reliability. The integration of LSTM and RF enables the model to efficiently capture temporal dependencies and non-linear interactions in the software development data. The hybrid model enhances estimation accuracy, enabling project managers and stakeholders to make more precise predictions of the effort needed for upcoming software projects.
Funding: Supported in part by the National Natural Science Foundation of China (62373348), the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2021D01D05), the Tianshan Talent Training Program (2023TSYCLJ0021), and the Pioneer Hundred Talents Program of Chinese Academy of Sciences.
Abstract: N6-methyladenosine (m6A) is an important RNA methylation modification involved in regulating diverse biological processes across multiple species. Hence, the identification of m6A modification sites provides valuable insight into the biological mechanisms of complex diseases at the post-transcriptional level. Although a variety of identification algorithms have been proposed recently, most of them capture the features of m6A modification sites by focusing on the sequential dependencies of nucleotides at different positions in RNA sequences, while ignoring the structural dependencies of nucleotides in their three-dimensional structures. To overcome this issue, we propose a cross-species end-to-end deep learning model, namely CR-NSSD, which conducts a cross-domain representation learning process integrating nucleotide structural and sequential dependencies for RNA m6A site identification. Specifically, CR-NSSD first obtains the pre-coded representations of RNA sequences by incorporating the position information into single-nucleotide states with chaos game representation theory. It then constructs a cross-domain reconstruction encoder to learn the sequential and structural dependencies between nucleotides. By minimizing the reconstruction and binary cross-entropy losses, CR-NSSD is trained to complete the task of m6A site identification. Extensive experiments have demonstrated the promising performance of CR-NSSD by comparing it with several state-of-the-art m6A identification algorithms. Moreover, the results of cross-species prediction indicate that the integration of sequential and structural dependencies allows CR-NSSD to capture general features of m6A modification sites among different species, thus improving the accuracy of cross-species identification.
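The chaos game representation mentioned above can be illustrated with the classic construction: each nucleotide is assigned a corner of the unit square, and the running point moves halfway towards the corner of each successive base, so every point encodes both the current nucleotide and its position in the sequence. This is a minimal sketch of the general technique, not the CR-NSSD pre-coding itself; the corner assignment is a convention chosen for this example.

```python
# Corner assignment is an assumption for this sketch; other layouts exist.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "U": (1.0, 0.0)}


def chaos_game_representation(sequence):
    """Return the CGR point reached after each nucleotide of an RNA sequence."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in sequence.upper():
        cx, cy = CORNERS[base]
        # Move halfway from the current point towards the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


if __name__ == "__main__":
    for base, point in zip("GGACU", chaos_game_representation("GGACU")):
        print(base, point)
```

Because each step halves the distance to a corner, sequences sharing a suffix land close together, which is what lets a point carry positional context.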
Funding: Supported by the National Natural Science Foundation of China (Grant No. 72371251), the National Science Foundation for Distinguished Young Scholars of Hunan Province (Grant No. 2024JJ2080), the Excellent Youth Foundation of Hunan Education Department (Grant No. 21B0015), and the State Key Laboratory of Rail Traffic Control and Safety of Beijing Jiaotong University, China (Grant No. RCS2022K004).
Abstract: Accurate demand forecasting for online ride-hailing contributes to balancing traffic supply and demand and to improving the service level of ride-hailing platforms. In contrast to previous studies, which have primarily focused on the inflow or outflow demands of each zone, this study proposes a conditional generative adversarial network with a Wasserstein divergence objective (CWGAN-div) to predict ride-hailing origin-destination (OD) demand matrices. Residual blocks and refined loss functions help to enhance the stability of model training. Interpretable conditional information is employed to capture external spatiotemporal dependencies and guide the model towards generating more precise results. Empirical analysis using ride-hailing data from Manhattan, New York City, demonstrates that the proposed CWGAN-div model can effectively predict the network-wide OD matrix and exhibits strong convergence performance. Comparative experiments also show that CWGAN-div outperforms other benchmarking methods. Consequently, the proposed model displays potential for network-wide ride-hailing OD demand prediction.
Abstract: Thucydides asserts that the occupation of Decelea by the Spartans in 413 BC made the grain supply for Athens costly by forcing the transport from land onto the sea. This calls into question the well-established consensus that sea transport was far cheaper than land transport. This paper contends that the cost of protecting supply lines, specifically the expenses associated with the warships which escorted the supply ships, rendered the grain transported on the new route exceptionally costly. The paper discusses the benefits and drawbacks of a maritime economy, including transaction costs, trade dependencies, and the capabilities of warships and supply ships.
Abstract: Matching dependencies (MDs) are used to declaratively specify the identification (or matching) of certain attribute values in pairs of database tuples when some similarity conditions on other values are satisfied. Their enforcement can be seen as a natural generalization of entity resolution. In what we call the pure case of MD enforcement, an arbitrary value from the underlying data domain can be used for the value in common that is used for a matching. However, the overall number of changes of attribute values is expected to be kept to a minimum. We investigate this case in terms of semantics and the properties of data cleaning through the enforcement of MDs. We characterize the intended clean instances, and also the clean answers to queries, as those that are invariant under the cleaning process. The complexity of computing clean instances and clean query answering is investigated. Tractable and intractable cases depending on the MDs are identified and characterized.
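To make the idea of enforcing a matching dependency concrete, the sketch below applies a single hypothetical MD, "if two tuples have similar names, their phone values must be identified", in one pass over a small relation. The similarity test (difflib with a 0.8 threshold), the attribute names, and the choice of the first tuple's value as the common value are all assumptions for illustration; the paper's semantics of clean instances and minimal change are more general, and full enforcement may require repeating passes until no MD is violated.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.8):
    """Similarity test for the MD premise (the threshold is an assumption)."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def enforce_md(tuples, premise_attr, match_attr):
    """One enforcement pass of: similar premise_attr => identify match_attr.

    Returns copies of the tuples and the number of attribute values changed.
    """
    result = [dict(t) for t in tuples]
    changes = 0
    for i in range(len(result)):
        for j in range(i + 1, len(result)):
            t1, t2 = result[i], result[j]
            violated = (similar(t1[premise_attr], t2[premise_attr])
                        and t1[match_attr] != t2[match_attr])
            if violated:
                # Pure case: any domain value may serve as the common value.
                # Reusing t1's value changes only one of the two tuples.
                t2[match_attr] = t1[match_attr]
                changes += 1
    return result, changes


if __name__ == "__main__":
    people = [
        {"name": "John Smith", "phone": "555-0101"},
        {"name": "Jon Smith", "phone": "555-0199"},
        {"name": "Alice Brown", "phone": "555-0300"},
    ]
    cleaned, n_changes = enforce_md(people, "name", "phone")
    print(cleaned, n_changes)
```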