We study a new trust region affine scaling method for general bound constrained optimization problems. At each iteration, we compute two trial steps. The first lies along a direction obtained by solving an appropriate quadratic model in an ellipsoidal region. This region is defined by an affine scaling technique and depends on both the distances from the current iterate to the boundaries and the trust region radius. To ensure convergence and to keep the iterates from being trapped near nonstationary points, an auxiliary step is defined along a newly defined approximate projected gradient. The trial step used to generate the next iterate is whichever of these two steps achieves the larger reduction of the quadratic model. With this choice, we prove that the iterates generated by the new algorithm are not bounded away from stationary points. Assuming in addition that the second-order sufficient condition holds at some nondegenerate stationary point, we prove Q-linear convergence of the objective function values. Preliminary numerical results for bound constrained problems from the CUTEr collection are also reported.
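The step-selection rule in this abstract can be illustrated with a minimal sketch. It assumes the quadratic model m(s) = g·s + ½ s·B·s and uses the classical projected gradient onto the box as a stand-in for the paper's newly defined approximate projected gradient, which the abstract does not specify. The function names and the sample data are hypothetical.

```python
import numpy as np

def model_reduction(g, B, s):
    """Predicted decrease m(0) - m(s) for the model m(s) = g.s + 0.5 * s.B.s."""
    return -(g @ s + 0.5 * s @ B @ s)

def projected_gradient_step(x, g, lower, upper):
    """Classical projected gradient step P(x - g) - x onto the box [lower, upper].
    Assumption: a stand-in for the paper's own approximate projected gradient."""
    return np.clip(x - g, lower, upper) - x

def pick_trial_step(g, B, step_a, step_b):
    """Return whichever candidate step reduces the quadratic model more."""
    if model_reduction(g, B, step_a) >= model_reduction(g, B, step_b):
        return step_a
    return step_b

# Hypothetical data: two variables constrained to the unit box.
x = np.array([0.0, 0.0])
g = np.array([1.0, -2.0])          # gradient at x
B = np.eye(2)                      # model Hessian
lower, upper = np.zeros(2), np.ones(2)

auxiliary = projected_gradient_step(x, g, lower, upper)
other = np.array([0.0, 0.5])       # placeholder for the ellipsoidal-region step
chosen = pick_trial_step(g, B, auxiliary, other)
```

In this toy case the auxiliary step moves only along the coordinate that is free to decrease the objective, and it is selected because its predicted reduction is larger.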
This article introduces a novel low rank approximation (LRA)-based model for detecting functional regions, using about 15 million social media check-in records collected over a year-long period in Shanghai, China. We identified a series of latent structures, named latent spatio-temporal activity structures. Interpreting these structures reveals a series of underlying associations between the spatial and temporal activity patterns. Moreover, we can not only reproduce the observed data with a lower-dimensional representation, but also project the spatial and temporal activity patterns into the same coordinate system. With the K-means clustering algorithm, we obtain five significant types of clusters that are directly annotated with a combination of temporal activities, providing a clear picture of the correlation between groups of regions and different activities at different times of day. Besides the commercial and transportation dominant areas, we also detected two kinds of residential areas: developed residential areas and developing residential areas. We further interpret the spatial distribution of these clusters using urban form analytics. The results are highly consistent with government planning for the same period, indicating that our model can infer functional regions from social media check-in data and can benefit a wide range of fields, such as urban planning, public services, and location-based recommender systems.
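A toy sketch of the pipeline described in this abstract follows. It assumes a truncated SVD as the low rank approximation (the abstract does not say which factorization the authors use) and a plain Lloyd's K-means with a deterministic farthest-point initialisation. The region-by-time matrix is invented for illustration, and two clusters are used instead of the five reported.

```python
import numpy as np

def low_rank_embed(matrix, rank):
    """Truncated SVD: region coordinates, time coordinates, rank-r reconstruction."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    regions = u[:, :rank] * s[:rank]   # one row per region
    times = vt[:rank, :].T             # one row per time slot, same coordinate system
    return regions, times, regions @ times.T

def kmeans(points, k, n_iter=20):
    """Lloyd's algorithm with farthest-point initialisation (deterministic)."""
    centers = [points[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = points[labels == c].mean(axis=0)
    return labels

# Hypothetical check-in counts: 6 regions x 4 time slots.
day = np.array([5.0, 4.0, 0.0, 0.0])     # daytime-dominated profile
night = np.array([0.0, 0.0, 3.0, 6.0])   # night-dominated profile
checkins = np.array([1.0 * day, 1.1 * day, 0.9 * day,
                     1.0 * night, 1.1 * night, 0.9 * night])

regions, times, recon = low_rank_embed(checkins, rank=2)
labels = kmeans(regions, k=2)
```

Because the toy matrix has rank two, the rank-2 reconstruction reproduces it, and the clustering separates the daytime regions from the night-time ones.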
The species accumulation curve, or collector's curve, of a population gives the expected number of observed species or distinct classes as a function of sampling effort. Species accumulation curves allow researchers to assess and compare diversity across populations or to evaluate the benefits of additional sampling. Traditional applications have focused on ecological populations, but emerging large-scale applications, for example in DNA sequencing, are orders of magnitude larger and present new challenges. We developed a method to estimate accumulation curves for predicting the complexity of DNA sequencing libraries. This method uses rational function approximations to a classical nonparametric empirical Bayes estimator due to Good and Toulmin [Biometrika, 1956, 43, 45-63]. Here we demonstrate how the same approach can be highly effective in other large-scale applications involving biological data sets. These include estimating microbial species richness, immune repertoire size, and k-mer diversity for genome assembly applications. We show how the method can be modified to address populations containing an effectively infinite number of species, where saturation cannot practically be attained. We also introduce a flexible suite of tools, implemented as an R package, that makes these methods broadly accessible.
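For orientation, the classical Good-Toulmin estimator mentioned in this abstract can be sketched directly. If n_j is the number of species observed exactly j times, the expected number of new species in an additional sample of relative size t is the alternating series Σ_{j≥1} (−1)^{j+1} t^j n_j. The sketch below evaluates only this raw series, which is generally reliable only for t ≤ 1. The rational function approximation the authors apply on top of it is not reproduced here, and the function names and sample are hypothetical.

```python
from collections import Counter

def count_frequencies(observations):
    """Return {j: n_j}, where n_j is the number of species seen exactly j times."""
    species_counts = Counter(observations)
    return Counter(species_counts.values())

def good_toulmin(freqs, t):
    """Expected number of new species when the additional sampling effort
    is t times the original: sum over j of (-1)^(j+1) * t^j * n_j."""
    return sum((-1) ** (j + 1) * t ** j * n_j for j, n_j in freqs.items())

# Hypothetical sample: species "a" seen 3 times, "b" twice, "c", "d", "e" once each.
sample = ["a", "a", "a", "b", "b", "c", "d", "e"]
freqs = count_frequencies(sample)      # n_1 = 3, n_2 = 1, n_3 = 1
extra = good_toulmin(freqs, t=0.5)     # 3*0.5 - 1*0.25 + 1*0.125 = 1.375
```

For t greater than 1 the alternating powers of t make the raw series unstable, which is the kind of difficulty a rational function approximation is intended to address.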
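First, using rough approximation operators based on the boundary region, we define the n-order boundary set and introduce n-order rough approximation operators, thereby constructing a stepwise approximation method for rough set theory. Then, through examples and proofs, we show that, whether under binary relations or in a covering environment, there always exists a positive integer n such that, for any object set, one of two cases holds. Either the n-order upper and lower approximations coincide exactly with the object set, so that the set is exact in this sense, or the n-order upper and lower approximations approach some fixed object set. In other words, n-order rough sets always drive an object set toward itself or toward some fixed set.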
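The abstract does not give the definitions of the n-order boundary set or the n-order operators, so the sketch below covers only the classical (Pawlak) lower approximation, upper approximation, and boundary region under an equivalence relation, which are the standard notions such constructions start from. The partition and target set are hypothetical.

```python
def lower_approximation(partition, target):
    """Union of the equivalence classes fully contained in the target set."""
    return set().union(*[block for block in partition if block <= target])

def upper_approximation(partition, target):
    """Union of the equivalence classes that intersect the target set."""
    return set().union(*[block for block in partition if block & target])

def boundary_region(partition, target):
    """Objects that can be neither confirmed nor excluded as members of the target."""
    return upper_approximation(partition, target) - lower_approximation(partition, target)

# Hypothetical universe {1, ..., 5} partitioned into equivalence classes.
partition = [{1, 2}, {3, 4}, {5}]
target = {1, 2, 3}

lower = lower_approximation(partition, target)   # {1, 2}
upper = upper_approximation(partition, target)   # {1, 2, 3, 4}
boundary = boundary_region(partition, target)    # {3, 4}
```

A set is exact in the classical sense when its boundary region is empty, that is, when the lower and upper approximations both equal the set.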
Funding (trust region affine scaling paper): Supported by NSFC (Grant Nos. 10831006 and 11021101) and CAS (Grant No. kjcx-yw-s7).
Funding (functional regions paper): Supported by the Open Research Fund Program of Shenzhen Key Laboratory of Spatial Smart Sensing and Services; the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry (grant number 50-20150618); and the National Natural Science Foundation of China (grant numbers 41001220, 51378512, 41571397, and 41501442). This work was also supported by the Special Program for Applied Research on Super Computation of the NSFC-Guangdong Joint Fund.