Abstract: A multifrontal code is introduced for the efficient solution of the linear systems of equations arising from structural analysis. The factorization phase is reduced to a series of interleaved element assembly and dense matrix operations, for which BLAS3 kernels are used. A similar approach is applied to the forward and back substitution phases for the efficient solution of structures with multiple load conditions. The program performs all assembly and solution steps in parallel. Examples demonstrate the code's performance on single-core and dual-core computers.
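The interleaving of element assembly and dense elimination can be sketched on a toy problem. The example below is an illustrative assumption, not the code described in the abstract: a chain of three degrees of freedom joined by unit springs and grounded at both ends, with hypothetical helper names (`spring`, `leaf_front`, `solve_chain`). Two leaf fronts each eliminate one interior dof and pass a 1x1 update (Schur complement) to a root front at the separator dof. In a real multifrontal solver the fronts are larger dense blocks and the elimination step is where BLAS3 kernels would be called.

```python
def spring(k):
    """2x2 stiffness matrix of a spring linking two dofs."""
    return [[k, -k], [-k, k]]


def leaf_front(k_ground, k_link, load):
    """Assemble one leaf front and eliminate its interior dof.

    Local ordering is [interior dof, separator dof].
    """
    front = spring(k_link)        # element between interior and separator dof
    front[0][0] += k_ground       # element tying the interior dof to ground
    rhs = [load, 0.0]
    pivot = front[0][0]           # interior dof is now fully summed
    factor = front[1][0] / pivot
    update = front[1][1] - factor * front[0][1]   # Schur complement (1x1 here)
    rhs_update = rhs[1] - factor * rhs[0]
    return {"pivot": pivot, "coupling": front[0][1], "rhs": rhs[0],
            "update": update, "rhs_update": rhs_update}


def solve_chain(loads):
    """Solve the 3-dof chain; loads = [f0, f1, f2], dof 1 is the separator."""
    left = leaf_front(1.0, 1.0, loads[0])
    right = leaf_front(1.0, 1.0, loads[2])
    # Root front: add the two update contributions, then factor.
    root = left["update"] + right["update"]
    root_rhs = loads[1] + left["rhs_update"] + right["rhs_update"]
    u1 = root_rhs / root
    # Back substitution down the tree.
    u0 = (left["rhs"] - left["coupling"] * u1) / left["pivot"]
    u2 = (right["rhs"] - right["coupling"] * u1) / right["pivot"]
    return [u0, u1, u2]


print(solve_chain([0.0, 1.0, 0.0]))
```

The two leaf fronts are independent of each other, which is what allows them to be processed in parallel before the root is assembled. For several load conditions, `rhs` would become a block of columns and the same pivots would be reused for each column.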
Funding: Supported by the National Science Foundation of China (No. 60801039)
Abstract: In this paper, a set of closed-form formulas for the vector Finite Element Method (FEM) is presented for analyzing three-dimensional electromagnetic problems, based on the Gaussian quadrature integration scheme. For open-region problems, the first-order Absorbing Boundary Condition (ABC) is used as the truncation boundary condition, and the resulting equation is derived in closed form. Based on these formulas, a hybrid of the Expanded Cholesky Method (ECM) and the MultiFrontal (MF) algorithm is applied to solve the finite element equations. Using the closed-form solution, the electromagnetic field of three-dimensional targets can be studied concisely and accurately. Simulation results show that the presented formulas are successful and concise and can be easily used to analyze three-dimensional electromagnetic problems.
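As a reminder of why Gaussian quadrature can yield exact element integrals, here is a minimal sketch. It rests on simplifying assumptions: scalar linear shape functions on a 1D element stand in for the vector basis functions of the paper, and the names `GAUSS_2`, `gauss_integrate`, `n1`, and `n2` are hypothetical. A two-point Gauss-Legendre rule integrates polynomials up to degree three exactly, so the product of two linear shape functions is integrated without quadrature error.

```python
import math

# Two-point Gauss-Legendre rule on [-1, 1]: nodes +-1/sqrt(3), weights 1.
GAUSS_2 = [(-1.0 / math.sqrt(3.0), 1.0), (1.0 / math.sqrt(3.0), 1.0)]


def gauss_integrate(f, a, b, rule=GAUSS_2):
    """Integrate f over [a, b] by mapping the reference rule from [-1, 1]."""
    half = 0.5 * (b - a)
    mid = 0.5 * (a + b)
    return half * sum(w * f(mid + half * x) for x, w in rule)


def n1(x):
    """Linear shape function equal to 1 at x = 0 and 0 at x = 1."""
    return 1.0 - x


def n2(x):
    """Linear shape function equal to 0 at x = 0 and 1 at x = 1."""
    return x


# Off-diagonal mass-matrix entry on the unit element; the exact value is 1/6.
mass_12 = gauss_integrate(lambda x: n1(x) * n2(x), 0.0, 1.0)
print(mass_12)
```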
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61202098, 61033009, 61170309, 91130024, and 11171039) and the China Tianyuan Mathematics Youth Fund (Grant No. 11226337)
Abstract: Based on the two-dimensional three-temperature (2D3T) radiation diffusion equations and their discrete system, and using the block diagonal structure of the three-temperature matrix, the reordering and symbolic decomposition parts of the RSMF method are replaced with corresponding block operations to improve solution efficiency. We call this block-form method the block RSMF (BRSMF) method. The BRSMF method makes the reordering and symbolic decomposition more efficient while keeping the cost of numerical factorization from increasing and preserving the precision of the solution. A theoretical analysis of the computational complexity shows that the BRSMF method is more efficient than the original RSMF method, and numerical experiments confirm this.
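One way to picture the block operations is to reorder mesh cells rather than individual unknowns. The sketch below is an assumption-laden illustration and not the BRSMF algorithm itself. It assumes three unknowns per cell, numbered `3 * cell + t`, and uses a simple sort by degree as a placeholder for whatever reordering RSMF actually applies. The function names are hypothetical.

```python
def cell_graph(pattern, block_size=3):
    """Collapse an unknown-level sparsity pattern to a cell-level adjacency map."""
    adj = {}
    for i, j in pattern:
        ci, cj = i // block_size, j // block_size
        adj.setdefault(ci, set())
        adj.setdefault(cj, set())
        if ci != cj:
            adj[ci].add(cj)
            adj[cj].add(ci)
    return adj


def order_cells_by_degree(adj):
    """Placeholder ordering: cells sorted by static degree, ties by index."""
    return sorted(adj, key=lambda c: (len(adj[c]), c))


def expand_block_permutation(cell_perm, block_size=3):
    """Turn a cell permutation into a permutation of the individual unknowns."""
    return [block_size * c + t for c in cell_perm for t in range(block_size)]


# Three cells in a chain 0-1-2: full 3x3 coupling inside each cell,
# temperature-wise coupling between neighbouring cells.
pattern = [(3 * c + s, 3 * c + t) for c in range(3) for s in range(3) for t in range(3)]
for c, d in [(0, 1), (1, 2)]:
    pattern += [(3 * c + t, 3 * d + t) for t in range(3)]
    pattern += [(3 * d + t, 3 * c + t) for t in range(3)]

cells = order_cells_by_degree(cell_graph(pattern))
perm = expand_block_permutation(cells)
print(cells, perm)
```

The ordering step here works on a graph with one third as many vertices as the unknown-level graph, which is the kind of saving the block formulation aims for in the reordering and symbolic phases.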
Abstract: This paper is concerned with the fast iterative solution of linear systems arising from finite difference discretizations in electromagnetics. The sweeping preconditioner with moving perfectly matched layers previously developed for the Helmholtz equation is adapted for the popular Yee grid scheme for wave propagation in inhomogeneous, anisotropic media. Preliminary numerical results are presented for typical examples.
Abstract: In this paper, an absorbing Fictitious Boundary Condition (FBC) is presented to generate an iterative Domain Decomposition Method (DDM) for analyzing waveguide problems. A relaxed algorithm is introduced to improve the iterative convergence, and the matrix equations are solved using the multifrontal algorithm. The resulting CPU time is greatly reduced. Finally, a number of numerical examples are given to illustrate the method's accuracy and efficiency.
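The effect of relaxation on an interface iteration can be shown with a generic scalar model. This is only a sketch of under-relaxation in general. The abstract does not specify the paper's relaxed algorithm, and `interface_update` is a made-up linear map standing in for one exchange of interface data between subdomains.

```python
def relaxed_iteration(update, x0, omega, steps):
    """Iterate x <- (1 - omega) * x + omega * update(x)."""
    x = x0
    for _ in range(steps):
        x = (1.0 - omega) * x + omega * update(x)
    return x


def interface_update(x):
    """Hypothetical interface map with fixed point x = 2 and slope -1.5."""
    return -1.5 * x + 5.0


plain = relaxed_iteration(interface_update, 0.0, 1.0, 30)    # omega = 1: no relaxation
relaxed = relaxed_iteration(interface_update, 0.0, 0.5, 30)  # omega = 0.5
print(plain, relaxed)
```

Without relaxation the error is multiplied by -1.5 at each step and grows. With `omega = 0.5` the error factor becomes 0.5 + 0.5 * (-1.5) = -0.25, so the iteration converges to the fixed point.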
Abstract: This paper describes a method of calculating the Schur complement of a sparse positive definite matrix A. The main idea is to represent matrix A in the form of an elimination tree using a reordering algorithm such as METIS and to place the columns/rows for which the Schur complement is needed in the top node of the elimination tree. Any problem with a degenerate part of the initial matrix can be resolved with the help of iterative refinement. The proposed approach is close to the “multifrontal” one implemented by Ian Duff and others in the 1980s. The Schur complement computations described in this paper are available in the Intel® Math Kernel Library (Intel® MKL). We present the algorithm for Schur complement computations, experiments that demonstrate a negligible increase in the number of elements in the factored matrix, and a comparison with existing alternatives.
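For reference, the quantity being computed can be written out as follows. The block names A_{11}, A_{12}, A_{22}, L_{11}, and L_{21} are notation assumed here, with the rows and columns for which the Schur complement is requested ordered last, which corresponds to placing them in the top node of the elimination tree.

```latex
A = \begin{pmatrix} A_{11} & A_{12} \\ A_{12}^{T} & A_{22} \end{pmatrix},
\qquad
S = A_{22} - A_{12}^{T} A_{11}^{-1} A_{12}.

% With the Cholesky factor A_{11} = L_{11} L_{11}^{T} and L_{21} = A_{12}^{T} L_{11}^{-T}:
S = A_{22} - L_{21} L_{21}^{T}.
```

In this form S is what remains in the trailing block after a partial Cholesky factorization that stops before the last block, so no explicit inverse of A_{11} is needed.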
Abstract: The paper describes an efficient direct method for solving the equation Ax = b, where A is a sparse matrix, on the Intel® Xeon Phi™ coprocessor. The main challenge on such a system is to engage all available threads (about 240) and to reduce OpenMP synchronization overhead, which is very expensive for hundreds of threads. The method decomposes A into a product of lower-triangular, diagonal, and upper-triangular matrices, followed by solves of the resulting three subsystems. The main idea is based on the hybrid parallel algorithm used in the Intel® Math Kernel Library Parallel Direct Sparse Solver for Clusters [1]. Our implementation uses a static scheduling algorithm during the factorization step to reduce OpenMP synchronization overhead. To engage all available threads effectively, a three-level parallelization approach is used. Furthermore, we demonstrate that our implementation can perform up to 100 times better on the factorization step and up to 65 times better in overall performance on the 240 threads of the Intel® Xeon Phi™ coprocessor.
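The factor-then-solve structure can be sketched on a small dense matrix. This is a serial illustration under stated assumptions: no pivoting, dense storage, and hypothetical function names `ldu` and `solve_ldu`. It does not reflect the sparse data structures, scheduling, or threading of the implementation described in the abstract.

```python
def ldu(a):
    """Factor a square matrix as L * D * U with unit-diagonal L and U (no pivoting)."""
    n = len(a)
    w = [row[:] for row in a]   # working copy that holds the trailing submatrix
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    u = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for k in range(n):
        d[k] = w[k][k]
        for i in range(k + 1, n):
            l[i][k] = w[i][k] / d[k]
            u[k][i] = w[k][i] / d[k]
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                w[i][j] -= l[i][k] * d[k] * u[k][j]
    return l, d, u


def solve_ldu(l, d, u, b):
    """Solve L y = b, then D z = y, then U x = z."""
    n = len(b)
    y = b[:]
    for i in range(n):                 # forward substitution
        for j in range(i):
            y[i] -= l[i][j] * y[j]
    z = [y[i] / d[i] for i in range(n)]  # diagonal solve
    x = z[:]
    for i in reversed(range(n)):       # back substitution
        for j in range(i + 1, n):
            x[i] -= u[i][j] * x[j]
    return x


A = [[4.0, 2.0, 0.0], [2.0, 5.0, 1.0], [0.0, 3.0, 6.0]]
b = [8.0, 15.0, 24.0]
L, D, U = ldu(A)
x = solve_ldu(L, D, U, b)
print(x)
```

The factorization is done once and is the expensive step. Each right-hand side then costs only the three substitution passes.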