Funding: Supported by the National High-Tech Research & Development Program of China (Grant No. 2006AA01A105)
Abstract: Message Passing Interface (MPI) is the de facto standard for writing parallel scientific applications on distributed-memory systems. Predicting the performance of MPI programs on current or future parallel systems can help find system bottlenecks or optimize programs. To analyze and predict the performance of a large and complex MPI program effectively, an efficient and accurate communication model is needed. A series of communication models have been proposed, such as the LogP model family, which assume that the sending overhead, message transmission, and receiving overhead of a communication do not overlap and that there is a maximum degree of overlap between computation and communication. However, this assumption does not always hold for MPI programs, because the sending or receiving overhead introduced by MPI implementations can decrease the potential overlap for large messages. In this paper, we present a new communication model, named LogGPO, which captures the potential overlap between computation and communication in MPI programs. We design and implement a trace-driven simulator to verify the LogGPO model by predicting the performance of point-to-point communication and of two real applications, CG and Sweep3D. The average prediction errors of the LogGPO model are 2.4% and 2.0% for these two applications respectively, while the average prediction errors of the LogGP model are 38.3% and 9.1% respectively.
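For readers unfamiliar with the LogP family, the non-overlap assumption mentioned in this abstract can be illustrated with the commonly cited LogGP cost of a single k-byte point-to-point message, o + (k - 1)G + L + o, in which the sending overhead, wire time, and receiving overhead are simply added. The following Python sketch is only an illustration of that baseline formula; the class name, function name, and parameter values are hypothetical, and it does not implement LogGPO, whose equations the abstract does not give.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogGPParams:
    """LogGP parameters (all times in the same unit, e.g. microseconds)."""
    L: float  # network latency
    o: float  # CPU overhead per send or per receive
    g: float  # minimum gap between consecutive messages (unused for one message)
    G: float  # gap per byte for long messages (inverse bandwidth)
    P: int    # number of processors


def loggp_message_time(params: LogGPParams, k: int) -> float:
    """End-to-end time of one k-byte message under plain LogGP.

    The three phases are summed with no overlap:
    sender overhead o, then (k - 1) * G + L on the wire, then receiver overhead o.
    """
    if k < 1:
        raise ValueError("message size k must be at least 1 byte")
    return params.o + (k - 1) * params.G + params.L + params.o


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    params = LogGPParams(L=5.0, o=2.0, g=4.0, G=0.5, P=2)
    for size in (1, 5, 1024):
        print(size, loggp_message_time(params, size))
```

Because every term is added, any overhead that an MPI implementation charges for large messages shows up directly in the predicted time, which is the behavior the abstract says LogGPO refines.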
Funding: Project supported by the National Defense Commission of Science and Industry.
Abstract: Parallel processing has become an important way to further increase the computation power of computer systems. It is clearly desirable that parallel computers allow easy, efficient, and flexible use of parallelism. Recently, technological factors have been forcing a convergence towards parallel systems formed by a collection of essentially complete computers connected by a communication network. This kind of network-based
Funding: Projects (60903044, 61170049) supported by the National Natural Science Foundation of China
Abstract: Excessive energy consumption is widely recognized as a critical problem in large-scale parallel computing systems. A LogP-based energy-saving model and a frequency scaling method were proposed to reduce energy consumption analytically and systematically for two other representative barrier algorithms: the tournament barrier and the central counter barrier. Furthermore, energy optimization methods for these two barrier algorithms were implemented on a parallel computing platform. The experimental results validate the effectiveness of the energy optimization methods: energy savings of 67.12% and 70.95% are obtained for the tournament barrier and the central counter barrier respectively on platforms with 2048 processes, with a performance loss of 1.55% to 8.80%. Furthermore, the LogP-based energy-saving analytical model for these two barrier algorithms is highly accurate, as the predicted energy savings are within 9.67% of the results obtained by simulation.