Energy-proportional computing is one of the foremost constraints in the design of next-generation exascale systems. These systems must have a very high FLOP-per-watt ratio to be sustainable, which requires tremendous improvements in the power efficiency of modern computing systems. This paper focuses on the processor, which is still the biggest contributor to power usage, by considering both its core and uncore power subsystems. The uncore comprises those processor functions that are not handled by the core, such as the L3 cache and on-chip interconnect, and contributes significantly to the total system power. Uncore frequency scaling (UFS) has been available to the user since the Intel Haswell processor generation. In this paper, performance and power models are proposed that use both UFS and dynamic voltage and frequency scaling (DVFS) to reduce the energy consumption of parallel applications. These models are then incorporated into a runtime strategy that performs processor frequency scaling during parallel application execution. The strategy can be implemented at the kernel/firmware level, which makes it suitable for improving the energy efficiency of exascale designs. Experiments on a 20-core Haswell-EP machine using the quantum chemistry application GAMESS and the NAS benchmarks resulted in up to 24% energy savings with as little as 2% performance loss.
To improve the power consumption of parallel applications at runtime, modern processors provide frequency scaling and power limiting capabilities. In this work, a runtime strategy is proposed to maximize performance under a given power budget by distributing the available power according to the relative GPU utilization. Time series forecasting methods were used to develop workload prediction models that accurately predict GPU utilization during application execution. Experiments were performed on a DGX-1 multi-GPU computing platform equipped with eight NVIDIA V100 GPUs, used for quantum chemistry calculations in the GAMESS package. For a limited power budget, the proposed strategy may deliver as much as a hundred times better GAMESS performance than that obtained when the power is distributed equally among all the GPUs.
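The idea of distributing a fixed power budget according to relative GPU utilization can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `distribute_power`, the per-GPU limits `min_w` and `max_w`, and the rule of granting every GPU its minimum cap before sharing the remainder are all assumptions made for illustration, and the utilization values would come from a prediction model that is not shown here.

```python
def distribute_power(utilizations, budget_w, min_w, max_w):
    """Split budget_w watts across GPUs in proportion to their utilization.

    Hypothetical helper: each GPU first gets min_w, the remaining budget is
    shared in proportion to utilization, and each cap is clamped to max_w.
    """
    n = len(utilizations)
    if n == 0:
        return []
    if budget_w < n * min_w:
        raise ValueError("budget is below the sum of per-GPU minimum caps")

    spare = budget_w - n * min_w          # power left after the minimum caps
    total = sum(utilizations)
    if total == 0:
        shares = [1.0 / n] * n            # all idle: fall back to an equal split
    else:
        shares = [u / total for u in utilizations]

    # Clamping to max_w may leave part of the budget unused; a fuller
    # version could redistribute that surplus to the unclamped GPUs.
    return [min(max_w, min_w + spare * s) for s in shares]


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the paper.
    caps = distribute_power([0.9, 0.1], budget_w=400, min_w=100, max_w=300)
    print(caps)
```

In this sketch the busier GPU receives the larger cap, whereas an equal split would give both GPUs the same cap regardless of load.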
Energy efficiency and energy-proportional computing have become a central focus in modern supercomputers. These supercomputers should provide high throughput per unit of power to be sustainable in terms of operating cost and failure rates. In this paper, a power-bounded strategy is proposed that maximizes parallel application performance under a given power constraint. The strategy dynamically allocates power among the core, uncore, and memory power domains within a node. Experiments on a 20-core Haswell-EP platform with the real-world parallel application GAMESS demonstrate that the proposed strategy delivers performance within 4% of the best possible performance even when the power budget is reduced by as much as 25% below the minimum budget required for maximum performance.
Various characteristics of mesomorphism can be explained using the intermolecular interactions between a pair of liquid crystalline molecules. The intermolecular interactions have been calculated using the multipole-multicentre expansion method, modified by a second-order perturbation treatment. The multipoles (charge, dipole, etc.) at each atomic centre of the molecule para-butyl-p’-cyano-biphenyl have been calculated using GAMESS, an ab initio program, with the 6-31G* basis set. The stacking, in-plane, and terminal interaction energies explain the liquid crystalline behaviour of the system.