In recent years, the widespread adoption of parallel computing, especially in multi-core processors and high-performance computing environments, has ushered in a new era of efficiency and speed. This trend has been particularly noteworthy in image processing, a field that has seen significant advancements. This parallel computing project explored parallel image processing, with a focus on the grayscale conversion of color images. Our approach integrated OpenMP into our parallelization framework to execute a critical image processing task: grayscale conversion. Using OpenMP, we enhanced the overall performance of the conversion by distributing the workload across multiple threads. The primary objectives of the project were to optimize computation time and improve overall efficiency in the grayscale conversion of color images. Concurrent processing across multiple cores with OpenMP significantly reduced execution times through the effective distribution of tasks among these cores. The speedup values for various image sizes highlighted the efficacy of parallel processing, especially for large images. However, a detailed examination revealed a potential decline in parallel efficiency as the number of cores increased. This underscores the importance of a carefully optimized parallelization strategy that considers factors such as load balancing and minimizing communication overhead. Despite these challenges, the overall scalability and efficiency achieved with parallel image processing underscore OpenMP's effectiveness in accelerating image manipulation tasks.
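As a concrete illustration of the approach this abstract describes, the sketch below parallelizes a per-pixel grayscale conversion with an OpenMP parallel-for loop. It is a minimal example, not the project's actual code; the interleaved 8-bit RGB layout and the Rec. 601 luma weights (0.299, 0.587, 0.114) are assumptions, since the abstract does not specify them.

```cpp
#include <cstddef>
#include <cstdint>

// Convert an interleaved 8-bit RGB image to grayscale in parallel.
// Each thread processes its own range of pixels, so no two threads
// write the same output element and no synchronization is needed.
void rgb_to_gray(const std::uint8_t* rgb, std::uint8_t* gray,
                 std::size_t num_pixels) {
    #pragma omp parallel for schedule(static)
    for (std::ptrdiff_t i = 0; i < static_cast<std::ptrdiff_t>(num_pixels); ++i) {
        const float r = rgb[3 * i + 0];
        const float g = rgb[3 * i + 1];
        const float b = rgb[3 * i + 2];
        // Rec. 601 luma weights -- an assumption; the abstract does not
        // state which weighting the project used.
        gray[i] = static_cast<std::uint8_t>(0.299f * r + 0.587f * g + 0.114f * b);
    }
}
```

Compiled with OpenMP enabled (e.g., -fopenmp), the loop's iterations are divided statically among the available threads, which is the kind of workload distribution the abstract credits for the measured speedups.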
This study embarks on a comprehensive examination of optimization techniques within GPU-based parallel programming models, pivotal for advancing high-performance computing (HPC). Emphasizing the transition of GPUs from graphics-centric processors to versatile computing units, it delves into the nuanced optimization of memory access, thread management, algorithmic design, and data structures. These optimizations are critical for exploiting the parallel processing capabilities of GPUs, addressing both the theoretical frameworks and practical implementations. By integrating advanced strategies such as memory coalescing, dynamic scheduling, and parallel algorithmic transformations, this research aims to significantly elevate computational efficiency and throughput. The findings underscore the potential of optimized GPU programming to revolutionize computational tasks across various domains, highlighting a pathway towards achieving unparalleled processing power and efficiency in HPC environments. The paper not only contributes to the academic discourse on GPU optimization but also provides actionable insights for developers, fostering advancements in computational sciences and technology.
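To make one of the named strategies concrete, the hedged sketch below contrasts a coalesced with a strided global-memory access pattern, written as OpenCL C kernels held in a C++ string. It illustrates the general principle only; the study is a survey and does not tie coalescing to this specific code.

```cpp
#include <string>

// Two illustrative OpenCL C kernels. In copy_coalesced, consecutive
// work-items read consecutive floats, so the hardware can merge the
// loads into a few wide memory transactions. In copy_strided, a stride
// scatters the same reads across many transactions, wasting bandwidth.
const std::string kernel_source = R"CLC(
__kernel void copy_coalesced(__global const float* in,
                             __global float* out) {
    size_t i = get_global_id(0);   /* adjacent ids -> adjacent addresses */
    out[i] = in[i];
}

__kernel void copy_strided(__global const float* in,
                           __global float* out,
                           const uint stride) {
    size_t j = get_global_id(0);
    out[j] = in[j * stride];       /* adjacent ids -> scattered addresses */
}
)CLC";
```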
In this paper, stochastic global optimization algorithms, specifically genetic algorithm and simulated annealing, are used to calibrate a dynamic option pricing model under stochastic volatility to market prices by adopting a hybrid programming approach. The performance of this dynamic option pricing model under the obtained optimal parameters is also discussed. To enhance model throughput and reduce latency, a heterogeneous hybrid programming approach on the GPU was adopted, emphasizing a data-parallel implementation of the dynamic option pricing model on a GPU-based system. The compute-intensive segments of the pricing algorithms were offloaded to the GPU as OpenCL kernels. The GPU approach significantly reduced latency, running up to 541 times faster than a parallel CPU implementation and cutting the computation time from 46.24 minutes to 5.12 seconds.
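The kernel-offloading pattern the paper describes can be sketched as follows: each GPU work-item evaluates one simulated terminal price, so the compute-intensive payoff loop maps directly onto the device. This is a simplified, hypothetical example; the paper's actual kernels implement the stochastic-volatility pricing model, whose details the abstract does not give.

```cpp
#include <string>

// A minimal OpenCL C kernel, held in a C++ string, showing the
// data-parallel shape of GPU option pricing: one work-item computes one
// discounted European-call payoff. `strike`, `r` (risk-free rate) and
// `T` (maturity) are illustrative parameters, not the paper's.
const std::string pricing_kernel = R"CLC(
__kernel void discounted_payoff(__global const float* terminal_prices,
                                __global float* payoffs,
                                const float strike,
                                const float r,
                                const float T) {
    size_t i = get_global_id(0);
    float intrinsic = fmax(terminal_prices[i] - strike, 0.0f);
    payoffs[i] = exp(-r * T) * intrinsic;  /* discount to present value */
}
)CLC";
```

Launching one work-item per simulated state lets the GPU amortize the pricing work across thousands of threads, consistent with the large speedup the paper reports.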
Volume visualization can illustrate not only the overall distribution but also the inner structure of data, making it an important approach for space environment research. Space environment simulation can produce several correlated variables at the same time. However, existing compressed volume rendering methods only consider reducing the redundant information in a single volume of a specific variable, not the redundant information among these variables. For space environment volume data with multiple correlated variables, we propose a further improved HVQ method, based on the HVQ-1d method, that composites variable-specific levels to reduce the redundant information among these variables. The volume data associated with each variable is initially divided into disjoint blocks of size 4³. The blocks are represented at two levels, a mean level and a detail level. The variable-specific mean levels and detail levels are combined respectively to form a larger global mean level and a larger global detail level. To both global levels, a splitting based on principal component analysis is applied to compute initial codebooks. Then, the LBG algorithm is conducted for codebook refinement and quantization. We further take advantage of GPU-based progressive rendering for real-time interactive visualization. Our method has been tested, along with HVQ and HVQ-1d, on high-energy proton flux volume data, including >5, >10, >30 and >50 MeV integrated proton flux. The experimental results show that the proposed method incurs the least quality loss during compression, achieves higher decompression and rendering speeds than HVQ, and provides satisfactory fidelity while ensuring interactive rendering speed.
Funding: the Key Research Program of the Chinese Academy of Sciences (ZDRE-KT-2021-3).
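A hedged sketch of the block decomposition step described above: each 4³ block of 64 voxels is split into its mean (contributing to the mean level) and a 64-component residual vector (contributing to the detail level), which the vector quantizer then codes. The codebook construction (PCA-based splitting followed by LBG refinement) and the compositing of variable-specific levels are omitted; this shows only the two-level representation.

```cpp
#include <array>
#include <cstddef>

// Two-level representation of one 4x4x4 volume block: a scalar mean
// and the 64 residuals around it. This mirrors the mean/detail split
// described in the abstract; it is an illustrative reconstruction, not
// the paper's implementation.
struct BlockLevels {
    float mean;                      // entry destined for the mean level
    std::array<float, 64> detail;    // entry destined for the detail level
};

BlockLevels split_block(const std::array<float, 64>& block) {
    BlockLevels out{};
    float sum = 0.0f;
    for (float v : block) sum += v;  // accumulate all 64 voxel values
    out.mean = sum / 64.0f;
    for (std::size_t i = 0; i < 64; ++i)
        out.detail[i] = block[i] - out.mean;  // residual detail vector
    return out;
}
```

Concatenating the variable-specific mean levels (and likewise the detail levels) into global levels is what lets a single pair of codebooks capture redundancy across all correlated variables.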