The general m-machine permutation flowshop problem with the total flow-time objective is known to be NP-hard for m ≥ 2. Branch-and-bound algorithms have been the only practical method for finding optimal solutions. In this paper, we present an improved sequential algorithm based on a strict alternation of Generation and Exploration execution modes together with hybrid Depth-First/Best-First strategies. Experimental results show that the proposed scheme outperforms the algorithm in [1]. More importantly, the method is easily extended and implemented with lightweight threads to reduce execution time, and good speedups are obtained on shared-memory multicore systems.
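To make the objective concrete, here is a minimal sketch of a plain depth-first branch-and-bound for total flow time. It is not the Generation/Exploration hybrid described in the abstract; the function names, the `proc[j][k]` layout (processing time of job j on machine k) and the trivial lower bound (the flow time accumulated so far) are assumptions chosen for illustration.

```python
def completion_times(done, job_times):
    """Advance the per-machine completion times by scheduling one more job."""
    new_done, prev = [], 0
    for k, p in enumerate(job_times):
        # A job starts on machine k once it left machine k-1 and machine k is free.
        prev = max(prev, done[k]) + p
        new_done.append(prev)
    return new_done


def branch_and_bound(proc):
    """Return (best_order, best_total_flow_time) for a tiny instance.

    proc[j][k] is the (non-negative) processing time of job j on machine k.
    """
    n, m = len(proc), len(proc[0])
    best = {"cost": float("inf"), "order": None}

    def explore(order, done, cost, remaining):
        # Prune: with non-negative times, the partial flow time is a lower bound.
        if cost >= best["cost"]:
            return
        if not remaining:
            best["cost"], best["order"] = cost, order
            return
        for j in sorted(remaining):
            new_done = completion_times(done, proc[j])
            # new_done[-1] is job j's completion time on the last machine.
            explore(order + (j,), new_done, cost + new_done[-1], remaining - {j})

    explore((), [0] * m, 0, frozenset(range(n)))
    return best["order"], best["cost"]
```

A stronger lower bound and a best-first ordering of the open nodes would prune far more of the tree; the sketch only shows where such choices plug in.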
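To meet the high-volume production needs of the Fuxing series of China Standard EMUs and to simplify the work of test operators, an automated test system for the traction control unit (TCU) was designed and developed, based on the routine test items of the TCU and on the test principle of the original manual test bench. The hardware is built around a PXI measurement and control computer, supported by external measurement and signal generation equipment. The test sequences are programmed in LabVIEW and are parallelized to reduce test time and make full use of multicore CPU resources. The system has been validated on the TCU of the 350 km/h Fuxing China Standard EMU. The results show that it substantially shortens test time, accurately identifies fault points, and generates test reports automatically, demonstrating strong reliability and practicality.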
After studying the characteristics of different multicore DSP programming models and of the two-dimensional FFT (2D-FFT), we propose a multicore parallel scheme for 2D-FFT and implement it on the TMS320C6678 DSP. The scheme makes full use of the parallel computing capability of the multicore DSP and improves the efficiency of 2D-FFT. It is a useful reference for implementing image processing algorithms based on 2D-FFT.
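The abstract does not spell out the parallel scheme. A common way to parallelize a 2D-FFT is the row-column decomposition, sketched below under that assumption: independent 1D FFTs over the rows, then over the columns, with each pass split across workers. The names `fft1d` and `fft2d` are hypothetical, and the Python thread pool only illustrates how the work is partitioned; it does not reproduce DSP-core parallelism.

```python
import cmath
from concurrent.futures import ThreadPoolExecutor


def fft1d(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft1d(x[0::2]), fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor times odd part
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft2d(image, workers=4):
    """Row-column 2D-FFT: 1D FFT on every row, then on every column."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        rows = list(pool.map(fft1d, image))        # pass 1: rows are independent
        cols = list(pool.map(fft1d, zip(*rows)))   # pass 2: columns are independent
    return [list(r) for r in zip(*cols)]           # transpose back to row-major
```

The transpose between the two passes is where the workers have to exchange data, so on real multicore hardware it is usually the step that most constrains the speedup.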
Direct Simulation Monte Carlo (DSMC) solves the Boltzmann equation at large Knudsen numbers. The Boltzmann equation generally consists of three terms: the force term, the diffusion term and the collision term. The first two terms can be discretized by numerical methods such as the finite volume method, while the third can be approximated by DSMC, which simulates the physical behavior of gas molecules. However, because of the low sampling efficiency of Monte Carlo simulation in DSMC, this part usually accounts for a large portion of the computational cost of solving the Boltzmann equation. In this paper, using Markov Chain Monte Carlo (MCMC) and multicore programming, we develop Direct Simulation Multi-Chain Markov Chain Monte Carlo (DSMC3), a fast solver for the numerical solution of the Boltzmann equation. Computational results show that DSMC3 is significantly faster than conventional DSMC.
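For reference, the standard form of the Boltzmann equation for the distribution function f(x, v, t) is given below. The notation is an assumption, since the abstract gives no formula. Here v · ∇_x f is read as the abstract's "diffusion" (transport) term, (F/m) · ∇_v f as the force term, and the right-hand side as the collision term that DSMC approximates.

```latex
\frac{\partial f}{\partial t}
  + \mathbf{v} \cdot \nabla_{\mathbf{x}} f
  + \frac{\mathbf{F}}{m} \cdot \nabla_{\mathbf{v}} f
  = \left( \frac{\partial f}{\partial t} \right)_{\mathrm{coll}}
```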
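Now that multicore hardware has become mainstream, software developers need to design software systems suited to it. Using multicore hardware effectively is a major challenge, however, and developers who rely on an operating-system thread-level development model face considerable difficulty. To address these problems, Intel has developed programming models for multicore hardware such as TBB, ArBB and Cilk. More recently, Intel developed "Concurrent Collections" ("CnC"), a new, simple and effective programming model for multicore systems. CnC uses a declarative programming language that lets application developers express a computation at a high level. In this paper, we describe how to use this new programming model to implement a high-performance data compression program and compare it with parallel implementations built in other ways. On dual Xeon X5460 3.16 GHz 8-thread CPUs, the parallel compression application implemented with the method described here achieves a speedup of more than 8x. Compared with the OpenMP, TBB and Cilk implementations, it performs 5% to 10% better.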
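The abstract names neither the compression algorithm nor the CnC graph it uses. As a rough illustration of the underlying idea only, the sketch below splits the input into independent blocks and compresses them concurrently. It uses plain Python with `zlib` and a thread pool rather than CnC, and the function names, chunk size and worker count are assumptions.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_parallel(data, chunk_size=1 << 16, workers=8):
    """Compress fixed-size chunks of `data` independently and in parallel."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk is an independent task; map() preserves the chunk order.
        return list(pool.map(zlib.compress, chunks))


def decompress_parallel(blocks):
    """Inverse of compress_parallel: decompress each block and concatenate."""
    return b"".join(zlib.decompress(block) for block in blocks)
```

Compressing blocks independently usually costs some compression ratio, because no block can refer back to data in an earlier one. That is the price paid for tasks with no dependencies between them.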