
Optimization of dot product algorithms on FT-M7002 (面向FT-M7002平台点积算法的优化实现)
Abstract This work optimizes several types of dot product algorithms for the domestically developed high-performance FT-M7002 DSP platform, filling out the technical chain of the platform's mathematical library. The optimizations take advantage of the FT-M7002 core architecture and comprise SIMD vector parallelization, DMA dual-channel transmission, and SVR transmission, which together exploit the vector parallelism of the program and speed up data transfer. Experiments with input arrays of different sizes show that, across the different types of dot product algorithms, the average ratio of optimized to unoptimized performance on the FT-M7002 platform ranges from 12.4166 to 45.2338. Compared with the corresponding dot product functions from the dsplib library on TI's official website running on the TMS320C6678 processor, the average performance ratio of the optimized FT-M7002 implementations to the TI platform ranges from 1.3716 to 4.5196. These results indicate a computational performance advantage of this DSP platform over TI's mainstream platform.
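The abstract names SIMD vector parallelization and DMA dual-channel transfer among its optimizations. The following is a minimal portable C sketch of those two general ideas, not the paper's implementation: it uses no FT-M7002 intrinsics or DMA API, `memcpy` stands in for a DMA transfer, and `LANES`, `TILE`, `dot_scalar`, `dot_tile`, and `dot_tiled` are hypothetical names and sizes chosen for illustration. SVR transmission is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LANES 4   /* hypothetical vector width, not the FT-M7002's */
#define TILE  64  /* hypothetical on-chip buffer size, in elements */

/* Scalar baseline: one multiply-accumulate per iteration. */
float dot_scalar(const float *x, const float *y, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i) acc += x[i] * y[i];
    return acc;
}

/* Accumulate one tile into LANES independent partial sums, the shape a
   SIMD unit would process in parallel. Leftover elements go to lane 0. */
static void dot_tile(const float *x, const float *y, size_t n,
                     float lane[LANES]) {
    size_t i = 0;
    for (; i + LANES <= n; i += LANES)
        for (size_t k = 0; k < LANES; ++k)
            lane[k] += x[i + k] * y[i + k];
    for (; i < n; ++i) lane[0] += x[i] * y[i];
}

/* Ping-pong buffering: while the current tile is being computed, the
   next tile is copied into the other buffer. Here the copy is a
   synchronous memcpy; on real hardware it would be an asynchronous DMA
   transfer that overlaps with the computation. */
float dot_tiled(const float *x, const float *y, size_t n) {
    float bx[2][TILE], by[2][TILE];
    float lane[LANES] = {0.0f};
    assert(n == 0 || (x != NULL && y != NULL));
    if (n == 0) return 0.0f;

    size_t cur = 0, off = 0;
    size_t len = n < TILE ? n : TILE;
    memcpy(bx[cur], x, len * sizeof(float));
    memcpy(by[cur], y, len * sizeof(float));

    while (len > 0) {
        size_t next_off = off + len;
        size_t remaining = n - next_off;
        size_t next_len = remaining < TILE ? remaining : TILE;
        if (next_len > 0) {  /* prefetch the next tile into the idle buffer */
            memcpy(bx[cur ^ 1], x + next_off, next_len * sizeof(float));
            memcpy(by[cur ^ 1], y + next_off, next_len * sizeof(float));
        }
        dot_tile(bx[cur], by[cur], len, lane);
        cur ^= 1;
        off = next_off;
        len = next_len;
    }

    float acc = 0.0f;  /* final horizontal reduction across the lanes */
    for (size_t k = 0; k < LANES; ++k) acc += lane[k];
    return acc;
}
```

Because the lane-wise version sums in a different order than the scalar loop, floating-point results can differ slightly from the baseline on general inputs; they match exactly only when every intermediate sum is exactly representable.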
Authors GUO Pan-pan (郭盼盼); CHEN Meng-xue (陈梦雪); LIANG Zu-da (梁祖达); MA Xiao-chang (马晓畅); XU Bang-jian (许邦建) (School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450066; National Supercomputing Center in Zhengzhou (Zhengzhou University), Zhengzhou 450001; School of Electrical and Information Engineering, Hunan University, Changsha 410082; School of Information Science and Engineering, Hunan University, Changsha 410082, China)
Source Computer Engineering & Science (《计算机工程与科学》), CSCD, Peking University Core (北大核心), 2022, No. 11, pp. 1909-1917 (9 pages)
Keywords FT-M7002; digital signal processor (DSP); dot product algorithm; vector; DMA dual channel transmission; SVR transmission
