
Design Optimization of an HLS-Based Matrix Inversion Algorithm (cited by: 2)

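Abstract: This paper studies optimization strategies for multi-level loops with dynamic bounds in high-level synthesis (HLS). With HLS, an algorithm is designed and verified in C/C++, and a high-level synthesis tool generates the RTL code automatically. This significantly reduces the complexity of implementing algorithms on FPGAs and improves implementation efficiency, which is a clear advantage for signal processing algorithms. However, for algorithms with multi-level loops whose bounds are dynamic, the data dependencies between loop levels and the unpredictability of the loop bounds make it difficult for HLS to reach ideal results. Taking matrix inversion based on Cholesky decomposition as an example, the paper analyzes the data computation order, data dependencies, and operation steps of the inversion process, supports the analysis with theoretical calculations, and proposes a method that restructures the multi-level loops into single-level and two-level loops so that pipeline directives can be applied efficiently. Implementation results show that, after optimization and with only a small increase in resources, the latency of matrix inversion improves by a factor of 118.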
Source: Electronic Technology & Software Engineering (《电子技术与软件工程》), 2021, No. 22, pp. 93-96 (4 pages)
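The abstract describes restructuring multi-level loops with dynamic bounds into single-level and two-level loops. The record does not reproduce the paper's code, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a small fixed size `N`, and the names `cholesky_nested`, `cholesky_flat`, and `TRI` are hypothetical. The two outer loops of a textbook Cholesky decomposition are merged into one loop with a constant trip count, and the innermost loop gets a fixed bound with a guard. The comment about `#pragma HLS PIPELINE` marks where such a directive would typically be placed in a Vitis HLS design. Whether the tool reaches a given initiation interval depends on the remaining loop-carried dependence on `L`.

```cpp
#include <cassert>
#include <cmath>

constexpr int N = 3;                  // matrix size (assumed small and fixed)
constexpr int TRI = N * (N + 1) / 2;  // number of lower-triangular entries

// Reference version: three nested loops whose inner bounds depend on the
// outer indices (j <= i, k < j), i.e. dynamic loop bounds.
void cholesky_nested(const float A[N][N], float L[N][N]) {
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j <= i; ++j) {
            float sum = A[i][j];
            for (int k = 0; k < j; ++k) {
                sum -= L[i][k] * L[j][k];
            }
            L[i][j] = (i == j) ? std::sqrt(sum) : sum / L[j][j];
        }
    }
}

// Restructured version: the (i, j) nest becomes one loop with the constant
// trip count TRI, and the inner loop runs a fixed N iterations with a guard.
void cholesky_flat(const float A[N][N], float L[N][N]) {
    int i = 0;
    int j = 0;
    for (int t = 0; t < TRI; ++t) {
        // In an HLS flow, a directive such as `#pragma HLS PIPELINE` would
        // typically go here (assumption; left as a comment for plain C++).
        assert(i < N && j <= i);  // indices stay inside the lower triangle
        float sum = A[i][j];
        for (int k = 0; k < N; ++k) {  // fixed bound, so it can be unrolled
            if (k < j) {
                sum -= L[i][k] * L[j][k];
            }
        }
        L[i][j] = (i == j) ? std::sqrt(sum) : sum / L[j][j];
        // Manual index update replaces the two nested loop counters.
        if (j == i) {
            j = 0;
            ++i;
        } else {
            ++j;
        }
    }
}
```

Both functions write only the lower triangle of `L`, so the caller is expected to zero-initialize it. For the symmetric positive definite input `{{4, 12, -16}, {12, 37, -43}, {-16, -43, 98}}`, both versions should produce the factor `{{2, 0, 0}, {6, 1, 0}, {-8, 5, 3}}`.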
