Abstract
The Deformable Part Model (DPM) is at present the most accurate algorithm for human detection in the field of object recognition, but it is computationally expensive. To address this drawback, a parallel solution based on the Graphics Processing Unit (GPU) is proposed. Using the OpenCL GPU programming model, the implementation of the entire DPM algorithm was redesigned for parallel execution, and its memory model and thread allocation were optimized. A comparison between the OpenCV library implementation and the GPU reimplementation shows that the execution efficiency of the algorithm improved by nearly 8 times while the detection quality was preserved.
Source
《计算机应用》
CSCD
PKU Core Journal (北大核心)
2015, Issue 11, pp. 3075-3078, 3129 (5 pages in total)
Journal of Computer Applications
Funding
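National Natural Science Foundation of China (60970012)
Specialized Research Fund for the Doctoral Program of Higher Education, Doctoral Supervisor Category (20113120110008)
Shanghai Key Science and Technology Project (14511107902)
Shanghai Engineering Center Construction Project (GCZX14014)
Shanghai Engineering Center for Large-Scale Internet of Things Common Technology for Smart Homes Project (GCZX14014)
Shanghai First-Class Discipline Construction Project (XTKX2012)
Hujiang Foundation Research Base Special Project (C14001)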
Keywords
Deformable Part Model (DPM)
Graphics Processing Unit (GPU)
OpenCL
human detection