Abstract
To speed up k-means on large data sets, this paper exploits both the computationally intensive nature of k-means and the characteristics of the Compute Unified Device Architecture (CUDA), and proposes two algorithms: a register-optimized parallel clustering algorithm and a sliding-door parallel algorithm for computing center points. The register-optimized algorithm targets the clustering step: it raises GPU register utilization and reduces data-access latency. The sliding-door algorithm targets the center-point computation step: it avoids data synchronization and makes fuller use of the GPU's compute cores. Experimental results show that the parallel-optimized k-means algorithm achieves a speedup of up to about 137 times on a GTX 480, effectively improving the efficiency of k-means on a single machine.
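For orientation, the two steps the abstract refers to (clustering and center-point computation) can be sketched as a plain sequential k-means iteration. This is only a minimal baseline in Python, not the paper's register-optimized or sliding-door algorithms, whose details are not given here. The function names `assign_points`, `update_centers`, and `kmeans` are illustrative, and so is the choice to leave an empty cluster's center unchanged.

```python
def assign_points(points, centers):
    """Clustering step: for each point, the index of its nearest center."""
    labels = []
    for p in points:
        # Squared Euclidean distance from p to every center.
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
        labels.append(dists.index(min(dists)))
    return labels


def update_centers(points, labels, centers):
    """Center-point step: mean of the points assigned to each cluster."""
    k, dim = len(centers), len(centers[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, j in zip(points, labels):
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    # Assumption of this sketch: an empty cluster keeps its previous center.
    return [
        [s / counts[j] for s in sums[j]] if counts[j] else list(centers[j])
        for j in range(k)
    ]


def kmeans(points, centers, max_iter=100):
    """Alternate the two steps until the labels stop changing."""
    labels = None
    for _ in range(max_iter):
        new_labels = assign_points(points, centers)
        if new_labels == labels:
            break
        labels = new_labels
        centers = update_centers(points, labels, centers)
    return centers, labels
```

In the clustering step each point is handled independently of the others, which is why it maps naturally onto one GPU thread per point. The center-point step accumulates many points into a few shared sums, and that accumulation is where a parallel version has to deal with synchronization.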
Source
《计算机工程与设计》
CSCD
PKU Core Journal (北大核心)
2013, No. 11, pp. 4032-4036, 4071 (6 pages)
Computer Engineering and Design
Funding
National Natural Science Foundation of China (61271280, 61001100)
National Science and Technology Support Program of the 12th Five-Year Plan (2011BAD21B05)