Abstract
Token-based clone detection methods produce some false detections when they are used to detect syntactically similar clone code. To address this problem, this paper proposes a method that uses statement hash values and the identifier conflict ratio to eliminate part of these false detections. The method first compares statement hash values to judge whether the statement structures are similar. It then computes the identifier conflict ratio and uses the change in that ratio to determine the direction in which false detections should be eliminated and whether the elimination has succeeded. For clone code that contains a false detection, the false detection is finally eliminated by modifying the relative line numbers of the clone code. The experimental results show that the proposed method eliminates false detections caused by inserting statements with the same structure and, building on this, also eliminates false detections caused by statements of identical form appearing in reversed order. This improves the accuracy of clone code detection and of clone-related defect detection, and it benefits subsequent research on clone code refactoring.
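The two measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact definitions, so everything here is an assumption: the tokenizer, the keyword list, the normalization of identifiers to `ID` and numbers to `NUM`, and the formula for the conflict ratio (the share of identifier occurrences whose mapping between two fragments disagrees with the dominant mapping). The function names `statement_hash` and `identifier_conflict_ratio` are hypothetical.

```python
import hashlib
import re
from collections import Counter, defaultdict

# Assumed tokenizer: identifiers/keywords, integer literals, or single symbols.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")
# Assumed (incomplete) keyword list, for illustration only.
KEYWORDS = {"if", "else", "for", "while", "return", "int", "float", "char", "void"}


def tokenize(stmt):
    return TOKEN_RE.findall(stmt)


def is_identifier(tok):
    return re.fullmatch(r"[A-Za-z_]\w*", tok) is not None and tok not in KEYWORDS


def statement_hash(stmt):
    """Hash a statement after replacing identifiers with ID and numbers with NUM,
    so that statements with the same structure get the same hash."""
    normalized = []
    for tok in tokenize(stmt):
        if is_identifier(tok):
            normalized.append("ID")
        elif tok.isdigit():
            normalized.append("NUM")
        else:
            normalized.append(tok)
    return hashlib.md5(" ".join(normalized).encode()).hexdigest()


def identifier_conflict_ratio(frag_a, frag_b):
    """Assumed definition: pair statements by position, map each identifier in
    frag_a to the identifier at the same position in frag_b, and return the
    fraction of occurrences that disagree with the most frequent mapping."""
    mapping = defaultdict(Counter)
    total = 0
    for stmt_a, stmt_b in zip(frag_a, frag_b):
        ids_a = [t for t in tokenize(stmt_a) if is_identifier(t)]
        ids_b = [t for t in tokenize(stmt_b) if is_identifier(t)]
        for a, b in zip(ids_a, ids_b):
            mapping[a][b] += 1
            total += 1
    if total == 0:
        return 0.0
    conflicts = sum(sum(c.values()) - max(c.values()) for c in mapping.values())
    return conflicts / total


frag_a = ["a = b + 1;", "c = a + 2;", "d = c + 3;"]
aligned = ["x = y + 1;", "z = x + 2;", "w = z + 3;"]
# A structurally identical statement inserted at the top shifts the pairing.
shifted = ["t = u + 9;", "x = y + 1;", "z = x + 2;"]

ratio_aligned = identifier_conflict_ratio(frag_a, aligned)
ratio_shifted = identifier_conflict_ratio(frag_a, shifted)
```

In this toy input every statement has the same hash, so hashing alone cannot tell the aligned pairing from the shifted one. Under the assumed definition, the conflict ratio is zero for the aligned pairing and positive for the shifted one, which is the kind of signal the abstract describes using to decide how to adjust relative line numbers.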
Source
《智能计算机与应用》
2013, Issue 5, pp. 46-49 (4 pages)
Intelligent Computer and Applications
Funding
National Natural Science Foundation of China (61073052)
Specialized Research Fund for the Doctoral Program of Higher Education (20092302110040)
Keywords
Clone Code
Hash Value
Identifier Conflict Ratio
False Detection
Refactoring