Abstract
To address the low computational efficiency and poor accuracy of the classical edit-distance-based string similarity method, an improved string similarity method based on edit distance and the longest common substring is proposed. It introduces the longest common prefix and the longest common suffix and defines a new similarity formula. The method is applied to a dynamic heterogeneous Web server system model built on heterogeneous platforms. Webpage tamper detection experiments show that, compared with the classical algorithm and the classical formula, the improved method accommodates the differences inherent in the heterogeneous system while improving both the accuracy and the efficiency of similarity calculation.
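The abstract names four quantities (edit distance, longest common substring, longest common prefix, longest common suffix) but does not state the new similarity formula, so that formula is not reproduced here. As a minimal sketch only, the following Python computes those four quantities and a commonly used normalized edit-distance similarity as a baseline. The function names and the normalization `1 - d / max(len(a), len(b))` are assumptions for illustration, not taken from the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, keeping only two rows of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def longest_common_substring_len(a: str, b: str) -> int:
    """Length of the longest contiguous run shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            run = prev[j - 1] + 1 if ca == cb else 0
            curr.append(run)
            best = max(best, run)
        prev = curr
    return best


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters that a and b share."""
    n = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            break
        n += 1
    return n


def common_suffix_len(a: str, b: str) -> int:
    """Number of trailing characters that a and b share."""
    return common_prefix_len(a[::-1], b[::-1])


def baseline_similarity(a: str, b: str) -> float:
    """Assumed baseline: 1 - d / max(len(a), len(b)); not the paper's new formula."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - edit_distance(a, b) / longest
```

For two versions of a page that differ only in a short middle section, the prefix and suffix lengths are large and the differing middle is small, which is presumably why the paper brings these quantities into its formula; how they are combined is defined in the full text.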
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal
2018, No. 1, pp. 282-287 (6 pages)
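Funding
National Key Research and Development Program of China (2016YFB0800104)
Scientific Research Program of the Shanghai Science and Technology Commission (14DZ1105300)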
Keywords
edit distance
string similarity
dynamism
heterogeneity
webpage tamper-proofing