Abstract
This paper introduces the concept of edge similarity and uses a greedy algorithm to reassemble Chinese or English documents that have been cut into vertical strips. For documents cut both vertically and horizontally, the scraps are restored in three steps. First, the reversed tinting method (RTM) is applied to invert the text regions of each scrap. Second, row clustering and screening (RCS) classifies the scraps into sets according to their row matching degree. Finally, within each set, the greedy algorithm, supplemented by manual intervention, joins the scraps edge to edge into a correctly ordered row. Reassembly results on single-sided English scraps show that the method requires few manual interventions and restores documents efficiently and accurately.
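The vertical-strip case can be illustrated with a minimal sketch. The abstract does not give the paper's formula for edge similarity, so the definition below (the fraction of rows whose binarized pixels agree across the seam) is an assumption, as are the function names `edge_similarity` and `greedy_order` and the choice to pass in a known leftmost strip.

```python
def edge_similarity(left, right):
    """Fraction of rows whose pixels agree across the seam.

    Assumed definition for illustration; the paper's own measure may differ.
    Each strip is a list of rows, each row a list of 0/1 pixels.
    """
    matches = sum(1 for row_l, row_r in zip(left, right)
                  if row_l[-1] == row_r[0])
    return matches / len(left)


def greedy_order(strips, start=0):
    """Order strips left to right, always appending the best-matching one.

    `start` is the index of the strip assumed to be leftmost
    (for example, the one whose left edge is blank page margin).
    """
    order = [start]
    remaining = set(range(len(strips))) - {start}
    while remaining:
        last = strips[order[-1]]
        best = max(remaining, key=lambda i: edge_similarity(last, strips[i]))
        remaining.remove(best)
        order.append(best)
    return order


# Toy 4x6 binary image cut into three 2-pixel-wide strips, then shuffled.
strip_a = [[0, 1], [0, 0], [0, 1], [0, 0]]  # leftmost (blank left edge)
strip_b = [[1, 0], [0, 1], [1, 1], [0, 0]]  # middle
strip_c = [[0, 0], [1, 0], [1, 0], [0, 0]]  # rightmost
shuffled = [strip_c, strip_a, strip_b]

order = greedy_order(shuffled, start=1)  # [1, 2, 0]
```

A greedy choice can be wrong when two candidate edges score equally, which is consistent with the abstract's note that manual intervention supplements the algorithm in the two-direction case.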
Source
《厦门理工学院学报》
2014, No. 3, pp. 103-108 (6 pages)
Journal of Xiamen University of Technology
Funding
Xiamen University of Technology Scientific Research Fund Project (XKJJ201001)
Keywords
edge similarity (边缘相似度)
row clustering and screening (行聚类筛选法)
reversed tinting method (着色反转)