Abstract
Browser fingerprinting is stateless and consistent across domains, and many websites already use it for user tracking, advertising delivery, and security verification. Browser fingerprint recognition is a typical classification task on imbalanced data. In long-term fingerprint tracking, class imbalance among the data samples lowers recognition accuracy and makes tracking prone to failure over time. To address these problems, an improved Self-paced Ensemble (Improved SPE, ISPE) method is proposed for browser fingerprint recognition. The method improves both the undersampling of browser fingerprint samples and the training of the individual classifiers in the ensemble. Focusing on fingerprints that are difficult to identify, it adds an attention-like mechanism and optimizes the self-paced factor, so that the classifiers pay more attention to boundary samples during training and recognition, which raises the overall recognition accuracy. Experiments were conducted on 3,483 collected fingerprints and on 15,000 fingerprints from an open-source dataset. The results show that ISPE reaches an F1-score of 95.6% in browser fingerprint matching and recognition, 16.8% higher than the Bi-RNN algorithm, which indicates that the method performs well for long-term browser fingerprint tracking.
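The abstract does not give the details of ISPE, so the following is only a minimal sketch of the hardness-aware undersampling idea behind the baseline Self-paced Ensemble as it is commonly described: majority-class samples are binned by hardness, and each bin is weighted by 1 / (mean hardness + self-paced factor). The function names, the equal-width binning, and the choice of hardness values are assumptions made for illustration; the attention-like mechanism and the optimized self-paced factor of ISPE are not reproduced here.

```python
import math
import random


def self_paced_factor(i, n):
    """Self-paced factor for iteration i of n (i = 0 .. n-1); starts at 0 and grows."""
    return math.tan(i * math.pi / (2 * n))


def self_paced_undersample(hardness, n_target, k_bins, alpha, rng):
    """Pick roughly n_target majority-class indices, bin by bin.

    hardness: per-sample hardness in [0, 1], e.g. the absolute error of the
              current ensemble on that sample (an assumed hardness function).
    """
    # Assign each sample to one of k equal-width hardness bins.
    bins = [[] for _ in range(k_bins)]
    for idx, h in enumerate(hardness):
        b = min(int(h * k_bins), k_bins - 1)
        bins[b].append(idx)

    # Unnormalised weight of a non-empty bin: 1 / (mean hardness + alpha).
    weights = []
    for members in bins:
        if members:
            mean_h = sum(hardness[i] for i in members) / len(members)
            weights.append(1.0 / (mean_h + alpha + 1e-12))
        else:
            weights.append(0.0)
    total = sum(weights)
    if total == 0:
        return []

    # Draw each bin's share of the target without replacement.
    chosen = []
    for members, w in zip(bins, weights):
        quota = min(len(members), round(n_target * w / total))
        chosen.extend(rng.sample(members, quota))
    return sorted(chosen)


# Eight easy majority samples (indices 0-7) and two hard ones (indices 8-9).
hardness = [0.1] * 8 + [0.9] * 2
early = self_paced_undersample(hardness, 4, 2, 0.0, random.Random(0))
late = self_paced_undersample(hardness, 4, 2, 1000.0, random.Random(0))
```

With a small self-paced factor the draw is dominated by easy samples; as the factor grows, the bin weights even out and the few hard boundary samples are included, which is the shift in emphasis the abstract describes.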
Authors
ZHANG Desheng (张德升)
CHEN Bo (陈博)
ZHANG Jianhui (张建辉)
BU Youjun (卜佑军)
SUN Chongxin (孙重鑫)
SUN Jia (孙嘉)
Affiliations: School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou 450000, China; Information Technology Institute, PLA Strategic Support Force Information Engineering University, Zhengzhou 450000, China
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journals list (北大核心)
2023, No. 7, pp. 317-324 (8 pages)
Funding
National Natural Science Foundation of China (62176264).