Sensitive Personal Information Detection System Based on ElasticSearch
Cited by: 6
Abstract: Leakage of sensitive personal information is one of the most frequent types of network security incident. Once leaked, such information can endanger personal and property safety and damage a person's reputation as well as their physical and mental health. This paper uses a web crawler to obtain web page content and attachments, extracts their body text, and builds full-text indexing and querying on ElasticSearch to detect sensitive personal information. Taking mobile phone numbers as an example, the paper tests query efficiency with different tokenizers and query methods. It concludes that full-text indexing with a custom tokenizer, combined with regular expression queries, is the most efficient of the tested approaches for detecting sensitive personal information.
Authors: ZHANG Wen; SHENG Yingyi; ZHANG Xiaoqing; MENG Shengxiang; ZHOU Bei; SHEN Jian (School of Computer Science and Engineering, Changshu Institute of Technology, Changshu 215500, China)
Source: Journal of Changshu Institute of Technology (《常熟理工学院学报》), 2022, No. 5, pp. 33-36 (4 pages)
Keywords: web crawler; ElasticSearch; sensitive personal information leakage
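
The approach the abstract singles out (a custom tokenizer for indexing plus a regular expression query) might look roughly like the sketch below. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not describe the tokenizer, so the sketch assumes a `pattern` tokenizer that splits text on non-digit characters, and the field name `content`, the analyzer and tokenizer names, and the simplified mobile number pattern are all hypothetical. The code only builds the request bodies and mimics the matching locally, so it runs without an ElasticSearch server.

```python
import re

# All names below are hypothetical; none are taken from the paper.
FIELD = "content"
# Simplified assumption: 11 digits, starting with 1, second digit 3-9.
MOBILE_PATTERN = "1[3-9][0-9]{9}"


def build_index_body():
    """Index body with a custom analyzer that splits text on non-digit runs.

    Assumption: a `pattern` tokenizer is one plausible custom tokenizer;
    the paper may use a different one.
    """
    return {
        "settings": {
            "analysis": {
                "tokenizer": {
                    "digit_run_tokenizer": {
                        "type": "pattern",
                        "pattern": "[^0-9]+",
                    }
                },
                "analyzer": {
                    "digit_run_analyzer": {
                        "type": "custom",
                        "tokenizer": "digit_run_tokenizer",
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                FIELD: {"type": "text", "analyzer": "digit_run_analyzer"}
            }
        },
    }


def build_regexp_query():
    """Search body using a regexp query against the indexed terms."""
    return {"query": {"regexp": {FIELD: {"value": MOBILE_PATTERN}}}}


def simulate_tokens(text):
    """Local stand-in for the analyzer: split on non-digits, drop empties."""
    return [tok for tok in re.split(r"[^0-9]+", text) if tok]


def simulate_hits(text):
    """A regexp query matches whole terms, so each token is fully matched."""
    return [tok for tok in simulate_tokens(text)
            if re.fullmatch(MOBILE_PATTERN, tok)]
```

For example, `simulate_hits("Contact: 13812345678, ext. 2048")` keeps only the 11-digit token, because the pattern has to match an entire term rather than a substring. Against a real cluster, the two dictionaries would be sent as the index creation body and the search body respectively.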