In the digital age, phishing attacks have been a persistent security threat leveraged by traditional password management systems that are not able to verify the authenticity of websites. This paper presents an approac...In the digital age, phishing attacks have been a persistent security threat leveraged by traditional password management systems that are not able to verify the authenticity of websites. This paper presents an approach to embedding sophisticated phishing detection within a password manager’s framework, called PhishGuard. PhishGuard uses a Large Language Model (LLM), specifically a fine-tuned BERT algorithm that works in real time, where URLs fed by the user in the credentials are analyzed and authenticated. This approach enhances user security with its provision of real-time protection from phishing attempts. Through rigorous testing, this paper illustrates how PhishGuard has scored well in tests that measure accuracy, precision, recall, and false positive rates.展开更多
提出一种基于页面敏感特征的金融类钓鱼网页检测方法,通过获取网页超文本标记语言特定标签中的文本信息,利用适合中文的多模式匹配算法(AC_SC,AC suitable for Chinese)匹配出敏感文本条数,计算出敏感文本特征值;定位截取网页的logo图像...提出一种基于页面敏感特征的金融类钓鱼网页检测方法,通过获取网页超文本标记语言特定标签中的文本信息,利用适合中文的多模式匹配算法(AC_SC,AC suitable for Chinese)匹配出敏感文本条数,计算出敏感文本特征值;定位截取网页的logo图像,采用PCA-SIFT算法提取图像特征,并与预先建立的网页logo图像库进行匹配,计算出logo图像相似度;基于文本特征值和图像相似度实现对金融类钓鱼网页的判定。实验结果表明,该方法具有很强的针对性和时效性,并能取得不低于97%的召回率。展开更多
Onemust interact with a specific webpage or website in order to use the Internet for communication,teamwork,and other productive activities.However,because phishing websites look benign and not all website visitors ha...Onemust interact with a specific webpage or website in order to use the Internet for communication,teamwork,and other productive activities.However,because phishing websites look benign and not all website visitors have the same knowledge and skills to inspect the trustworthiness of visited websites,they are tricked into disclosing sensitive information and making them vulnerable to malicious software attacks like ransomware.It is impossible to stop attackers fromcreating phishingwebsites,which is one of the core challenges in combating them.However,this threat can be alleviated by detecting a specific website as phishing and alerting online users to take the necessary precautions before handing over sensitive information.In this study,five machine learning(ML)and DL algorithms—cat-boost(CATB),gradient boost(GB),random forest(RF),multilayer perceptron(MLP),and deep neural network(DNN)—were tested with three different reputable datasets and two useful feature selection techniques,to assess the scalability and consistency of each classifier’s performance on varied dataset sizes.The experimental findings reveal that the CATB classifier achieved the best accuracy across all datasets(DS-1,DS-2,and DS-3)with respective values of 97.9%,95.73%,and 98.83%.The GB classifier achieved the second-best accuracy across all datasets(DS-1,DS-2,and DS-3)with respective values of 97.16%,95.18%,and 98.58%.MLP achieved the best computational time across all datasets(DS-1,DS-2,and DS-3)with respective values of 2,7,and 3 seconds despite scoring the lowest accuracy across all datasets.展开更多
Phishing attacks present a persistent and evolving threat in the cybersecurity land-scape,necessitating the development of more sophisticated detection methods.Traditional machine learning approaches to phishing detec...Phishing attacks present a persistent and evolving threat in the cybersecurity land-scape,necessitating the development of more sophisticated detection methods.Traditional machine learning approaches to phishing detection have relied heavily on feature engineering and have often fallen short in adapting to the dynamically changing patterns of phishingUniformResource Locator(URLs).Addressing these challenge,we introduce a framework that integrates the sequential data processing strengths of a Recurrent Neural Network(RNN)with the hyperparameter optimization prowess of theWhale Optimization Algorithm(WOA).Ourmodel capitalizes on an extensive Kaggle dataset,featuring over 11,000 URLs,each delineated by 30 attributes.The WOA’s hyperparameter optimization enhances the RNN’s performance,evidenced by a meticulous validation process.The results,encapsulated in precision,recall,and F1-score metrics,surpass baseline models,achieving an overall accuracy of 92%.This study not only demonstrates the RNN’s proficiency in learning complex patterns but also underscores the WOA’s effectiveness in refining machine learning models for the critical task of phishing detection.展开更多
针对传统方法难以对大规模钓鱼网站进行批量检测的问题,提出基于特征筛选的轻量级层次化检测方法(lightweight hierarchical detection method based on feature filtering,LHFF).该方法首先使用互信息对原始特征集进行筛选,剔除冗余特...针对传统方法难以对大规模钓鱼网站进行批量检测的问题,提出基于特征筛选的轻量级层次化检测方法(lightweight hierarchical detection method based on feature filtering,LHFF).该方法首先使用互信息对原始特征集进行筛选,剔除冗余特征,并将筛选后的特征按照提取特征耗时长短划分为URL特征和网站特征,然后根据划分后的特征,使用轻量级层次化检测框架对钓鱼网站进行检测.实验结果表明,LHFF能够在保障良好检测性能的前提下,减少网站检测所需要的时间,满足对大规模钓鱼网站进行批量检测的需求.展开更多
为利用网站设计的视觉原则并降低钓鱼者修改网页代码组织方式对钓鱼检测的影响,提出基于网页主视觉区域的结构化文档DMVA (document based on main visual area)检测钓鱼网站。提出子间归并算法生成网页的视觉分块;基于用户的视觉行为,...为利用网站设计的视觉原则并降低钓鱼者修改网页代码组织方式对钓鱼检测的影响,提出基于网页主视觉区域的结构化文档DMVA (document based on main visual area)检测钓鱼网站。提出子间归并算法生成网页的视觉分块;基于用户的视觉行为,结合层DOM树的分层结构,提出主视觉区域的思想,获取网页的分层主视觉区域中文本信息,构造DMVA;提出相关网站集,计算待测网站和相关网站集中网页间的DMVA的相似性,检测钓鱼网站。实验结果表明,基于DMVA检测钓鱼网站算法钓鱼检测方法具有较好的准确度。展开更多
文摘In the digital age, phishing attacks have been a persistent security threat leveraged by traditional password management systems that are not able to verify the authenticity of websites. This paper presents an approach to embedding sophisticated phishing detection within a password manager’s framework, called PhishGuard. PhishGuard uses a Large Language Model (LLM), specifically a fine-tuned BERT algorithm that works in real time, where URLs fed by the user in the credentials are analyzed and authenticated. This approach enhances user security with its provision of real-time protection from phishing attempts. Through rigorous testing, this paper illustrates how PhishGuard has scored well in tests that measure accuracy, precision, recall, and false positive rates.
文摘提出一种基于页面敏感特征的金融类钓鱼网页检测方法,通过获取网页超文本标记语言特定标签中的文本信息,利用适合中文的多模式匹配算法(AC_SC,AC suitable for Chinese)匹配出敏感文本条数,计算出敏感文本特征值;定位截取网页的logo图像,采用PCA-SIFT算法提取图像特征,并与预先建立的网页logo图像库进行匹配,计算出logo图像相似度;基于文本特征值和图像相似度实现对金融类钓鱼网页的判定。实验结果表明,该方法具有很强的针对性和时效性,并能取得不低于97%的召回率。
文摘Onemust interact with a specific webpage or website in order to use the Internet for communication,teamwork,and other productive activities.However,because phishing websites look benign and not all website visitors have the same knowledge and skills to inspect the trustworthiness of visited websites,they are tricked into disclosing sensitive information and making them vulnerable to malicious software attacks like ransomware.It is impossible to stop attackers fromcreating phishingwebsites,which is one of the core challenges in combating them.However,this threat can be alleviated by detecting a specific website as phishing and alerting online users to take the necessary precautions before handing over sensitive information.In this study,five machine learning(ML)and DL algorithms—cat-boost(CATB),gradient boost(GB),random forest(RF),multilayer perceptron(MLP),and deep neural network(DNN)—were tested with three different reputable datasets and two useful feature selection techniques,to assess the scalability and consistency of each classifier’s performance on varied dataset sizes.The experimental findings reveal that the CATB classifier achieved the best accuracy across all datasets(DS-1,DS-2,and DS-3)with respective values of 97.9%,95.73%,and 98.83%.The GB classifier achieved the second-best accuracy across all datasets(DS-1,DS-2,and DS-3)with respective values of 97.16%,95.18%,and 98.58%.MLP achieved the best computational time across all datasets(DS-1,DS-2,and DS-3)with respective values of 2,7,and 3 seconds despite scoring the lowest accuracy across all datasets.
基金Princess Nourah bint Abdulrahman University Researchers Supporting Project number(PNURSP2024R 343)PrincessNourah bint Abdulrahman University,Riyadh,Saudi ArabiaDeanship of Scientific Research at Northern Border University,Arar,Kingdom of Saudi Arabia,for funding this researchwork through the project number“NBU-FFR-2024-1092-02”.
文摘Phishing attacks present a persistent and evolving threat in the cybersecurity land-scape,necessitating the development of more sophisticated detection methods.Traditional machine learning approaches to phishing detection have relied heavily on feature engineering and have often fallen short in adapting to the dynamically changing patterns of phishingUniformResource Locator(URLs).Addressing these challenge,we introduce a framework that integrates the sequential data processing strengths of a Recurrent Neural Network(RNN)with the hyperparameter optimization prowess of theWhale Optimization Algorithm(WOA).Ourmodel capitalizes on an extensive Kaggle dataset,featuring over 11,000 URLs,each delineated by 30 attributes.The WOA’s hyperparameter optimization enhances the RNN’s performance,evidenced by a meticulous validation process.The results,encapsulated in precision,recall,and F1-score metrics,surpass baseline models,achieving an overall accuracy of 92%.This study not only demonstrates the RNN’s proficiency in learning complex patterns but also underscores the WOA’s effectiveness in refining machine learning models for the critical task of phishing detection.
文摘针对传统方法难以对大规模钓鱼网站进行批量检测的问题,提出基于特征筛选的轻量级层次化检测方法(lightweight hierarchical detection method based on feature filtering,LHFF).该方法首先使用互信息对原始特征集进行筛选,剔除冗余特征,并将筛选后的特征按照提取特征耗时长短划分为URL特征和网站特征,然后根据划分后的特征,使用轻量级层次化检测框架对钓鱼网站进行检测.实验结果表明,LHFF能够在保障良好检测性能的前提下,减少网站检测所需要的时间,满足对大规模钓鱼网站进行批量检测的需求.
文摘为利用网站设计的视觉原则并降低钓鱼者修改网页代码组织方式对钓鱼检测的影响,提出基于网页主视觉区域的结构化文档DMVA (document based on main visual area)检测钓鱼网站。提出子间归并算法生成网页的视觉分块;基于用户的视觉行为,结合层DOM树的分层结构,提出主视觉区域的思想,获取网页的分层主视觉区域中文本信息,构造DMVA;提出相关网站集,计算待测网站和相关网站集中网页间的DMVA的相似性,检测钓鱼网站。实验结果表明,基于DMVA检测钓鱼网站算法钓鱼检测方法具有较好的准确度。