Abstract
Recent years have seen the wide application of natural language processing (NLP) models in crucial areas such as finance, medical treatment, and news media, raising concerns about model robustness and vulnerabilities. We find that the prompt paradigm can probe special robustness defects of pre-trained language models. Malicious prompt texts are first constructed for inputs, and a pre-trained language model can then generate adversarial examples for victim models via mask-filling. Experimental results show that the prompt paradigm can efficiently generate adversarial examples that are more diverse than those produced by synonym substitution. We then propose a novel robust training approach based on the prompt paradigm, which incorporates prompt texts as alternatives to adversarial examples and enhances robustness under a lightweight minimax-style optimization framework. Experiments on three real-world tasks and two deep neural models show that our approach can significantly improve model robustness against adversarial attacks.
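As a concrete illustration of the mask-filling step described above, the following is a minimal sketch, not the authors' implementation: it uses the Hugging Face `transformers` fill-mask pipeline, and the model choice (`bert-base-uncased`) and example sentence are illustrative assumptions.

```python
# Sketch: generating adversarial candidate inputs via mask-filling with a
# pre-trained masked language model (assumptions: model choice and input
# sentence are illustrative, not taken from the paper).
from transformers import pipeline

# Any masked language model can serve as the generator here.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# A position in the victim model's input is masked; the language model
# proposes contextually plausible replacements, yielding perturbations
# beyond plain synonym substitution.
sentence = "The service at this restaurant was [MASK] and the food arrived late."

for candidate in fill_mask(sentence, top_k=5):
    # Each candidate is a full perturbed sentence with a fluency score;
    # candidates that flip the victim model's prediction become
    # adversarial examples.
    print(f"{candidate['score']:.3f}  {candidate['sequence']}")
```

In practice, each candidate sentence would be fed to the victim model, and those that change its prediction while remaining fluent are kept as adversarial examples.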
Funding
National Key R&D Program of China (No. 2021AAA0140203)
Zhejiang Provincial Key Research and Development Program of China (No. 2021C01164)
National Natural Science Foundation of China (Nos. 61972384, 62132020, and 62203425)