Journal Articles
536 articles found
1. Back-Translation Methods for Low-Resource Neural Machine Translation (Cited: 2)
Authors: 张文博, 张新路, 杨雅婷, 董瑞, 李晓. 《厦门大学学报(自然科学版)》 (CAS, CSCD, PKU Core), 2021, Issue 4, pp. 675-679 (5 pages)
Neural machine translation has achieved great success in high-resource settings, but its performance in low-resource settings still needs improvement. At present, Uyghur-Chinese and Mongolian-Chinese translation are both low-resource translation tasks. This paper proposes partitioning Chinese monolingual data into multiple subsets by domain similarity, using back-translation to exploit the different subsets stage by stage when training the translation model, and then applying model averaging and model ensembling to further improve Uyghur-Chinese and Mongolian-Chinese translation quality. Experiments on the evaluation data of the 16th China Conference on Machine Translation (CCMT 2020) show that the method effectively improves Uyghur-Chinese and Mongolian-Chinese translation quality.
Keywords: neural machine translation, low-resource languages, back-translation, domain similarity, pre-training
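A rough, self-contained sketch of the curriculum this abstract describes: monolingual sentences are ranked by a domain-similarity score and the synthetic parallel corpus grows one partition at a time. The vocabulary-overlap scorer, the whitespace tokenization, and the placeholder back-translations are illustrative assumptions, not the paper's released code.

```python
# Hypothetical sketch of domain-partitioned back-translation.

def domain_score(sentence, in_domain_vocab):
    # Stub similarity: fraction of tokens seen in the in-domain vocabulary.
    tokens = sentence.split()
    return sum(t in in_domain_vocab for t in tokens) / max(len(tokens), 1)

def back_translation_curriculum(parallel, zh_mono, in_domain_vocab, n_chunks=3):
    # Rank monolingual Chinese sentences: most in-domain first.
    ranked = sorted(zh_mono, key=lambda s: domain_score(s, in_domain_vocab),
                    reverse=True)
    step = max(len(ranked) // n_chunks, 1)
    for i in range(n_chunks):
        chunk = ranked[i * step:(i + 1) * step]
        # A reverse (Chinese-to-source) model would back-translate here; stubbed out.
        synthetic = [(f"<bt:{zh}>", zh) for zh in chunk]
        parallel = parallel + synthetic   # grow the training corpus stage by stage
    return parallel  # the source-to-Chinese model is retrained on this each stage
```

The model averaging and ensembling mentioned in the abstract would then sit on top of the models trained at each stage.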
2. Pre-trained models for natural language processing: A survey (Cited: 146)
Authors: QIU XiPeng, SUN TianXiang, XU YiGe, SHAO YunFan, DAI Ning, HUANG XuanJing. 《Science China (Technological Sciences)》 (SCIE, EI, CAS, CSCD), 2020, Issue 10, pp. 1872-1897 (26 pages)
Recently, the emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs for NLP. We first briefly introduce language representation learning and its research progress. Then we systematically categorize existing PTMs based on a taxonomy from four different perspectives. Next, we describe how to adapt the knowledge of PTMs to downstream tasks. Finally, we outline some potential directions of PTMs for future research. This survey is intended to be a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.
Keywords: deep learning, neural network, natural language processing, pre-trained model, distributed representation, word embedding, self-supervised learning, language modelling
3. Research and Prospects of Deep Autoencoders (Cited: 39)
Authors: 曲建岭, 杜辰飞, 邸亚洲, 高峰, 郭超然. 《计算机与现代化》, 2014, Issue 8, pp. 128-134 (7 pages)
Deep learning is a branch of machine learning that has opened a new era in the development of neural networks. As one of the main components of deep learning architectures, the deep autoencoder is mainly used for transformation learning tasks and also plays a crucial role in unsupervised learning and nonlinear feature extraction. This paper first introduces the origins, basic concepts, and principles of deep autoencoders, then describes how they are constructed and the general steps of pre-training and fine-tuning, and summarizes the different types of deep autoencoders. Finally, building on an in-depth analysis of the problems deep autoencoders currently face, it offers an outlook on their future development trends.
Keywords: deep learning, deep autoencoder, pre-training, fine-tuning, neural network
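The pre-train/fine-tune recipe this survey outlines can be made concrete with a small PyTorch sketch: each layer is first trained greedily to reconstruct its input, then the stacked encoder is fine-tuned end to end. Layer sizes, the optimizer, and iteration counts are illustrative assumptions.

```python
# Greedy layer-wise pretraining of a stacked autoencoder, then fine-tuning.
import torch
import torch.nn as nn

x = torch.randn(256, 64)                      # toy unlabeled data
dims = [64, 32, 16]
encoders, inp = [], x
for d_in, d_out in zip(dims[:-1], dims[1:]):
    enc, dec = nn.Linear(d_in, d_out), nn.Linear(d_out, d_in)
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
    for _ in range(100):                      # pretrain this layer to reconstruct
        opt.zero_grad()
        loss = nn.functional.mse_loss(dec(torch.relu(enc(inp))), inp)
        loss.backward()
        opt.step()
    inp = torch.relu(enc(inp)).detach()       # codes become the next layer's input
    encoders.append(enc)

stacked = nn.Sequential(*[nn.Sequential(e, nn.ReLU()) for e in encoders])
# Fine-tuning: continue training `stacked` end to end, e.g. with a supervised
# head or a full reconstruction objective.
```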
4. Remote Sensing Image Retrieval Based on Convolutional Neural Networks Pre-Trained on ImageNet (Cited: 30)
Authors: 葛芸, 江顺亮, 叶发茂, 许庆勇, 唐祎玲. 《武汉大学学报(信息科学版)》 (EI, CSCD, PKU Core), 2018, Issue 1, pp. 67-73 (7 pages)
High-resolution remote sensing images have complex content and rich detail, and traditional shallow features struggle to describe such images, which easily leads to a large semantic gap in retrieval. This paper applies four different convolutional neural networks pre-trained on the large-scale ImageNet dataset to remote sensing image retrieval. Outputs from different layers of the four networks are first extracted as high-level features, the high-level features are then Gaussian-normalized, and Euclidean distance is used as the similarity measure for retrieval. A series of experiments on the UC-Merced and WHU-RS datasets shows that, among the high-level features of the four networks, CNN-M features yield the best retrieval performance. Compared with two shallow features, bag of visual words and the global morphological texture descriptor, the high-level features improve mean retrieval precision by 15.7%-25.6% and reduce the average normalized modified retrieval rank by 17%-22.1%. Convolutional neural networks pre-trained on ImageNet are therefore an effective approach to remote sensing image retrieval.
Keywords: remote sensing imagery, retrieval, convolutional neural network, pre-training
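The retrieval step described here, Gaussian normalization of high-level CNN features followed by Euclidean-distance ranking, reduces to a few lines of numpy; random vectors stand in for the actual CNN-M features, which the paper assumes have already been extracted.

```python
import numpy as np

feats = np.random.rand(100, 4096)                 # stand-in CNN features, one row per image
query = np.random.rand(4096)                      # feature of the query image

mu, sigma = feats.mean(axis=0), feats.std(axis=0) + 1e-8
feats_n = (feats - mu) / sigma                    # Gaussian (z-score) normalization
query_n = (query - mu) / sigma

dists = np.linalg.norm(feats_n - query_n, axis=1) # Euclidean similarity measure
ranking = np.argsort(dists)                       # nearest images first
print(ranking[:10])
```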
5. Optic Disc Segmentation Using a UNet with Fused Residual Attention (Cited: 23)
Authors: 侯向丹, 赵一浩, 刘洪普, 郭鸿湧, 于习欣, 丁梦园. 《中国图象图形学报》 (CSCD, PKU Core), 2020, Issue 9, pp. 1915-1929 (15 pages)
Objective: Glaucoma, pathological myopia, and similar diseases cause irreversible damage to vision, and early diagnosis of ophthalmic diseases can greatly reduce their incidence. Because fundus images are complex, optic disc segmentation is easily affected by vessels and lesion regions, so traditional methods cannot segment the optic disc precisely. To address this problem, a deep-learning-based optic disc segmentation method, RA-UNet (residual attention UNet), is proposed, which improves segmentation accuracy and achieves automatic, end-to-end segmentation. Methods: The original UNet is improved in several ways. A ResNet34 with a fused attention mechanism serves as the downsampling path to strengthen image feature extraction, and loading pre-trained weights helps counter the overfitting caused by small training sets. The attention mechanism introduces global context, enhancing useful features and suppressing useless responses. The UNet upsampling path is modified to reduce the number of model parameters and ease training. The segmentation maps output by the network are post-processed to eliminate erroneous samples, and DiceLoss replaces the ordinary cross-entropy loss to optimize the network parameters. Results: The method is compared with others on four datasets. On RIM-ONE (retinal image database for optic nerve evaluation)-R1, the F-score and overlap are 0.9574 and 0.9182, improvements of 2.89% and 5.17% over UNet; on RIM-ONE-R3, they are 0.969 and 0.9398, improvements of 1.5% and 2.78%; on Drishti-GS1, they are 0.9662 and 0.9345, improvements of 1.65% and 3.04%; on the iChallenge-PM pathological myopia challenge dataset, they are 0.9424 and 0.8911, improvements of 3.59% and 6.22% over UNet. Ablation experiments on RIM-ONE-R1 and Drishti-GS1 verify that every module of the improved algorithm contributes to optic disc segmentation performance. Conclusion: The proposed RA-UNet improves optic disc segmentation accuracy, performs well on images containing lesion regions, and generalizes well.
Keywords: glaucoma, UNet, deep learning, optic disc segmentation, pre-training, attention mechanism, DiceLoss
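The DiceLoss mentioned above is commonly implemented as a soft Dice score turned into a loss; the formulation below is the standard one, with the smoothing constant as an assumption rather than the authors' exact setting. Unlike cross-entropy, it directly optimizes region overlap, which matches the paper's reported F-score and overlap gains.

```python
import torch

def dice_loss(pred, target, eps=1e-6):
    """pred: sigmoid probabilities, target: binary mask, both (N, H, W)."""
    inter = (pred * target).sum(dim=(1, 2))
    union = pred.sum(dim=(1, 2)) + target.sum(dim=(1, 2))
    return 1 - ((2 * inter + eps) / (union + eps)).mean()

pred = torch.sigmoid(torch.randn(2, 128, 128))       # toy predictions
target = (torch.rand(2, 128, 128) > 0.5).float()     # toy optic disc masks
print(dice_loss(pred, target).item())
```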
6. Paradigm Shift in Natural Language Processing (Cited: 10)
Authors: Tian-Xiang Sun, Xiang-Yang Liu, Xi-Peng Qiu, Xuan-Jing Huang. 《Machine Intelligence Research》 (EI, CSCD), 2022, Issue 3, pp. 169-183 (15 pages)
In the era of deep learning, modeling for most natural language processing (NLP) tasks has converged into several mainstream paradigms. For example, we usually adopt the sequence labeling paradigm to solve a bundle of tasks such as POS-tagging, named entity recognition (NER), and chunking, and adopt the classification paradigm to solve tasks like sentiment analysis. With the rapid progress of pre-trained language models, recent years have witnessed a rising trend of paradigm shift, which is solving one NLP task in a new paradigm by reformulating the task. The paradigm shift has achieved great success on many tasks and is becoming a promising way to improve model performance. Moreover, some of these paradigms have shown great potential to unify a large number of NLP tasks, making it possible to build a single model to handle diverse tasks. In this paper, we review this phenomenon of paradigm shifts in recent years, highlighting several paradigms that have the potential to solve different NLP tasks.
Keywords: natural language processing, pre-trained language models, deep learning, sequence-to-sequence, paradigm shift
7. Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey (Cited: 7)
Authors: Xiao Wang, Guangyao Chen, Guangwu Qian, Pengcheng Gao, Xiao-Yong Wei, Yaowei Wang, Yonghong Tian, Wen Gao. 《Machine Intelligence Research》 (EI, CSCD), 2023, Issue 4, pp. 447-482 (36 pages)
With the urgent demand for generalized deep models, many pre-trained big models have been proposed, such as bidirectional encoder representations (BERT), vision transformer (ViT), and generative pre-trained transformers (GPT). Inspired by the success of these models in single domains (like computer vision and natural language processing), multi-modal pre-trained big models have also drawn more and more attention in recent years. In this work, we give a comprehensive survey of these models and hope this paper can provide new insights and help fresh researchers track the most cutting-edge works. Specifically, we first introduce the background of multi-modal pre-training by reviewing conventional deep learning and pre-training work in natural language processing, computer vision, and speech. Then, we introduce the task definition, key challenges, and advantages of multi-modal pre-training models (MM-PTMs), and discuss MM-PTMs with a focus on data, objectives, network architectures, and knowledge-enhanced pre-training. After that, we introduce the downstream tasks used for the validation of large-scale MM-PTMs, including generative, classification, and regression tasks. We also give visualization and analysis of the model parameters and results on representative downstream tasks. Finally, we point out possible research directions for this topic that may benefit future works. In addition, we maintain a continuously updated paper list for large-scale pre-trained multi-modal big models: https://github.com/wangxiao5791509/MultiModal_BigModels_Survey.
Keywords: multi-modal (MM), pre-trained model (PTM), information fusion, representation learning, deep learning
8. A Classification-Detection Approach of COVID-19 Based on Chest X-ray and CT by Using Keras Pre-Trained Deep Learning Models (Cited: 10)
Authors: Xing Deng, Haijian Shao, Liang Shi, Xia Wang, Tongling Xie. 《Computer Modeling in Engineering & Sciences》 (SCIE, EI), 2020, Issue 11, pp. 579-596 (18 pages)
The Coronavirus Disease 2019 (COVID-19) is wreaking havoc around the world, putting enormous pressure on national health systems and medical staff. One of the most effective and critical steps in the fight against COVID-19 is to examine the patient's lungs using chest X-ray and CT images produced by radiological imaging. In this paper, five Keras-related deep learning models, ResNet50, InceptionResNetV2, Xception, transfer learning, and pre-trained VGGNet16, are applied to formulate a classification-detection approach for COVID-19. Two benchmark methods, SVM (Support Vector Machine) and CNN (Convolutional Neural Networks), are provided for comparison with the classification-detection approaches based on performance indicators, i.e., precision, recall, F1 scores, confusion matrix, classification accuracy, and three types of AUC (Area Under Curve). The highest classification accuracies derived by the classification-detection approach based on 5857 chest X-rays and 767 chest CTs are 84% and 75%, respectively, which shows that the Keras-related deep learning approaches facilitate accurate and effective COVID-19-assisted detection.
Keywords: COVID-19 detection, deep learning, transfer learning, pre-trained models
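A generic Keras transfer-learning skeleton in the spirit of the pre-trained VGGNet16 pipeline above; the input size, classification head, and training settings are illustrative assumptions, not the authors' exact configuration.

```python
import tensorflow as tf

base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False                            # freeze the ImageNet features

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # COVID-19 vs. normal
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)  # supply your own datasets
```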
9. An Approach to Detect Structural Development Defects in Object-Oriented Programs
Authors: Maxime Seraphin Gnagne, Mouhamadou Dosso, Mamadou Diarra, Souleymane Oumtanaga. 《Open Journal of Applied Sciences》, 2024, Issue 2, pp. 494-510 (17 pages)
Structural development defects essentially refer to code structure that violates object-oriented design principles. They make program maintenance challenging and deteriorate software quality over time. Various detection approaches, ranging from traditional heuristic algorithms to machine learning methods, are used to identify these defects, and ensemble learning methods have strengthened their detection. However, existing approaches do not simultaneously exploit the capability of pre-trained models to extract relevant features and the performance of neural networks for the classification task. Therefore, our goal has been to design a model that combines a pre-trained model, which extracts relevant features from code excerpts through transfer learning, with a bagging method whose base estimator is a dense neural network for defect classification. To achieve this, we composed multiple samples of the same size with replacement from the imbalanced dataset MLCQ1. For all the samples, we used the CodeT5-small variant to extract features and trained a bagging method with the neural network Roberta Classification Head to classify defects based on these features. We then compared this model to RandomForest, one of the ensemble methods that yields good results. Our experiments showed that the number of base estimators to use for bagging depends on the defect to be detected. Next, we observed that it was not necessary to use a data balancing technique with our model when the imbalance rate was 23%. Finally, for Blob detection, RandomForest had a median MCC value of 0.36 compared to 0.12 for our method. However, our method was predominant in Long Method detection with a median MCC value of 0.53 compared to 0.42 for RandomForest. These results suggest that the performance of ensemble methods in detecting structural development defects depends on the specific defect.
Keywords: object-oriented programming, structural development defect detection, software maintenance, pre-trained models, feature extraction, bagging, neural network
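The pipeline shape this abstract describes, features from a pre-trained model feeding a bagging ensemble with a neural-network base estimator, can be sketched with scikit-learn (version 1.2+ for the `estimator` argument). Random vectors stand in for CodeT5 embeddings, and the hyperparameters are assumptions.

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.neural_network import MLPClassifier

X = np.random.rand(200, 512)                  # stand-in for CodeT5-small features
y = np.random.randint(0, 2, 200)              # defect / no-defect labels

clf = BaggingClassifier(
    estimator=MLPClassifier(hidden_layer_sizes=(64,), max_iter=300),
    n_estimators=10)                          # per the paper, tune this per defect
clf.fit(X, y)
print(clf.score(X, y))
```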
10. May ChatGPT be a tool producing medical information for common inflammatory bowel disease patients' questions? An evidence-controlled analysis (Cited: 1)
Authors: Antonietta Gerarda Gravina, Raffaele Pellegrino, Marina Cipullo, Giovanna Palladino, Giuseppe Imperio, Andrea Ventura, Salvatore Auletta, Paola Ciamarra, Alessandro Federico. 《World Journal of Gastroenterology》 (SCIE, CAS), 2024, Issue 1, pp. 17-33 (17 pages)
Artificial intelligence is increasingly entering everyday healthcare. Large language model (LLM) systems such as Chat Generative Pre-trained Transformer (ChatGPT) have become potentially accessible to everyone, including patients with inflammatory bowel diseases (IBD). However, significant ethical issues and pitfalls exist in innovative LLM tools, and the hype generated by such systems may lead to unweighted patient trust in them. Therefore, it is necessary to understand whether LLMs (trendy ones, such as ChatGPT) can produce plausible medical information (MI) for patients. This review examined ChatGPT's potential to provide MI regarding questions commonly addressed by patients with IBD to their gastroenterologists. From the review of the outputs provided by ChatGPT, this tool showed some attractive potential while having significant limitations in updating and detailing information, and it provided inaccurate information in some cases. Further studies and refinement of ChatGPT, possibly aligning the outputs with the leading medical evidence provided by reliable databases, are needed.
Keywords: Crohn's disease, ulcerative colitis, inflammatory bowel disease, Chat Generative Pre-trained Transformer, large language model, artificial intelligence
11. The Life Cycle of Knowledge in Big Language Models: A Survey (Cited: 1)
Authors: Boxi Cao, Hongyu Lin, Xianpei Han, Le Sun. 《Machine Intelligence Research》 (EI, CSCD), 2024, Issue 2, pp. 217-238 (22 pages)
Knowledge plays a critical role in artificial intelligence. Recently, the extensive success of pre-trained language models (PLMs) has raised significant attention about how knowledge can be acquired, maintained, updated, and used by language models. Despite the enormous amount of related studies, there is still a lack of a unified view of how knowledge circulates within language models throughout the learning, tuning, and application processes, which may prevent us from further understanding the connections between current progress or realizing existing limitations. In this survey, we revisit PLMs as knowledge-based systems by dividing the life cycle of knowledge in PLMs into five critical periods and investigating how knowledge circulates when it is built, maintained, and used. To this end, we systematically review existing studies of each period of the knowledge life cycle, summarize the main challenges and current limitations, and discuss future directions.
Keywords: pre-trained language model, knowledge acquisition, knowledge representation, knowledge probing, knowledge editing, knowledge application
12. y-Tuning: an efficient tuning paradigm for large-scale pre-trained models via label representation learning
Authors: Yitao LIU, Chenxin AN, Xipeng QIU. 《Frontiers of Computer Science》 (SCIE, EI, CSCD), 2024, Issue 4, pp. 107-116 (10 pages)
With the current success of large-scale pre-trained models (PTMs), how to efficiently adapt PTMs to downstream tasks has attracted tremendous attention, especially for PTMs with billions of parameters. Previous work focuses on designing parameter-efficient tuning paradigms but still needs to save and compute the gradient of the whole computational graph. In this paper, we propose y-Tuning, an efficient yet effective paradigm to adapt frozen large-scale PTMs to specific downstream tasks. y-Tuning learns dense representations for the labels y defined in a given task and aligns them to fixed feature representations. Without computing the gradients of the text encoder at the training phase, y-Tuning is not only parameter-efficient but also training-efficient. Experimental results show that for DeBERTa-XXL with 1.6 billion parameters, y-Tuning achieves more than 96% of the performance of full fine-tuning on the GLUE benchmark with only 2% tunable parameters and much lower training costs.
Keywords: pre-trained model, lightweight fine-tuning paradigms, label representation
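A schematic of the idea as the abstract states it: the text encoder stays frozen (no gradients flow through it) and only dense label representations are learned and aligned to the fixed features. The encoder below is a stand-in linear layer, and the dimensions are assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

d_model, n_labels = 768, 2
encoder = nn.Linear(100, d_model)             # stand-in for a frozen PTM encoder
for p in encoder.parameters():
    p.requires_grad = False

label_emb = nn.Parameter(torch.randn(n_labels, d_model))  # dense label reps y
opt = torch.optim.Adam([label_emb], lr=1e-3)

x, y = torch.randn(8, 100), torch.randint(0, n_labels, (8,))
with torch.no_grad():                         # no encoder gradients: training-efficient
    feats = encoder(x)
logits = feats @ label_emb.t()                # align features to label representations
loss = nn.functional.cross_entropy(logits, y)
loss.backward()                               # touches only label_emb
opt.step()
```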
13. Audio Scene Classification Based on Spectrogram Transformers (Cited: 2)
Authors: 袁双, 杨立东, 郭勇, 牛大伟, 张丹丹. 《信号处理》 (CSCD, PKU Core), 2023, Issue 4, pp. 730-736 (7 pages)
Audio scene classification is an important part of scene understanding; learning audio scene features and classifying them accurately strengthens a machine's ability to interact with its environment, and its importance in the big data era is self-evident. Given that classification performance depends on dataset size while real tasks often face a serious shortage of data, this paper proposes data augmentation and network pre-training strategies that combine a spectrogram transformer model with the audio scene classification task. First, log-mel energy spectrograms are extracted from the audio signal and fed to the model; the model's dynamic interaction capability then strengthens the spatial relationships in the audio sequence, and a classification token vector finally performs the classification. Tested on the public DCASE2019 Task1 and DCASE2020 Task1 datasets, the method reaches classification accuracies of 96.489% and 93.227%, respectively, a clear improvement over existing algorithms. This shows the method is suitable for high-accuracy audio scene classification and lays a foundation for intelligent devices to perceive environmental content and detect environmental dynamics with high accuracy.
Keywords: audio scene classification, transformer, pre-training, data augmentation
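The front end described here, log-mel energy spectrograms as transformer input, is a few lines with librosa; the one-second noise clip and the mel parameters below are stand-ins, not the paper's settings.

```python
import librosa
import numpy as np

sr = 22050
wav = np.random.randn(sr).astype(np.float32)      # stand-in one-second audio clip
mel = librosa.feature.melspectrogram(y=wav, sr=sr, n_mels=128)
log_mel = librosa.power_to_db(mel)                # log-mel energy spectrogram
print(log_mel.shape)                              # (128, frames): the model's input
```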
14. Network Meets ChatGPT: Intent Autonomous Management, Control and Operation (Cited: 2)
Authors: Jingyu Wang, Lei Zhang, Yiran Yang, Zirui Zhuang, Qi Qi, Haifeng Sun, Lu Lu, Junlan Feng, Jianxin Liao. 《Journal of Communications and Information Networks》 (EI, CSCD), 2023, Issue 3, pp. 239-255 (17 pages)
Telecommunication has undergone significant transformations due to continuous advancements in internet technology, mobile devices, competitive pricing, and changing customer preferences. Specifically, the most recent iteration of OpenAI's large language model, the chat generative pre-trained transformer (ChatGPT), has the potential to propel innovation and bolster operational performance in the telecommunications sector. Nowadays, the exploration of network resource management, control, and operation is still at an initial stage. In this paper, we propose a novel network artificial intelligence architecture named language model for network traffic (NetLM), a large language model based on a transformer designed to understand sequence structures in network packet data and capture their underlying dynamics. The continual convergence of knowledge space and artificial intelligence (AI) technologies constitutes the core of intelligent network management and control. Multi-modal representation learning is used to unify the multi-modal information of network indicator data, traffic data, and text data into the same feature space. Furthermore, a NetLM-based control policy generation framework is proposed to refine intents incrementally through different abstraction levels. Finally, some potential cases are provided in which NetLM can benefit the telecom industry.
Keywords: network management and control architecture, generative pre-trained transformer, intent-based networking, NetLM, network knowledge
15. Short-term displacement prediction for newly established monitoring slopes based on transfer learning
Authors: Yuan Tian, Yang-landuo Deng, Ming-zhi Zhang, Xiao Pang, Rui-ping Ma, Jian-xue Zhang. 《China Geology》 (CAS, CSCD), 2024, Issue 2, pp. 351-364 (14 pages)
This study makes significant progress in addressing the challenges of short-term slope displacement prediction in the Universal Landslide Monitoring Program, an unprecedented disaster mitigation program in China, where many newly established monitoring slopes lack sufficient historical deformation data, making it difficult to extract deformation patterns and provide effective predictions, which play a crucial role in the early warning and forecasting of landslide hazards. A slope displacement prediction method based on transfer learning is therefore proposed. Initially, the method transfers the deformation patterns learned from slopes with relatively rich deformation data, via a model pre-trained on a multi-slope integrated dataset, to newly established monitoring slopes with limited or even no useful data, thus enabling rapid and efficient predictions for these slopes. Subsequently, as time goes on and monitoring data accumulate, fine-tuning the pre-trained model for individual slopes can further improve prediction accuracy, enabling continuous optimization of prediction results. A case study indicates that, after being trained on a multi-slope integrated dataset, the TCN-Transformer model can efficiently serve as a pre-trained model for displacement prediction at newly established monitoring slopes. The three-day average RMSE is significantly reduced, by 34.6%, compared to models trained only on individual slope data, and the model also successfully predicts the majority of deformation peaks. The model fine-tuned on accumulated data from the target newly established monitoring slope further reduces the three-day RMSE by 37.2%, demonstrating considerable predictive accuracy. In conclusion, taking advantage of transfer learning, the proposed slope displacement prediction method effectively utilizes the available data, enabling rapid deployment and continual refinement of displacement predictions on newly established monitoring slopes.
Keywords: landslide, slope displacement prediction, transfer learning, integrated dataset, Transformer, pre-trained model, Universal Landslide Monitoring Program (ULMP), geological hazards survey engineering
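The deployment recipe in this abstract, pre-train a forecaster on the multi-slope integrated dataset and then fine-tune it on the target slope as data accumulates, is sketched below with a GRU standing in for the paper's TCN-Transformer; the shapes, learning rate, and window length are assumptions.

```python
import torch
import torch.nn as nn

model = nn.GRU(input_size=1, hidden_size=32, batch_first=True)  # stand-in forecaster
head = nn.Linear(32, 1)
# Step 1 (pre-training on many slopes) is assumed already done; pre-trained
# weights would be loaded here, e.g. via model.load_state_dict(...).

# Step 2: fine-tune on the new slope's short displacement series.
opt = torch.optim.Adam(list(model.parameters()) + list(head.parameters()),
                       lr=1e-4)               # small LR to preserve learned patterns
x = torch.randn(16, 30, 1)                    # 30-step displacement windows (toy)
y = torch.randn(16, 1)                        # next-step displacement targets (toy)
out, _ = model(x)
loss = nn.functional.mse_loss(head(out[:, -1]), y)
loss.backward()
opt.step()
```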
16. SHEL: a semantically enhanced hardware-friendly entity linking method
Authors: 亓东林, CHEN Shudong, DU Rong, TONG Da, YU Yong. 《High Technology Letters》 (EI, CAS), 2024, Issue 1, pp. 13-22 (10 pages)
With the help of pre-trained language models, the accuracy of the entity linking task has made great strides in recent years. However, most models with excellent performance require fine-tuning on a large amount of training data using large pre-trained language models, which poses a hardware threshold for accomplishing this task. Some researchers have achieved competitive results with less training data through ingenious methods, such as utilizing information provided by a named entity recognition model. This paper presents a novel semantic-enhancement-based entity linking approach, named semantically enhanced hardware-friendly entity linking (SHEL), which is designed to be hardware-friendly and efficient while maintaining good performance. Specifically, SHEL's semantic enhancement approach consists of three aspects: (1) semantic compression of entity descriptions using a text summarization model; (2) maximizing the capture of mention contexts using asymmetric heuristics; (3) calculating a fixed-size mention representation through pooling operations. This series of semantic enhancement methods effectively improves the model's ability to capture semantic information while taking hardware constraints into account, and significantly improves the model's convergence speed, by more than 50% compared with the strong baseline model proposed in this paper. In terms of performance, SHEL is comparable to the previous method, with superior performance on six well-established datasets, even though SHEL is trained using a smaller pre-trained language model as the encoder.
Keywords: entity linking (EL), pre-trained models, knowledge graph, text summarization, semantic enhancement
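Aspect (3) above, a fixed-size mention representation via pooling, is the simplest of the three to illustrate: mean-pool the encoder's token vectors inside the mention span. The tensor below is a stand-in for real encoder output, and the span boundaries are arbitrary.

```python
import torch

tokens = torch.randn(1, 20, 768)              # stand-in encoder output (1 sentence)
span = (5, 9)                                 # token boundaries of the mention
mention_vec = tokens[0, span[0]:span[1] + 1].mean(dim=0)  # fixed-size representation
print(mention_vec.shape)                      # torch.Size([768]) regardless of span length
```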
17. Adapter Based on Pre-Trained Language Models for Classification of Medical Text
Authors: Quan Li. 《Journal of Electronic Research and Application》, 2024, Issue 3, pp. 129-134 (6 pages)
We present an approach to classify medical text automatically at the sentence level. Given the inherent complexity of medical text classification, we employ adapters based on pre-trained language models to extract information from medical text, facilitating more accurate classification while minimizing the number of trainable parameters. Extensive experiments conducted on various datasets demonstrate the effectiveness of our approach.
Keywords: classification of medical text, adapter, pre-trained language model
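A standard bottleneck adapter of the kind such approaches insert into a frozen pre-trained encoder, so only the small down/up projections are trained; the hidden sizes are illustrative assumptions, not this paper's configuration.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Residual bottleneck adapter: h + up(relu(down(h)))."""
    def __init__(self, d_model=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)

    def forward(self, h):
        return h + self.up(torch.relu(self.down(h)))  # residual keeps PTM features

h = torch.randn(4, 32, 768)                   # (batch, tokens, hidden) from a frozen PTM
print(Adapter()(h).shape)                     # torch.Size([4, 32, 768])
```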
18. A Short Text Classification Method Based on Label-Aware Attention
Authors: 李大帅, 叶成荫. 《软件导刊》, 2024, Issue 9, pp. 110-115 (6 pages)
Current short text classification treats class labels only as the basis for judging classification results and ignores the semantic information contained in the label text itself. To address this, a short text classification method based on label-aware attention, built on a large-scale pre-trained language model, is proposed. The method represents text as distributed vectors via a large-scale pre-trained language model to obtain richer semantic information. Label information is incorporated into the training process, and an attention mechanism lets the text perceive the information most relevant to classification. A CNN with max pooling extracts local word-level vector features to better handle semantic issues in English text such as double negation and comparative negation, and residual connections fuse sentence-level and word-level vectors to effectively mitigate the decay of textual information. Tested on three public English datasets, R8, R52, and MR, the method achieves accuracies of 98.51% on R8 and 97.10% on R52, outperforming DeBERTa and BertGCN.
Keywords: short text classification, CNN, label awareness, attention, pre-training
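A minimal rendering of label-aware attention as this abstract describes it: learned label embeddings attend over token representations so each class focuses on the text most relevant to it. The dimensions and the scaled dot-product scoring are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

d, n_labels, seq = 256, 4, 12
tokens = torch.randn(2, seq, d)               # word-level features (stand-in for a PTM)
label_emb = nn.Parameter(torch.randn(n_labels, d))

scores = tokens @ label_emb.t() / d ** 0.5    # (batch, seq, labels)
attn = torch.softmax(scores, dim=1)           # attention over tokens, per label
label_aware = torch.einsum("bsl,bsd->bld", attn, tokens)  # per-label text vectors
logits = (label_aware * label_emb).sum(-1)    # score each label against its view
print(logits.shape)                           # torch.Size([2, 4])
```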
19. Personality Trait Detection via Transfer Learning
Authors: Bashar Alshouha, Jesus Serrano-Guerrero, Francisco Chiclana, Francisco P. Romero, Jose A. Olivas. 《Computers, Materials & Continua》 (SCIE, EI), 2024, Issue 2, pp. 1933-1956 (24 pages)
Personality recognition plays a pivotal role when developing user-centric solutions such as recommender systems or decision support systems across various domains, including education, e-commerce, and human resources. Traditional machine learning techniques have been broadly employed for personality trait identification; nevertheless, the development of new technologies based on deep learning has led to new opportunities to improve their performance. This study focuses on the capabilities of pre-trained language models such as BERT, RoBERTa, ALBERT, ELECTRA, ERNIE, and XLNet to deal with the task of personality recognition. These models are able to capture structural features from textual content and comprehend a multitude of language facets and complex features such as hierarchical relationships or long-term dependencies. This makes them suitable for classifying multi-label personality traits from reviews while mitigating computational costs. The approach centers on developing an architecture based on different layers able to capture the semantic context and structural features from texts. Moreover, it fine-tunes the previous models using the MyPersonality dataset, which comprises 9,917 status updates contributed by 250 Facebook users. These status updates are categorized according to the well-known Big Five personality model, setting the stage for a comprehensive exploration of personality traits. To test the proposal, a set of experiments has been performed using different metrics such as the exact match ratio, Hamming loss, zero-one loss, precision, recall, F1-score, and weighted averages. The results reveal that ERNIE is the top-performing model, achieving an exact match ratio of 72.32%, an accuracy rate of 87.17%, and an F1-score of 84.41%. The findings demonstrate that the tested models substantially outperform other state-of-the-art studies, enhancing accuracy by at least 3% and confirming them as powerful tools for personality recognition. These findings represent substantial advancements in personality recognition, making th…
Keywords: personality trait detection, pre-trained language model, Big Five model, transfer learning
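The multi-label setup implied by the Big Five traits and the exact-match metric can be sketched as five independent sigmoid outputs trained with binary cross-entropy; the stand-in features and sizes below are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

d_model, n_traits = 768, 5                    # Big Five personality traits
head = nn.Linear(d_model, n_traits)

feats = torch.randn(8, d_model)               # stand-in for encoder [CLS] features
labels = (torch.rand(8, n_traits) > 0.5).float()
logits = head(feats)
loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
preds = (torch.sigmoid(logits) > 0.5).float() # independent per-trait decisions
exact_match = (preds == labels).all(dim=1).float().mean()
print(loss.item(), exact_match.item())        # loss and batch exact match ratio
```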
20. A prompt-based approach to adversarial example generation and robustness enhancement
Authors: Yuting YANG, Pei HUANG, Juan CAO, Jintao LI, Yun LIN, Feifei MA. 《Frontiers of Computer Science》 (SCIE, EI, CSCD), 2024, Issue 4, pp. 85-96 (12 pages)
Recent years have seen the wide application of natural language processing (NLP) models in crucial areas such as finance, medical treatment, and news media, raising concerns about model robustness and vulnerabilities. We find that the prompt paradigm can probe special robustness defects of pre-trained language models. Malicious prompt texts are first constructed for inputs, and a pre-trained language model can then generate adversarial examples for victim models via mask-filling. Experimental results show that the prompt paradigm can efficiently generate more diverse adversarial examples beyond synonym substitution. We then propose a novel robust training approach based on the prompt paradigm, which incorporates prompt texts as alternatives to adversarial examples and enhances robustness under a lightweight minimax-style optimization framework. Experiments on three real-world tasks and two deep neural models show that our approach can significantly improve the robustness of models to resist adversarial attacks.
Keywords: robustness, adversarial example, prompt learning, pre-trained language model
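The mask-filling mechanism at the core of this approach can be illustrated with the Hugging Face fill-mask pipeline; the model choice and the downstream filtering of candidates against a victim model are assumptions left out here.

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
text = "the movie was [MASK] and I would watch it again."
for cand in fill(text, top_k=5):
    # Each candidate substitution could be scored against a victim model to
    # keep only label-flipping (adversarial) rewrites.
    print(cand["token_str"], round(cand["score"], 3))
```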