Abstract
The automatic and accurate classification of Magnetic Resonance Imaging (MRI) radiology reports is essential for the analysis and interpretation of epilepsy and non-epilepsy cases. Since the majority of MRI radiology reports are unstructured, manual information extraction is time-consuming and requires specific expertise. In this paper, a comprehensive method is proposed to automatically classify real brain MRI radiology text reports as epilepsy or non-epilepsy. The method combines Natural Language Processing techniques with statistical Machine Learning methods. A total of 122 real MRI radiology text reports (97 epilepsy, 25 non-epilepsy) are studied with the proposed method, which consists of the following steps: (i) for a given text report, the system first cleans HTML/XML tags, tokenizes, erases punctuation, and normalizes the text; (ii) it then converts the MRI text reports into numeric sequences using index-based word encoding; (iii) deep learning models, namely a uni-directional long short-term memory (LSTM) network, a bidirectional long short-term memory (BiLSTM) network, and a convolutional neural network (CNN), are applied to the data and their classification results are compared; (iv) finally, 70% of the observations are used for training, 15% for validation, and 15% for testing. Unlike previous methods, this study has the following objectives: (a) to extract significant text features from radiology reports of epilepsy; (b) to achieve high classification accuracy and thereby enhance the attributes of the epilepsy data. Our study is therefore a comprehensive comparative study in which deep learning models are applied to the numeric sequences obtained from the epilepsy dataset by index-based word encoding. Converting text to numeric sequences by index-based word encoding is a traditional method; applied here to epilepsy data for the first time in the literature, it proves to be a successful feature descriptor for the epilepsy data set. The BiLSTM network shows promising performance in terms of accuracy. We show that larger medical text reports can also be analyzed by the proposed method.
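As an illustration of steps (i), (ii), and (iv), the following is a minimal Python sketch of text cleaning, index-based word encoding, and a 70/15/15 split. It is not the implementation used in this study. The function names (`preprocess`, `build_vocab`, `encode`, `split_70_15_15`), the right-padding with zeros, the dropping of out-of-vocabulary words, the fixed sequence length, and the two example report strings are all assumptions made for illustration. The deep learning models of step (iii) are omitted.

```python
import random
import re


def preprocess(report):
    """Step (i): strip HTML/XML tags, lowercase, erase punctuation, tokenize."""
    text = re.sub(r"<[^>]+>", " ", report)   # remove tags such as <p> or </p>
    text = text.lower()                      # simple normalization (assumed)
    text = re.sub(r"[^\w\s]", " ", text)     # replace punctuation with spaces
    return text.split()                      # whitespace tokenization


def build_vocab(token_lists):
    """Assign each distinct word a positive index; 0 is reserved for padding."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            if tok not in vocab:
                vocab[tok] = len(vocab) + 1
    return vocab


def encode(tokens, vocab, length):
    """Step (ii): map words to indices, truncate, then right-pad with zeros.

    Words missing from the vocabulary are dropped (an assumption).
    """
    ids = [vocab[t] for t in tokens if t in vocab][:length]
    return ids + [0] * (length - len(ids))


def split_70_15_15(items, seed=0):
    """Step (iv): shuffle, then split into train/validation/test parts."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = round(0.70 * len(items))
    n_val = round(0.15 * len(items))
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


# Hypothetical example reports, invented for illustration only.
reports = [
    "<p>No focal lesion. Normal hippocampi.</p>",
    "Left mesial temporal sclerosis; hippocampal atrophy.",
]
tokens = [preprocess(r) for r in reports]
vocab = build_vocab(tokens)
sequences = [encode(t, vocab, length=8) for t in tokens]

# Splitting 122 report indices gives partitions of 85, 18, and 19 items.
train, val, test = split_70_15_15(range(122))
```

In this sketch the vocabulary is built from all reports for brevity; in practice it would usually be built from the training partition only, so that the validation and test reports do not influence the encoding.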