Because online forums are anonymous and open to everyone, it is very hard for human readers to distinguish deceptive reviews from truthful ones. This paper proposes a deep learning approach to text representation, called DCWord (Deep Context representation by Word vectors), for deceptive review identification. The basic idea is that deceptive and truthful reviews are written by authors without and with real experience of the goods or services purchased online, respectively, so the contextual information of words should differ between the two. Unlike state-of-the-art techniques, which search for the best linguistic features for representation, we use word vectors to characterize the contextual information of words in deceptive and truthful reviews automatically. An average-pooling strategy (called DCWord-A) and a max-pooling strategy (called DCWord-M) are used to produce review vectors from word vectors. Experimental results on the Spam dataset and the Deception dataset demonstrate that the DCWord-M representation with LR (Logistic Regression) achieves the best performance and outperforms state-of-the-art techniques on deceptive review identification. Moreover, the DCWord-M strategy outperforms the DCWord-A strategy in review representation for deceptive review identification. The outcome of this study has potential implications for online review management and for business intelligence applications of deceptive review identification.
Funding: Supported in part by the National Natural Science Foundation of China under Grant Nos. 71932002, 61379046, 91318302 and 61432001, and by the Innovation Fund Project of Xi'an Science and Technology Program (Special Series for Xi'an University) under Grant No. 2016CXWL21.
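The two pooling strategies named in the abstract can be illustrated with a minimal sketch. It assumes each word already has a fixed-length vector and that a review vector is obtained by pooling its word vectors dimension by dimension. The function name `pool_review`, the toy vectors, and the choice to skip words without a vector are assumptions made for illustration only; the abstract does not specify how word vectors are trained or how unseen words are handled.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def pool_review(tokens: Sequence[str],
                word_vectors: Dict[str, Vector],
                strategy: str = "max") -> Vector:
    """Pool the word vectors of one review into a single review vector.

    strategy="average" mirrors the idea behind DCWord-A,
    strategy="max" mirrors the idea behind DCWord-M.
    """
    # Assumption: tokens without a word vector are simply skipped.
    vectors = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vectors:
        raise ValueError("no token of the review has a word vector")

    # Transpose so that each tuple holds one embedding dimension.
    columns = list(zip(*vectors))
    if strategy == "average":
        return [sum(col) / len(col) for col in columns]
    if strategy == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown strategy: {strategy!r}")


# Hypothetical 3-dimensional word vectors, for illustration only.
toy_vectors = {
    "great": [0.5, -1.0, 2.0],
    "hotel": [1.5, 0.0, -2.0],
    "stay": [-0.5, 4.0, 0.0],
}
review = ["great", "hotel", "stay", "unseen"]

avg_vec = pool_review(review, toy_vectors, "average")  # [0.5, 1.0, 0.0]
max_vec = pool_review(review, toy_vectors, "max")      # [1.5, 4.0, 2.0]
```

Either review vector has the same length as the word vectors regardless of review length, so it could be passed directly to a classifier such as logistic regression.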