Date of Award

10-30-2020

Publication Type

Master Thesis

Degree Name

M.Sc.

Department

Computer Science

First Advisor

Luis Rueda

Second Advisor

Sherif Saad Ahmed

Rights

info:eu-repo/semantics/openAccess

Creative Commons License

Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License

Abstract

Online public reviews significantly influence customers who purchase products or seek services. Fake reviews are posted online to promote or demote targeted products or to affect the reputation of organizations and businesses. Spam review detection has been the focus of many researchers in recent years. As online services grow rapidly, the issue becomes increasingly important and needs to be addressed properly. In this regard, a variety of approaches have been introduced to distinguish truthful reviews from fake ones. The features engineered in past studies typically fall into two types: linguistic-based and behavioural-based characteristics of the reviews. Unsupervised, supervised and semi-supervised machine learning methods have been widely used to perform this classification. This work introduces a novel technique that distinguishes fake reviews from genuine ones using linguistic features. Unsupervised learning via self-organizing maps (SOM) is employed in conjunction with a convolutional neural network (CNN) to classify the reviews. We transform the reviews into images by arranging semantically similar words around a pixel of the image or, equivalently, a SOM grid cell. The resulting review images are subsequently fed to the CNN for supervised training and then classification. Comprehensive tests on two gold-standard datasets show the effectiveness of the proposed method in single-domain and multi-domain contexts. From our results, we deduce that the method achieves very good results when using 300-dimensional GloVe embeddings and higher-resolution SOM grid maps.
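The review-to-image step described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `train_som`, `best_matching_unit` and `review_to_image` are hypothetical, the SOM update rule is the standard textbook one with linearly decaying learning rate and neighbourhood width, and the pixel encoding (normalized count of review words mapped to each grid cell) is an assumption made for illustration. Any word-vector dictionary, such as loaded GloVe vectors, could be passed as `embeddings`.

```python
import numpy as np

def best_matching_unit(weights, v):
    """Return the (row, col) of the SOM cell whose weight vector is closest to v."""
    d = np.linalg.norm(weights - v, axis=-1)
    return np.unravel_index(np.argmin(d), d.shape)

def train_som(vectors, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a small SOM on word vectors; returns weights of shape (rows, cols, dim)."""
    rng = np.random.default_rng(seed)
    dim = vectors.shape[1]
    weights = rng.normal(size=(rows, cols, dim))
    # Grid coordinates of every cell, used by the Gaussian neighbourhood function.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    sigma0 = sigma0 or max(rows, cols) / 2.0
    total = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        for v in vectors[rng.permutation(len(vectors))]:
            frac = step / total
            lr = lr0 * (1.0 - frac)                 # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3    # shrinking neighbourhood
            bmu = best_matching_unit(weights, v)
            dist2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-dist2 / (2.0 * sigma ** 2))
            # Pull the winning cell and its neighbours towards the input vector.
            weights += lr * h[..., None] * (v - weights)
            step += 1
    return weights

def review_to_image(tokens, embeddings, weights):
    """Map each known token to its SOM cell and build a normalized count image.

    Assumption: pixel intensity is the number of review words landing on a cell,
    scaled to [0, 1]. Tokens without an embedding are skipped.
    """
    rows, cols, _ = weights.shape
    image = np.zeros((rows, cols), dtype=np.float32)
    for tok in tokens:
        if tok in embeddings:
            r, c = best_matching_unit(weights, embeddings[tok])
            image[r, c] += 1.0
    if image.max() > 0:
        image /= image.max()
    return image
```

In this sketch, words with nearby vectors tend to land on the same or adjacent cells, so each review becomes a fixed-size single-channel image that a CNN could take as input. A larger `rows` by `cols` grid corresponds to the higher-resolution SOM maps mentioned in the abstract.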
