Date of Award

2016

Publication Type

Master Thesis

Degree Name

M.Sc.

Department

Computer Science

Keywords

bad words, feature selection, machine learning, spammer, Twitter

Supervisor

Lu, Jianguo

Rights

info:eu-repo/semantics/openAccess

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Abstract

A large number of Twitter accounts are suspended. Over a five-year period, about 14% of accounts are terminated for reasons not specified explicitly by the service provider. We collected about 120,000 suspended users, along with their tweets and social relations. This thesis studies these suspended users and compares them with normal users in terms of their tweets. We train classifiers to automatically predict whether a user will be suspended. Three different kinds of features are used. We experimented with the Naïve Bayes method, including the Bernoulli (BNB) and multinomial (MNB) variants, plus various feature selection mechanisms (mutual information, chi-square, and pointwise mutual information), and achieved F1 = 78%. To reduce the high dimensionality, in our second approach we use word2vec and doc2vec to represent each user with a vector of short, fixed length, and achieved F1 = 73% using an SVM with an RBF kernel. Random forest works best with this approach, with F1 = 74%.
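
As a rough illustration of the first approach named in the abstract, the sketch below scores terms with the chi-square statistic, keeps the top-scoring ones, and fits a Bernoulli Naïve Bayes classifier on them. It is not the thesis's implementation: the function names (`chi_square_scores`, `select_top_k`, `train_bnb`, `predict_bnb`), the Laplace smoothing, and the six-document toy corpus are all assumptions made for the example.

```python
import math
from collections import Counter


def chi_square_scores(docs, labels):
    """Chi-square score of each term's presence against a binary label."""
    n = len(docs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    df_pos, df_neg = Counter(), Counter()
    for words, label in zip(docs, labels):
        (df_pos if label == 1 else df_neg).update(set(words))
    scores = {}
    for term in set(df_pos) | set(df_neg):
        a, b = df_pos[term], df_neg[term]   # docs containing the term (pos, neg)
        c, d = n_pos - a, n_neg - b         # docs lacking the term (pos, neg)
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k terms with the highest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def train_bnb(docs, labels, vocab):
    """Fit Bernoulli Naive Bayes over `vocab` with Laplace smoothing.

    Assumes both classes (0 and 1) occur in `labels`.
    """
    model = {}
    for cls in (0, 1):
        class_docs = [set(w) for w, y in zip(docs, labels) if y == cls]
        log_prior = math.log(len(class_docs) / len(docs))
        probs = {
            t: (sum(t in d for d in class_docs) + 1) / (len(class_docs) + 2)
            for t in vocab
        }
        model[cls] = (log_prior, probs)
    return model


def predict_bnb(model, words):
    """Return the class with the highest log posterior for one document."""
    present = set(words)

    def log_score(cls):
        log_prior, probs = model[cls]
        return log_prior + sum(
            math.log(p if t in present else 1.0 - p) for t, p in probs.items()
        )

    return max(model, key=log_score)


# Hypothetical toy corpus: label 1 = suspended, label 0 = normal.
docs = [
    ["free", "followers", "click"],
    ["click", "free", "prize"],
    ["free", "followers", "now"],
    ["lunch", "with", "friends"],
    ["reading", "a", "paper", "now"],
    ["friends", "click", "photo"],
]
labels = [1, 1, 1, 0, 0, 0]

vocab = select_top_k(chi_square_scores(docs, labels), 3)
model = train_bnb(docs, labels, vocab)
print(sorted(vocab))
print(predict_bnb(model, ["free", "prize", "followers"]))
```

Unlike the multinomial variant, the Bernoulli model also uses the absence of a selected term as evidence, which is why `predict_bnb` iterates over the whole selected vocabulary rather than only the words in the document.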

Recommended Citation

Cui, Xiutian, "Identifying Suspended Accounts In Twitter" (2016). Electronic Theses and Dissertations. 5725.
https://scholar.uwindsor.ca/etd/5725
