Date of Award

1-17-2024

Publication Type

Thesis

Degree Name

M.Sc.

Department

Computer Science

Keywords

Fake News;Knowledge Graphs;Large Language Models;Natural Language Processing;Social Media

Supervisor

Dan Wu

Rights

info:eu-repo/semantics/openAccess

Creative Commons License

Creative Commons Attribution 4.0 International License
This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

The spread of false or misleading information as news has long been a significant threat to governments, organizations, and the economy, and it has become more prevalent and influential in recent years due to the growing popularity of social media, which is now the primary source of information for more than half of the world’s population. Fake news detection used to rely mostly on statistical and linguistic analysis of texts, but with the advancement of AI and computer-assisted writing tools, fake news authors can now deceive statistical models. More sophisticated methods, known as knowledge-informed news classification methods, have therefore been developed; they use document representations from pre-trained language models and knowledge graphs to capture the context and knowledge behind fake news. This thesis proposes a novel knowledge-informed fake news detection method that combines knowledge graphs with large language models (LLMs). Although there is ample research using smaller pre-trained language models such as BERT for fake news detection, there has been little significant research applying LLMs to this task. Our proposed methodology uses knowledge graphs to generate document representations enriched with external knowledge relevant to the subject matter of a news article, while LLMs are used to generate context-aware, knowledge-enriched document representations from the article body. Our research comprises a series of experiments designed to assess the effectiveness of document representations generated by LLMs, such as T5, ERNIE, and GPT, for fake news detection tasks, and we evaluated the efficacy of our proposed methodology by experimenting with 21 unique combinations of LLMs and knowledge graph incorporation techniques. Notably, our model, which combines word embeddings from the T5 large language model with knowledge graph embeddings from the SimplE knowledge graph model within our proposed framework, outperformed existing models on the FakeNewsNet dataset and demonstrated competitive performance on other datasets.
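
To make the fusion step described above more concrete, the sketch below shows one plausible way to combine an LLM document representation with a knowledge graph embedding for classification. It is illustrative only: the model name, the mean-pooling step, the embedding dimension, and the concatenation-plus-linear-classifier fusion are assumptions for exposition, not the exact architecture reported in the thesis.

import torch
import torch.nn as nn
from transformers import AutoTokenizer, T5EncoderModel

class KnowledgeInformedClassifier(nn.Module):
    """Illustrative sketch: fuse a T5 document embedding with a precomputed
    knowledge graph embedding and classify the article as real or fake."""
    def __init__(self, t5_name="t5-base", kg_dim=200, num_classes=2):
        super().__init__()
        self.tokenizer = AutoTokenizer.from_pretrained(t5_name)
        self.encoder = T5EncoderModel.from_pretrained(t5_name)
        text_dim = self.encoder.config.d_model  # 768 for t5-base
        self.classifier = nn.Linear(text_dim + kg_dim, num_classes)

    def forward(self, articles, kg_embeddings):
        # Encode article bodies with the T5 encoder and mean-pool token states.
        tokens = self.tokenizer(articles, padding=True, truncation=True,
                                return_tensors="pt")
        hidden = self.encoder(**tokens).last_hidden_state        # (B, T, text_dim)
        mask = tokens["attention_mask"].unsqueeze(-1).float()
        doc_repr = (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # (B, text_dim)
        # kg_embeddings: assumed to be precomputed per-article embeddings,
        # e.g. aggregated SimplE entity embeddings for entities mentioned
        # in the article, with shape (B, kg_dim).
        fused = torch.cat([doc_repr, kg_embeddings], dim=-1)
        return self.classifier(fused)                            # (B, num_classes)

# Example usage (random KG vector stands in for real SimplE embeddings):
# logits = KnowledgeInformedClassifier()(["article text ..."], torch.randn(1, 200))

Concatenation followed by a linear layer is only one of many possible incorporation techniques; the thesis compares 21 combinations of LLMs and knowledge graph incorporation strategies.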
