Date of Award

9-20-2024

Publication Type

Thesis

Degree Name

M.Sc.

Department

Computer Science

Keywords

knowledge graph embedding;language model;machine learning;node classification;representation learning;social network analysis

Supervisor

Ziad Kobti

Abstract

In this research, we explore the important significance of the graph embedding techniques in extracting useful information from graph structures, particularly for node classification tasks. It is essential to comprehend the complex structural and semantic attributes along with the relationships within graph nodes in order to reveal intricate hidden patterns. In order to tackle this challenge, we introduce a hybrid framework called BeComE(Bert-ComplEx Embedding Model), which combines semantic and structural features obtained from social network structures using label-aware embedding models to improve node classification in social graphs. The BeComE model exploits the combined advantages of semantic embeddings obtained from BERT and structural embeddings from ComplEx. BERT produces contextualized embeddings based on the textual node characteristics, capturing intricate semantic details, while ComplEx generates complex-valued embeddings that represent the structural patterns within the graph. A notable aspect of BeComE is its incorporation of label-informed embedding models, which utilize existing node labels to influence the embedding process, guaranteeing that the resulting embeddings are both information-rich and consistent with classification labels. This enhances their ability to accurately classify nodes for various tasks. The BeComE framework merges semantic embeddings from BERT with structural embeddings from ComplEx to create a unified feature vector for each node. This combined representation improves the feature set for classification tasks. A Support Vector Machine classifier, known for its resilience in high-dimensional spaces, is then trained on these merged embeddings to accurately classify nodes. BeComE is assessed using the Cora and CiteSeer datasets, which are commonly used benchmarks in node classification. Our findings demonstrate that BeComE achieves top-notch performance, significantly enhancing classification accuracy by incorporating semantic and structural features through a label-aware embedding process compared to conventional methods relying on a single type of embedding.

Share

COinS