Date of Award

6-1-2023

Publication Type

Thesis

Degree Name

M.Sc.

Department

Computer Science

Keywords

Distributional Representation;Graph Embeddings;Graphs;Natural Language Processing;Word Embeddings

Supervisor

Ziad Kobti

Rights

info:eu-repo/semantics/openAccess

Creative Commons License

Creative Commons Attribution 4.0 International License

Abstract

In the domain of Natural Language Processing (NLP), representing words according to their distribution in vector form is a crucial task. When words that humans judge to be similar are placed closer together in the representation space, a notable increase can be observed in the performance and accuracy of NLP tasks. Previous word-embedding methods emphasize passing word tokens iteratively through a neural language model to capture each word's context. These methods can capture word relatedness, but only within the given context length. In this thesis, we introduce a method that represents words in the form of a graph so that their first-order and second-order proximity are preserved and the relatedness of a word can be captured more effectively. This graph is then passed to a vertex (node) embedding method to generate the word embeddings. After experimenting with the proposed method on multiple text corpora, our findings indicate that this method of word representation outperforms traditional word-embedding methods by more than 4% on multiple intrinsic and extrinsic tasks.
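
The abstract does not name the concrete graph-construction or vertex-embedding algorithm. The Python sketch below illustrates one plausible reading of the pipeline: a sliding-window co-occurrence graph, whose weighted edges encode first-order proximity, embedded with DeepWalk-style uniform random walks fed to gensim's Word2Vec. Every function name, parameter value, and the toy corpus here are illustrative assumptions, not the thesis's actual method.

```python
# Illustrative sketch only: build a word graph preserving first-order
# proximity (co-occurrence edges) and expose second-order proximity
# (shared neighbourhoods) through random walks, then embed the vertices.
# The concrete choices (window size, DeepWalk-style walks, Word2Vec)
# are assumptions, not the thesis's specified algorithm.
import random
from collections import defaultdict

import networkx as nx
from gensim.models import Word2Vec


def build_cooccurrence_graph(sentences, window=2):
    """Weighted undirected graph; edge weight = co-occurrence count."""
    weights = defaultdict(int)
    for tokens in sentences:
        for i, w in enumerate(tokens):
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                if w != tokens[j]:
                    weights[tuple(sorted((w, tokens[j])))] += 1
    g = nx.Graph()
    for (u, v), count in weights.items():
        g.add_edge(u, v, weight=count)
    return g


def random_walks(g, num_walks=10, walk_length=20, seed=0):
    """Uniform random walks; words with similar neighbourhoods appear in
    similar walk contexts, which captures second-order proximity."""
    rng = random.Random(seed)
    walks = []
    nodes = list(g.nodes())
    for _ in range(num_walks):
        rng.shuffle(nodes)
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                nbrs = list(g.neighbors(walk[-1]))
                if not nbrs:
                    break
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks


# Toy corpus purely for demonstration.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
]
graph = build_cooccurrence_graph(corpus, window=2)
# Treat each walk as a "sentence" and train skip-gram embeddings over it.
model = Word2Vec(random_walks(graph), vector_size=32, window=5, min_count=1, sg=1)
print(model.wv.most_similar("cat"))
```

Unlike sequential language-model training, the walk-based view lets a word's embedding be shaped by graph neighbours that never appeared inside a single context window, which is one way a graph representation can capture relatedness beyond a fixed context length.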
