Date of Award
6-19-2024
Publication Type
Thesis
Degree Name
M.Sc.
Department
Computer Science
Keywords
Graph;Natural Language Processing;Semantic similarity;Word Embedding;Word Similarity;Word vector
Supervisor
Ziad Kobti
Abstract
Text plays a central role in information storage, necessitating streamlined and effective methods for swift retrieval. Among the various text representations, the vector form stands out for its efficiency, especially when dealing with large datasets. Arranging words that are similar in meaning close to each other in the vectorized representation improves system performance across a range of Natural Language Processing (NLP) tasks. Previous methods, primarily centered on capturing word context through neural language models, have fallen short in delivering high scores on word similarity problems. This thesis investigates the connection between representing words in vector form and the improved performance and accuracy observed in NLP tasks. It introduces a method that represents words as a graph so that their first-order and second-order proximity are preserved, aiming to enhance overall capabilities in semantic representation. Experimental deployment of this technique across diverse text corpora underscores its superiority over conventional word embedding approaches: it outperforms traditional word-embedding methods by 2.7% on multiple intrinsic and extrinsic tasks. The findings not only contribute to the evolving landscape of semantic representation learning but also illuminate its implications for text classification tasks, especially within the context of dynamic embedding models.
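The notions of first-order and second-order proximity mentioned in the abstract can be illustrated with a minimal, hypothetical sketch; this is not the thesis's actual implementation, only a common interpretation in which first-order proximity is the direct co-occurrence strength between two words and second-order proximity is the similarity of their neighbourhoods in a word co-occurrence graph:

```python
from collections import Counter, defaultdict

def build_cooccurrence_graph(sentences, window=2):
    """Weighted undirected co-occurrence graph: edge weight = co-occurrence count."""
    graph = defaultdict(Counter)
    for tokens in sentences:
        for i, w in enumerate(tokens):
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                if tokens[j] != w:
                    graph[w][tokens[j]] += 1
                    graph[tokens[j]][w] += 1
    return graph

def first_order(graph, u, v):
    """First-order proximity: direct edge weight, normalized by u's total degree."""
    total = sum(graph[u].values())
    return graph[u][v] / total if total else 0.0

def second_order(graph, u, v):
    """Second-order proximity: cosine similarity of the two words' neighbour profiles."""
    nu, nv = graph[u], graph[v]
    dot = sum(nu[w] * nv[w] for w in set(nu) & set(nv))
    norm = lambda c: sum(x * x for x in c.values()) ** 0.5
    denom = norm(nu) * norm(nv)
    return dot / denom if denom else 0.0

# Tiny illustrative corpus: "cat" and "dog" never co-occur directly
# (zero first-order proximity) but share identical neighbourhoods,
# giving maximal second-order proximity.
g = build_cooccurrence_graph([["the", "cat", "sat"], ["the", "dog", "sat"]])
```

Under this interpretation, embedding methods that preserve both measures can place "cat" and "dog" near each other even when they never appear together, because their graph neighbourhoods overlap.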
Recommended Citation
Sandhu, Tanvi, "Exploration of Word Embeddings with Graph-Based Context Adaptation for Enhanced Word Vectors" (2024). Electronic Theses and Dissertations. 9500.
https://scholar.uwindsor.ca/etd/9500