Date of Award
9-20-2024
Publication Type
Thesis
Degree Name
M.Sc.
Department
Computer Science
Keywords
Drug Side Effect Frequency; Fine Tuning; Large Language Models
Supervisor
Alioune Ngom
Abstract
Large language models (LLMs) have brought about a paradigm shift in natural language processing. Their large scale, deep architectures, and pre-training on massive amounts of data enable them to learn rich and nuanced representations of language, and they have demonstrated impressive performance on natural language understanding tasks across different domains. Recent works have started incorporating LLMs into pharmacological domains such as drug discovery and drug interactions. Drugs play a crucial role in alleviating pain and curing diseases but often come with unintended side effects, which can lead to significant health risks and financial costs. Early detection of these side effects during drug development is essential to avoid adverse outcomes. Recent studies have started to focus on a relatively new problem: predicting the frequencies of given side effects, which is an important factor in evaluating therapeutic efficacy. This area remains underexplored, with only a few studies dedicated to it so far. In this study, we introduce a novel architecture that uses LLMs to generate embeddings from drug and side effect attributes in order to predict the frequencies of drug side effects as well as the high-frequency drug side effects. We used Galeano's dataset, a standard benchmark dataset for drug side effect frequency prediction. Our approach used different LLMs to generate embeddings and fine-tuned them to predict the frequencies. Measuring the frequency of side effects can help determine the therapeutic efficacy of a drug in clinical settings and help weigh the potential risks and benefits of certain drugs. The key objective of this research is to evaluate the performance of large language models in predicting the frequencies of drug side effects.
Recommended Citation
Chowdhury, Siyam Sajnan, "LLMPred: Fine-Tuned Large Language Model Embeddings for Drug Side Effect Frequency Prediction" (2024). Electronic Theses and Dissertations. 9541.
https://scholar.uwindsor.ca/etd/9541