Date of Award

9-25-2024

Publication Type

Thesis

Degree Name

M.Sc.

Department

Computer Science

Keywords

Drug Combination; Large Language Models; Polypharmacy Side Effect; SMILES

Supervisor

Alioune Ngom

Abstract

Polypharmacy, the concurrent use of multiple drugs, is a common strategy for treating patients with complex diseases or multiple conditions. Although drug combinations can be beneficial, they can also cause unintended drug-drug interactions (DDIs) and increase the risk of adverse side effects. Predicting these side effects can significantly assist clinicians. In this study, we assess the impact of different language models on generating embeddings from the text representation of drugs, specifically the Simplified Molecular Input Line-Entry System (SMILES), to predict polypharmacy side effects. We first retrieve the SMILES sequences of drugs from the PubChem database and then encode these strings using various models, such as ChemBERTa, GPT, BERT, and Mol2vec, to obtain a representation of each drug. These representations are then fused to create a representation of each drug pair. The drug-pair representations are fed separately into two distinct models, a Multilayer Perceptron (MLP) and a Graph Neural Network (GNN), to predict polypharmacy side effects. Our evaluation shows that using these language models with the MLP and GNN improves performance over our baseline studies. Notably, combining the embeddings of fine-tuned ChemBERTa with the GNN architecture yields more effective results than the other methods. This study highlights the effectiveness of using complex models such as language models to generate feature representations based solely on the chemical structures of drugs, even without incorporating other entities such as proteins or cell lines.
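
The encode, fuse, and predict steps described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the thesis's implementation: `embed_smiles` is a hashed character-trigram stand-in for a language-model encoder such as ChemBERTa, `fuse_pair` uses an order-invariant sum-and-difference fusion (the abstract does not name the fusion operator), and `TinyMLP` is an untrained two-layer perceptron with hypothetical layer sizes, so its outputs carry no predictive meaning.

```python
import hashlib
import math
import random


def embed_smiles(smiles, dim=16):
    """Stand-in encoder (assumption): hashed character-trigram counts,
    L2-normalised. A real pipeline would use a language model here."""
    vec = [0.0] * dim
    padded = "^" + smiles + "$"
    for i in range(len(padded) - 2):
        gram = padded[i:i + 3]
        digest = hashlib.sha256(gram.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def fuse_pair(a, b):
    """Hypothetical fusion: element-wise sum concatenated with absolute
    difference, so that (A, B) and (B, A) give the same pair vector."""
    summed = [x + y for x, y in zip(a, b)]
    diff = [abs(x - y) for x, y in zip(a, b)]
    return summed + diff


class TinyMLP:
    """Untrained two-layer perceptron with one sigmoid output per side effect."""

    def __init__(self, in_dim, hidden_dim, n_side_effects, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)]
                   for _ in range(hidden_dim)]
        self.w2 = [[rng.gauss(0.0, 0.1) for _ in range(hidden_dim)]
                   for _ in range(n_side_effects)]

    def predict_proba(self, x):
        # ReLU hidden layer followed by independent sigmoid outputs
        hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in self.w1]
        logits = [sum(w * h for w, h in zip(row, hidden)) for row in self.w2]
        return [1.0 / (1.0 + math.exp(-z)) for z in logits]


# Example drug pair given as SMILES strings (aspirin and ibuprofen)
aspirin = "CC(=O)OC1=CC=CC=C1C(=O)O"
ibuprofen = "CC(C)CC1=CC=C(C=C1)C(C)C(=O)O"

pair_vec = fuse_pair(embed_smiles(aspirin), embed_smiles(ibuprofen))
model = TinyMLP(in_dim=len(pair_vec), hidden_dim=8, n_side_effects=3)
probs = model.predict_proba(pair_vec)
```

In this sketch the multi-label output treats each side effect as an independent binary prediction for the drug pair. Swapping `embed_smiles` for a pretrained encoder and training the classifier would be the next steps, and the GNN variant mentioned in the abstract is not covered here.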

Recommended Citation

Hakim, Sadra, "Comparative Analysis of Large Language Models for Polypharmacy Side Effect Prediction" (2024). Electronic Theses and Dissertations. 9548.
https://scholar.uwindsor.ca/etd/9548

Included in

Bioinformatics Commons

Scholarship at UWindsor

Electronic Theses and Dissertations

Comparative Analysis of Large Language Models for Polypharmacy Side Effect Prediction
