Date of Award

11-5-2020

Publication Type

Master's Thesis

Degree Name

M.Sc.

Department

Computer Science

First Advisor

Jessica Chen

Keywords

Cancer Subtype Classification, Convolutional Neural Networks, Few-shot Learning, RNA-Seq

Rights

info:eu-repo/semantics/openAccess

Creative Commons License

Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Abstract

Diagnosing the correct subtype of a disease is essential to effective treatment, yet the diagnosis is not always straightforward from biological tests, especially in the early stages of the disease. The human body responds to disease by producing certain proteins. If we know which genes are active, that is, which proteins are being produced, we can classify disease subtypes more accurately. This study classifies cancer subtypes using genetic information extracted from a patient's biological sample; among the different types of genetic data, we consider RNA-Seq data in this thesis. Studies based on genetic information often suffer from very limited sample sizes, and few-shot learning has recently been studied for disease classification. Given the success of neural networks in data analysis, mostly with large amounts of data, we perform few-shot learning by retraining neural networks with genetic-algorithm processes. Following the proposal from the Human Genome Organization (HUGO) to group genes by their chemical composition, we apply genetic algorithms to the HUGO gene groups to help retrain the neural networks. We have implemented the proposed approach and compared its performance with a wide variety of existing machine-learning and neural-network methods on three cancer datasets. In our experiments, the proposed approach performs similarly to other methods when a relatively large amount of data is available, and it outperforms AffinityNet by an average of 4 percent in few-shot learning on small datasets.
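The abstract describes evolving selections of HUGO gene groups with a genetic algorithm before retraining a classifier. As a rough illustration of that idea only, the sketch below runs a minimal genetic algorithm over binary masks of gene groups. The group count, the toy fitness function, and the set of "informative" groups are all illustrative assumptions, not the author's actual implementation; in the thesis, fitness would come from retraining and evaluating the neural network on the selected groups.

```python
# Minimal sketch (assumed, not the thesis code): a genetic algorithm
# evolves binary masks over gene groups; selected groups would then be
# used to retrain a neural network for few-shot classification.
import random

random.seed(0)

N_GROUPS = 12       # number of HUGO gene groups (illustrative)
POP_SIZE = 20
GENERATIONS = 30

# Toy ground truth: pretend groups 2, 5 and 9 carry the signal.
INFORMATIVE = {2, 5, 9}

def fitness(mask):
    """Toy surrogate for classifier accuracy: reward keeping
    informative groups, lightly penalize uninformative ones."""
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    noise = sum(1 for i, bit in enumerate(mask) if bit and i not in INFORMATIVE)
    return hits - 0.2 * noise

def crossover(a, b):
    # Single-point crossover between two parent masks.
    point = random.randrange(1, N_GROUPS)
    return a[:point] + b[point:]

def mutate(mask, rate=0.1):
    # Flip each bit independently with probability `rate`.
    return [bit ^ (random.random() < rate) for bit in mask]

# Random initial population of binary masks.
population = [[random.randint(0, 1) for _ in range(N_GROUPS)]
              for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    elite = population[:POP_SIZE // 2]          # elitism: keep the best half
    children = [mutate(crossover(random.choice(elite), random.choice(elite)))
                for _ in range(POP_SIZE - len(elite))]
    population = elite + children

best = max(population, key=fitness)
selected = [i for i, bit in enumerate(best) if bit]
print("selected gene groups:", selected)
```

Because the elite half of each generation is carried over unchanged, the best fitness never decreases, so the search reliably drifts toward masks that keep the informative groups. In the actual method, evaluating a mask would be far more expensive, since each candidate implies a retraining run.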
