Date of Award
2011
Publication Type
Master's Thesis
Degree Name
M.Sc.
Department
Computer Science
Keywords
Applied sciences
Supervisor
Alioune Ngom
Rights
info:eu-repo/semantics/openAccess
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Abstract
We address the linear separability of gene expression data sets with respect to two classes, a concept recently studied in the literature. The problem is to efficiently find all pairs of genes that induce a linear separation of the data. We study the Containment Angle (CA), defined on the unit circle for a linearly separating gene pair (LS-pair), as an alternative to the paired t-test ranking function for gene selection. Using the CA, we also show empirically that a given classifier's error is related to the degree of linear separability of a given data set. Finally, we propose gene subset selection methods that select only among linearly separating genes (LS-genes) and LS-pairs, using the CA ranking function for LS-pairs and a ranking function for LS-genes. Overall, on many gene expression data sets, our proposed methods give better results in terms of subset size and classification accuracy when compared to well-performing methods.
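To make the notion of an LS-pair concrete, the following is a minimal illustrative sketch, not the thesis's algorithm. It assumes strict separability (no sample lies on the separating line) and uses a brute-force test: two finite point sets in the plane are strictly separable exactly when some candidate direction, taken from the differences between sample points or their perpendiculars, projects the two classes onto disjoint intervals. The function names `is_linearly_separable` and `ls_pairs` and the toy data are hypothetical. The sketch does not compute the Containment Angle, which the abstract does not define, and it makes no attempt at the efficiency the thesis aims for.

```python
from itertools import combinations


def is_linearly_separable(class_a, class_b):
    """Return True if two non-empty 2-D point sets are strictly separable by a line."""
    points = list(class_a) + list(class_b)
    # Candidate normals: every pairwise difference and its perpendicular.
    directions = []
    for (x1, y1), (x2, y2) in combinations(points, 2):
        dx, dy = x2 - x1, y2 - y1
        if dx == 0 and dy == 0:
            continue  # identical points give no direction
        directions.append((dx, dy))
        directions.append((-dy, dx))
    # The classes are separable if some projection yields disjoint intervals.
    for wx, wy in directions:
        proj_a = [wx * x + wy * y for x, y in class_a]
        proj_b = [wx * x + wy * y for x, y in class_b]
        if max(proj_a) < min(proj_b) or max(proj_b) < min(proj_a):
            return True
    return False


def ls_pairs(expr, labels):
    """List gene pairs whose 2-D projection separates class 0 from class 1.

    expr:   dict mapping gene name -> list of expression values, one per sample
    labels: list of 0/1 class labels, one per sample
    """
    found = []
    for g, h in combinations(sorted(expr), 2):
        a = [(expr[g][i], expr[h][i]) for i, c in enumerate(labels) if c == 0]
        b = [(expr[g][i], expr[h][i]) for i, c in enumerate(labels) if c == 1]
        if is_linearly_separable(a, b):
            found.append((g, h))
    return found


# Toy data (made up for illustration): three genes, four samples, two classes.
toy_expr = {"g1": [1, 2, 8, 9], "g2": [5, 1, 4, 2], "g3": [1, 5, 5, 1]}
toy_labels = [0, 0, 1, 1]
toy_result = ls_pairs(toy_expr, toy_labels)
```

On this toy data, the pairs that include g1 separate the classes, while (g2, g3) does not, because the segments joining the samples of each class cross. The test costs cubic time in the number of samples per gene pair, so it is suitable only for illustrating the definition.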
Recommended Citation
Jafarian, Amirali, "Gene Subset Selection Approaches Based on Linear Separability" (2011). Electronic Theses and Dissertations. 7908.
https://scholar.uwindsor.ca/etd/7908