Date of Award
2022
Publication Type
Thesis
Degree Name
M.Sc.
Department
Computer Science
Keywords
Cell-cell communication, Graph convolutional neural network, Latent feature approaches, Link prediction, Single-cell RNA-seq, Subgraph embedding
Supervisor
L. Rueda
Supervisor
N. Zhang
Rights
info:eu-repo/semantics/openAccess
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
Graph-structured data has become increasingly common in fields ranging from biological networks to social networks, and link prediction is one of the key problems in graph theory. Cell-cell communication regulates individual cell activities and is a crucial part of tissue structure and function, and recent advances in single-cell RNA sequencing technologies have eased routine analyses of intercellular signaling networks. Previous studies have proposed various link prediction approaches. These approaches make specific assumptions about when nodes are likely to interact and thus perform well only on some specific networks. Subgraph-based methods address this problem by extracting local subgraphs from a given network, and they have outperformed other approaches.
In this work, we present a novel method, called Subgraph Embedding of Gene expression matrix for prediction of CEll-cell COmmunication (SEGCECO), which uses an attributed graph convolutional neural network to predict cell-cell communication from single-cell RNA-seq data. SEGCECO captures both the latent and the explicit attributes of undirected, attributed graphs constructed from the gene expression profiles of individual cells. Because single-cell RNA-seq data are high-dimensional and sparse, converting them to a graph representation is a daunting task. We overcome this limitation by applying SoptSC, a similarity-based optimization method in which the cell-cell similarity matrix is learned from single-cell gene expression data. The cell-cell communication network is then built using this similarity matrix.
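The two graph steps mentioned above, building a network from a cell-cell similarity matrix and extracting a local subgraph around a candidate link, can be illustrated with a minimal sketch. It is not the SEGCECO implementation: the similarity matrix is a hand-written toy rather than SoptSC output, and the threshold of 0.5, the one-hop radius, and the function names `build_graph` and `enclosing_subgraph` are assumptions made for illustration.

```python
from collections import deque


def build_graph(similarity, threshold):
    """Build an undirected adjacency dict from a symmetric similarity matrix.

    Assumption: two cells are linked when their similarity is at least
    `threshold`. The thesis may use a different rule.
    """
    n = len(similarity)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if similarity[i][j] >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def enclosing_subgraph(adj, u, v, hops=1):
    """Return nodes within `hops` of u or v, plus the edges among them."""
    distance = {u: 0, v: 0}
    queue = deque([u, v])
    while queue:
        node = queue.popleft()
        if distance[node] == hops:
            continue  # do not expand beyond the requested radius
        for neighbor in adj[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    nodes = set(distance)
    edges = {(a, b) for a in nodes for b in adj[a] if b in nodes and a < b}
    return nodes, edges


# Toy 5-cell similarity matrix (hypothetical values, not real data).
similarity = [
    [1.0, 0.9, 0.2, 0.1, 0.0],
    [0.9, 1.0, 0.8, 0.1, 0.0],
    [0.2, 0.8, 1.0, 0.7, 0.1],
    [0.1, 0.1, 0.7, 1.0, 0.6],
    [0.0, 0.0, 0.1, 0.6, 1.0],
]
graph = build_graph(similarity, threshold=0.5)
nodes, edges = enclosing_subgraph(graph, 0, 2, hops=1)
print(sorted(nodes), sorted(edges))
```

In a subgraph-based link predictor, a subgraph like this one would be extracted for each candidate pair of cells and passed to the model as input. How node attributes are attached and how the subgraph is embedded are not specified in this abstract.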
To evaluate our proposed method, we performed experiments on six scRNA-seq datasets from human and mouse pancreas tissue. Our comparative analysis shows that SEGCECO outperforms latent feature-based approaches, as well as WLNM, the state-of-the-art method for link prediction, achieving an area under the ROC curve of 0.99 and 99% prediction accuracy.
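For readers unfamiliar with the two reported metrics, the sketch below shows how the area under the ROC curve and prediction accuracy can be computed from link scores. The labels and scores are invented toy values, not results from the thesis, and the 0.5 decision cutoff is an assumption.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen true link is scored
    higher than a randomly chosen non-link (ties count as one half).
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy(labels, scores, cutoff=0.5):
    """Fraction of pairs classified correctly at the given score cutoff."""
    predicted = [1 if s >= cutoff else 0 for s in scores]
    correct = sum(p == label for p, label in zip(predicted, labels))
    return correct / len(labels)


# Hypothetical link labels (1 = link, 0 = no link) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
auc_value = roc_auc(labels, scores)
acc_value = accuracy(labels, scores)
print(round(auc_value, 3), round(acc_value, 3))
```

The pairwise form is quadratic in the number of examples. It is used here only because it makes the definition explicit. Library routines compute the same quantity from sorted scores.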
Recommended Citation
Hora, Sheena, "SEGCECO: Subgraph Embedding of Gene expression matrix for prediction of CEll-cell COmmunication" (2022). Electronic Theses and Dissertations. 8900.
https://scholar.uwindsor.ca/etd/8900