Identifying microRNA precursors using linear dimensionality reduction with explicit feature mapping

Navid Shakibapour Tabrizi, University of Windsor

Abstract

MicroRNAs are a class of small RNAs, about 20 nucleotides long, that regulate cellular processes in animals and plants. Identifying microRNAs is an important task in microRNA and transcriptional studies. The main signal used to identify these tiny molecules is the hairpin secondary structure of microRNA precursors. In this research, I propose a linear dimensionality reduction (LDR)-based classifier to distinguish precursor microRNAs from both pseudo hairpins and other non-coding RNAs. LDR is widely used in machine learning and pattern recognition problems. Because of the complexity of the data and the nature of the problem, linear classifiers may not perform acceptably on the original features. Therefore, I propose to use an explicit mapping to project the data onto a higher-dimensional space in order to increase class separability. Feature selection methods are used to reduce the complexity of the classifier and to find relevant biological descriptors.
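
The idea of combining an explicit mapping with a linear method can be illustrated with a small sketch. The abstract does not state which mapping or which LDR criterion is used, so everything here is an assumption made for illustration: a degree-2 polynomial map, a two-class Fisher discriminant as the linear reduction, and synthetic ring-shaped toy data standing in for real hairpin features. Feature selection is left out, and all function names are hypothetical.

```python
import math
import random

def quadratic_map(x):
    # Assumed explicit mapping: (x1, x2) -> (x1, x2, x1^2, x2^2, x1*x2).
    x1, x2 = x
    return [x1, x2, x1 * x1, x2 * x2, x1 * x2]

def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def within_class_scatter(classes, means, ridge=1e-6):
    # Sum of per-class scatter matrices, plus a small ridge for stability.
    d = len(means[0])
    sw = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    for rows, mu in zip(classes, means):
        for r in rows:
            diff = [a - b for a, b in zip(r, mu)]
            for i in range(d):
                for j in range(d):
                    sw[i][j] += diff[i] * diff[j]
    return sw

def solve(a, b):
    # Solve a @ x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def project(w, v):
    return sum(wi * vi for wi, vi in zip(w, v))

def fit_fisher(class0, class1):
    # Fisher direction w = Sw^{-1} (m1 - m0); the decision threshold is
    # the midpoint of the two projected class means.
    m0, m1 = mean_vector(class0), mean_vector(class1)
    sw = within_class_scatter([class0, class1], [m0, m1])
    w = solve(sw, [a - b for a, b in zip(m1, m0)])
    threshold = 0.5 * (project(w, m0) + project(w, m1))
    return w, threshold

def accuracy(w, threshold, class0, class1):
    correct = sum(project(w, v) <= threshold for v in class0)
    correct += sum(project(w, v) > threshold for v in class1)
    return correct / (len(class0) + len(class1))

def make_ring(n, radius, rng, noise=0.15):
    # Toy data only: noisy points on a circle of the given radius.
    pts = []
    for _ in range(n):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        r = radius + rng.gauss(0.0, noise)
        pts.append([r * math.cos(angle), r * math.sin(angle)])
    return pts

def demo(seed=0):
    rng = random.Random(seed)
    inner = make_ring(200, 1.0, rng)   # class 0
    outer = make_ring(200, 3.0, rng)   # class 1
    # Linear discriminant on the original 2-D features.
    w, t = fit_fisher(inner, outer)
    raw_acc = accuracy(w, t, inner, outer)
    # Same linear discriminant after the explicit quadratic mapping.
    inner_m = [quadratic_map(p) for p in inner]
    outer_m = [quadratic_map(p) for p in outer]
    w, t = fit_fisher(inner_m, outer_m)
    mapped_acc = accuracy(w, t, inner_m, outer_m)
    return raw_acc, mapped_acc

raw_acc, mapped_acc = demo()
print("accuracy on original features:", raw_acc)
print("accuracy after explicit mapping:", mapped_acc)
```

On this toy problem the two classes are concentric rings, so no straight line in the original space separates them, whereas the squared terms introduced by the mapping make them separable along a single projected direction. Accuracy is measured on the training points only, which is enough to show the effect but is not an evaluation protocol.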