Protein family classification using multiple-class neural networks.
Date of Award
2004
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
The objective of genomic sequence analysis is to retrieve important information from the vast amount of genomic sequence data, such as DNA, RNA and protein sequences. The main tasks include interpreting the function of DNA sequences on a genomic scale, comparing genomes to gain insight into the universality of biological mechanisms and into the details of gene structure and function, determining the structure of all proteins, and classifying proteins into families. With its capabilities for recognition, generalization and classification, artificial neural network technology is well suited for sequence analysis. Many methods have been devised to determine whether a given protein sequence is a member of a given protein superfamily. This is a binary classification problem, and efficient neural network techniques for solving it are described in the literature. In this Master's thesis, we consider the problem of classifying a given protein sequence into one of at least three protein families using neural networks, and propose two methods for this problem: the "Pair-wise Multiple Classification Approach" and the "Single Network Approach". The "Pair-wise Multiple Classification Approach" employs several sub-networks to perform the task, whereas the "Single Network Approach" uses one compact network. We performed experiments on our networks with different input/output representations, using the SNNS and UOWNNS neural network simulators, and report accuracies as high as 95%.
Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .Z54. Source: Masters Abstracts International, Volume: 43-01, page: 0248. Adviser: Alioune Ngom. Thesis (M.Sc.)--University of Windsor (Canada), 2004.
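As an illustration of the pair-wise idea named in the abstract, the following minimal Python sketch builds one binary classifier per pair of families and combines their decisions by majority vote. It is not the thesis's implementation. The amino-acid composition encoding, the nearest-centroid stand-in for a trained sub-network, the majority-vote rule, and all function names and toy sequences are assumptions made only for this example.

```python
from collections import Counter
from itertools import combinations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Encode a sequence as 20 amino-acid frequencies (assumed input encoding)."""
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]


def make_pairwise_classifier(centroid_a, centroid_b, label_a, label_b):
    """Stand-in for a trained binary sub-network: choose the nearer centroid."""
    def classify(x):
        dist_a = sum((xi - ci) ** 2 for xi, ci in zip(x, centroid_a))
        dist_b = sum((xi - ci) ** 2 for xi, ci in zip(x, centroid_b))
        return label_a if dist_a <= dist_b else label_b
    return classify


def train_pairwise(examples):
    """Build one binary classifier for every unordered pair of families.

    `examples` maps a family name to a list of training sequences.
    """
    centroids = {}
    for family, seqs in examples.items():
        vectors = [composition(s) for s in seqs]
        centroids[family] = [sum(col) / len(vectors) for col in zip(*vectors)]
    return [
        make_pairwise_classifier(centroids[a], centroids[b], a, b)
        for a, b in combinations(sorted(examples), 2)
    ]


def predict(classifiers, seq):
    """Each pair-wise classifier casts one vote; the most-voted family wins."""
    x = composition(seq)
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Made-up toy sequences for three hypothetical families (not real data).
toy_families = {
    "A": ["AAAAAG", "AAAGAA"],
    "K": ["KKKKKE", "KKEKKK"],
    "W": ["WWWWWF", "WWFWWW"],
}
classifiers = train_pairwise(toy_families)
print(len(classifiers), predict(classifiers, "AAAAGA"))
```

With K families this scheme needs K(K-1)/2 binary classifiers, whereas a single-network design would typically use one network with one output per family.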
Zhang, Xi, "Protein family classification using multiple-class neural networks." (2004). Electronic Theses and Dissertations. 3213.