Date of Award

2014

Publication Type

Doctoral Thesis

Degree Name

Ph.D.

Department

Computer Science

Keywords

Computational geometry, Geometric proximity, Linear programming, Protein structure alignment

Supervisor

Mukhopadhyay, Asish

Rights

info:eu-repo/semantics/openAccess

Abstract

The theme of this dissertation is geometric optimization and its applications. We study geometric proximity problems and several bioinformatics problems with a geometric content, requiring the use of geometric optimization tools. We have investigated the following type of proximity problems. Given a point set in a plane with n distinct points, for each point in the set find a pair of points from the remaining points in the set such that the three points either maximize or minimize some geometric measure defined on these. The measures include (a) sum and product; (b) difference; (c) line–distance; (d) triangle area; (e) triangle perimeter; (f) circumcircle–radius; and (g) triangle–distance in three dimensions. We have also studied the application of a linear time incremental geometric algorithm to test the linear separability of a set of blue points from a set of red points, in two and three–dimensional Euclidean spaces. We have used this geometric separability tool on 4 different gene expression data–sets, enumerating gene–pairs and gene–triplets that are linearly separable. Pushing on further, we have exploited this novel tool to identify some bio–marker genes for a classifier. The gene selection method proposed in the dissertation exhibits good classification accuracy as compared to other known feature (or gene) selection methods such as t–values, FCS (Fisher Criterion Score) and SAM (Significance Analysis of Microarrays). Continuing this line of investigation further, we have also designed an efficient algorithm to find the minimum number of outliers when the red and blue point sets are not fully linearly separable. We have also explored the applicability of geometric optimization techniques to the problem of protein structure similarity. We have come up with two new algorithms, EDAlignres and EDAlignsse, for pairwise protein structure alignment. EDAlignres identifies the best structural alignment of two equal length proteins by refining the correspondence obtained from eigendecomposition and to maximize the similarity measure for the refined correspondence. EDAlignsse, on the other hand, does not require the input proteins to be of equal length. These have been fully implemented and tested against well-established protein alignment programs

Share

COinS