Date of Award

December 2017

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Electrical Engineering and Computer Science

Advisor(s)

Kishan G. Mehrotra

Second Advisor

Chilukuri K. Mohan

Keywords

associative patterns, biological regulation, epigenetics, histone modification

Subject Categories

Engineering

Abstract

In the last decade, bioinformatics data has been accumulated at an unprecedented rate, thanks to the advancement in sequencing technologies. Such rapid development poses both challenges and promising research topics. In this dissertation, we propose a series of associative pattern recognition algorithms in biological regulation studies. In particular, we emphasize efficiently recognizing associative patterns between genes, transcription factors, histone modifications and functional labels using heterogeneous data sources (numeric, sequences, time series data and textual labels).

In protein-DNA associative pattern recognition, we introduce an efficient algorithm for affinity test by searching for over-represented DNA sequences using a hash function and modulo addition calculation. This substantially improves the efficiency of \textit{next generation sequencing} data analysis. In gene regulatory network inference, we propose a framework for refining weak networks based on transcription factor binding sites, thus improved the precision of predicted edges by up to 52%. In histone modification code analysis, we propose an approach to genome-wide combinatorial pattern recognition for "histone code to function" associative pattern recognition, and achieved improvement by up to $38.1\%$. We also propose a novel shape based modification pattern analysis approach, using this to successfully predict sub-classes of genes in flowering-time category. We also propose a "combination to combination" associative pattern recognition, and achieved better performance compared against multi-label classification and bidirectional associative memory methods. Our proposed approaches recognize associative patterns from different types of data efficiently, and provides a useful toolbox for biological regulation analysis. This dissertation presents a road-map to associative patterns recognition at genome wide level.

Access

Open Access

Included in

Engineering Commons

Share

COinS