Abstract (eng)
A Phylogenetic Definition of Structure
What is a structure?
The present thesis poses this question in the field of RNA molecular biology. In doing so, it aims to contribute to the understanding of the intertwined relationship between structure, substitution process and evolutionary history. The thesis starts with an introduction to two fields, RNA and phylogeny, followed by the research chapters.
SISSI’s Simulacrum, a framework for SImulating Site-Specific Interactions along phylogenetic trees, mimics sequence evolution under structural constraints in a unifying setting that accommodates arbitrarily complex models of sequence evolution. This feeds into:
A Phylogenetic Definition of Structure, which consists of three aspects: the substitution matrix, a neighbourhood system and the phylogenetic tree. The substitution matrix specifies the evolutionary process of nucleotide evolution. However, the matrix is influenced by the neighbourhood system, which defines the interactions among sites in a sequence. The phylogenetic tree introduces an additional dependency pattern in the observed sequences. In this chapter the general ideas of a Phylogenetic Structure (PS) are illustrated with examples. The thesis then focusses on particular approaches, devoting one chapter to each of the three aspects of a PS.
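The interplay of the three aspects can be made concrete with a toy sketch. The following is not the SISSI implementation; it assumes a Jukes-Cantor (JC69) substitution process, a neighbourhood system given as a symmetric map of base-paired sites, and a tree written as nested tuples. All names (`jc69_same`, `NEIGHBOURS`, `TREE`, `simulate`) are hypothetical, and the rule that a paired site simply follows its partner is a deliberately crude stand-in for site-specific interactions.

```python
import math
import random

NUCS = "ACGU"
PAIR_OF = {"A": "U", "U": "A", "C": "G", "G": "C"}  # Watson-Crick partners only

def jc69_same(t):
    """Probability that a site shows the same nucleotide after branch length t (JC69)."""
    return 0.25 + 0.75 * math.exp(-4.0 * t / 3.0)

def evolve_site(nuc, t, rng):
    """Aspect 1, substitution process: draw the end state of one site under JC69."""
    if rng.random() < jc69_same(t):
        return nuc
    return rng.choice([n for n in NUCS if n != nuc])

def evolve_sequence(seq, t, neighbours, rng):
    """Aspect 2, neighbourhood system: a paired site depends on its partner.

    Toy assumption: the 5' partner evolves under JC69 and the 3' partner is
    set to its Watson-Crick complement; unpaired sites evolve independently.
    """
    out = list(seq)
    for i, nuc in enumerate(seq):
        j = neighbours.get(i)
        if j is None:
            out[i] = evolve_site(nuc, t, rng)
        elif i < j:
            out[i] = evolve_site(nuc, t, rng)
            out[j] = PAIR_OF[out[i]]
    return "".join(out)

def simulate(node, parent_seq, neighbours, rng, leaves):
    """Aspect 3, phylogenetic tree: pass sequences down the branches to the leaves."""
    name, length, children = node
    seq = evolve_sequence(parent_seq, length, neighbours, rng)
    if not children:
        leaves[name] = seq
    for child in children:
        simulate(child, seq, neighbours, rng, leaves)
    return leaves

# Hypothetical example: a 6-nt hairpin with pairs (0,5) and (1,4) on a 3-leaf tree.
NEIGHBOURS = {0: 5, 5: 0, 1: 4, 4: 1}
TREE = ("root", 0.0, [("A", 0.3, []),
                      ("anc", 0.1, [("B", 0.2, []), ("C", 0.2, [])])])

if __name__ == "__main__":
    alignment = simulate(TREE, "GCAAGC", NEIGHBOURS, random.Random(1), {})
    for name, seq in sorted(alignment.items()):
        print(name, seq)
```

In the resulting alignment, columns 0 and 5 (and 1 and 4) covary because of the neighbourhood system, while sequences B and C resemble each other more than A because of the shared branch in the tree: two distinct sources of dependency in the observed data.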
MATA’s Neighbourhood System Aspect is considered in the context of the so-called consensus structure of an alignment. Using the parametric bootstrap, MATA (Measurement of Accurate Thresholds of Alignments) detects functionally associated correlations in a sequence alignment by incorporating the phylogeny of the sequences and combining it with an automatic threshold procedure.
SISSIz’s Substitution Model Aspect is illustrated in the field of non-coding RNAs. We build on the SISSI framework to directly combine a new null model, based on a complex substitution model, with a consensus folding algorithm. The result is a new variant of a thermodynamic structure-based RNA gene finding program that is not biased by the dinucleotide content.
OSM’s Phylogenetic Tree Aspect introduces another view on sequence evolution. The One Step Mutation Matrix encodes the phylogenetic tree directly and leads to analytical formulae for the posterior probability distribution of the number of substitutions for an alignment column. So far, our phylogenetic definition of structure has specified the evolutionary process of nucleotide evolution with site-specific interactions. Here, the definition is discussed as a description in pattern space.
The outlook discusses the (in)completeness of a phylogenetic definition of structure. Nevertheless, the three approaches to each aspect, together with SISSI, provide a very promising possibility to unite all three aspects of a PS. Finally, the thesis concludes with a description of how these methods can be combined towards structure evolution, which returns to the original question: What is a structure?