A predictive model for secondary RNA structure using graph theory and a neural network.

Research paper by Denise R Koessler, Debra J Knisley, Jeff Knisley, Teresa Haynes

Indexed on: 22 Oct '10
Published on: 22 Oct '10
Published in: BMC Bioinformatics


Determining the secondary structure of RNA from the primary structure is a challenging computational problem. A number of algorithms have been developed to predict the secondary structure from the primary structure. It is agreed that there is still room for improvement in each of these approaches. In this work we build a predictive model for secondary RNA structure using a graph-theoretic tree representation of secondary RNA structure. We model the bonding of two RNA secondary structures to form a larger secondary structure with a graph operation we call merge. We consider all combinatorial possibilities using all possible tree inputs, both those that are RNA-like in structure and those that are not. The resulting data from each tree merge operation is represented by a vector. We use these vectors as input values for a neural network and train the network to recognize a tree as RNA-like or not, based on the merge data vector. The network estimates the probability of a tree being RNA-like.

The network correctly assigned a high probability of RNA-likeness to trees previously identified as RNA-like and a low probability of RNA-likeness to those classified as not RNA-like. We then used the neural network to predict the RNA-likeness of the unclassified trees.

There are a number of secondary RNA structure prediction algorithms available online. These programs are based on finding the secondary structure with the lowest total free energy. In this work, we create a predictive tool for secondary RNA structures using graph-theoretic values as input for a neural network. The use of a graph operation to theoretically describe the bonding of secondary RNA is novel and is an entirely different approach to the prediction of secondary RNA structures. Our method correctly predicted trees to be RNA-like or not RNA-like for all known cases. In addition, our results convey a measure of likelihood that a tree is RNA-like or not RNA-like. Given that the majority of secondary RNA folding algorithms return more than one possible outcome, our method provides a means of determining the best or most likely structures among all of the possible outcomes.
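
The abstract names the merge operation without defining it. As a minimal sketch, assume that merging two trees means identifying one chosen vertex of the first tree with one chosen vertex of the second, so that the result is again a tree. The function names (make_tree, merge, all_merges, is_tree) and the adjacency-map representation below are hypothetical choices made for illustration, not the authors' implementation, and the sketch does not attempt to reproduce the merge data vector or the neural network.

```python
from itertools import product


def make_tree(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def merge(t1, u, t2, v):
    """Merge two trees by identifying vertex u of t1 with vertex v of t2.

    ASSUMPTION: this vertex-identification definition is an illustrative
    reading of "merge"; the abstract does not spell the operation out.
    Vertices are relabeled (1, name) and (2, name) so the inputs may share
    labels; the identified vertex keeps the label (1, u).
    """
    def relabel(tag, name):
        return (1, u) if (tag == 2 and name == v) else (tag, name)

    merged = {}
    for tag, tree in ((1, t1), (2, t2)):
        for x, nbrs in tree.items():
            merged.setdefault(relabel(tag, x), set()).update(
                relabel(tag, y) for y in nbrs
            )
    return merged


def all_merges(t1, t2):
    """Yield ((u, v), merged_tree) for every choice of one vertex from each tree."""
    for u, v in product(t1, t2):
        yield (u, v), merge(t1, u, t2, v)


def is_tree(adj):
    """Check that the graph is connected and has exactly |V| - 1 edges."""
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    if n_edges != len(adj) - 1:
        return False
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for y in adj[stack.pop()]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == len(adj)


# Example: merge the middle vertex of a 3-vertex path with the center of a 4-vertex star.
path = make_tree([(0, 1), (1, 2)])
star = make_tree([(0, 1), (0, 2), (0, 3)])
merged = merge(path, 1, star, 0)
print(len(merged), "vertices;", is_tree(merged))
```

Under this assumed definition, two trees with n1 and n2 vertices admit n1 * n2 vertex pairings, and every merged result has n1 + n2 - 1 vertices. Enumerating these pairings is one way to read "all combinatorial possibilities" in the abstract, though the paper may count or filter them differently.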