Indexed on: 07 Mar '06
Published on: 07 Mar '06
Published in: BMC Bioinformatics
It has become increasingly apparent that a comprehensive database of RNA motifs is essential in order to achieve new goals in genomic and proteomic research. Secondary RNA structures have frequently been represented by various modeling methods as graph-theoretic trees. Using graph theory as a modeling tool allows the vast resources of graphical invariants to be utilized to numerically identify secondary RNA motifs. The domination number of a graph is a graphical invariant that is sensitive to even a slight change in the structure of a tree. The invariants selected in this study are variations of the domination number of a graph. These graphical invariants are partitioned into two classes, and we define two parameters based on each of these classes. These parameters are calculated for all small-order trees, and a statistical analysis of the resulting data is conducted to determine whether the values of these parameters can be used to identify which trees of orders seven and eight are RNA-like in structure.

The statistical analysis shows that the domination-based parameters correctly distinguish between the trees that represent native structures and those that are not likely candidates to represent RNA. Some of the trees previously identified as candidate structures are found to be "very" RNA-like, while others are not, thereby refining the space of structures likely to be found as representing secondary RNA structure.

Search algorithms are available that mine nucleotide sequence databases. However, the number of motifs identified can be quite large, making a further search for similar motifs computationally difficult. Much of the work in the bioinformatics arena is directed toward the development of better algorithms to address the computational problem. This work, on the other hand, uses mathematical descriptors to more clearly characterize the RNA motifs and thereby reduce the corresponding search space.
These preliminary findings demonstrate that graph-theoretic quantifiers utilized in fields such as computer network design hold significant promise as an added tool for genomics and proteomics.
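
To make the central invariant concrete: the domination number of a graph is the size of a smallest set of vertices such that every vertex is either in the set or adjacent to a member of it. The sketch below is an illustration only, not the method used in the study. It computes the plain domination number by brute force, which is feasible for trees of order seven or eight. The function name `domination_number` and the two example trees are assumptions introduced here. The variations of the domination number and the two derived parameters mentioned in the abstract are not defined in this text, so they are not reproduced.

```python
from itertools import combinations


def domination_number(n, edges):
    """Return the domination number of a graph on vertices 0..n-1.

    Brute force: try vertex subsets in order of increasing size and
    return the size of the first one that dominates every vertex.
    Only practical for small graphs such as trees of order 7 or 8.
    """
    # Closed neighbourhood of v: v itself plus all of its neighbours.
    closed = [{v} for v in range(n)]
    for u, v in edges:
        closed[u].add(v)
        closed[v].add(u)

    all_vertices = set(range(n))
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            dominated = set().union(*(closed[v] for v in subset))
            if dominated == all_vertices:
                return k
    return 0  # only reached for the empty graph (n == 0)


# Two hypothetical trees of order seven (not taken from the paper's data).
path7 = [(i, i + 1) for i in range(6)]   # path: 0-1-2-3-4-5-6
star7 = [(0, i) for i in range(1, 7)]    # star: centre 0 with six leaves

print(domination_number(7, path7))  # 3
print(domination_number(7, star7))  # 1
```

The two trees have the same order but different domination numbers, which illustrates the sense in which the invariant responds to tree shape. Exhaustive search is exponential in the number of vertices, so a larger study would need a dedicated tree algorithm rather than this sketch.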