Indexed on: 21 Oct '05 | Published on: 21 Oct '05 | Published in: BioSystems
Plant promoters have not yet been thoroughly analyzed in terms of their structural and sequence-dependent properties such as curvature, periodicity and information content; the present study is an attempt in that direction. Results were compared with E. coli and yeast data to gain insight into promoter organization. Promoters with a TATA box (TATA(+)) and those lacking one (TATA(-)) were also analyzed separately. Plant promoters differed markedly from those of E. coli and yeast in all of these properties. A bias toward A+T was observed in the promoters of all three groups. Compared with E. coli and yeast, plant promoters showed intermediate values for both A+T content and curvature. Curvature was more pronounced in core promoters than in non-promoter sequences. Information-theoretic analysis of plant promoters reveals high information content at certain consensus regions, such as -30 (TATA box) and the +1 transcription start site (TSS), and moderate values at other positions. This was taken into account while developing weight matrices. For certain threshold values, these weight matrices could pick up all true positives and greatly reduce false positives in a test set. A new multi-parameter prediction strategy is proposed that combines sequence composition, curvature and position weight matrices to identify plant promoters. This strategy was tested and validated with experimentally known promoter sequences. Our study is novel in using in silico approaches to study the sequence-dependent properties of plant RNA Pol-II promoters and to predict them, and it is important because there is no dedicated promoter search tool for plants.
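The information-content and weight-matrix steps mentioned above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a uniform background (0.25 per base), a pseudocount of 1, and pre-aligned sites of equal length containing only A, C, G and T. The function names are hypothetical, chosen for illustration only.

```python
import math

BASES = "ACGT"

def position_counts(seqs):
    """Count each base at each column of equal-length, pre-aligned sites."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    counts = [{b: 0 for b in BASES} for _ in range(length)]
    for s in seqs:
        for i, ch in enumerate(s.upper()):
            counts[i][ch] += 1  # raises KeyError on non-ACGT symbols
    return counts

def information_content(counts):
    """Bits per column: 2 minus Shannon entropy (assumes uniform background)."""
    ic = []
    for col in counts:
        total = sum(col.values())
        entropy = 0.0
        for b in BASES:
            p = col[b] / total
            if p > 0:
                entropy -= p * math.log2(p)
        ic.append(2.0 - entropy)
    return ic

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Position weight matrix of log2(frequency / background) scores."""
    pwm = []
    for col in counts:
        total = sum(col.values()) + 4 * pseudocount
        pwm.append({
            b: math.log2(((col[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def best_hit(pwm, seq):
    """Slide the matrix along seq; return (score, start) of the best window."""
    seq = seq.upper()
    width = len(pwm)
    best = None
    for start in range(len(seq) - width + 1):
        score = sum(pwm[i][seq[start + i]] for i in range(width))
        if best is None or score > best[0]:
            best = (score, start)
    return best
```

As a usage example with made-up TATA-like sites (not data from the study), `position_counts(["TATAAA", "TATATA", "TATAAA", "TATAAG"])` can be passed to `information_content` to see which columns are most conserved, and to `log_odds_matrix` to build a matrix that `best_hit` then scans across a longer sequence. Comparing the returned score against a chosen threshold is the step that would trade off true positives against false positives.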