Indexed on: 06 Feb '14Published on: 06 Feb '14Published in: PloS one
It has become increasingly clear that the current taxonomy of clinical phenotypes is mixed with molecular heterogeneity, which potentially affects the treatment effect for involved patients. Defining the hidden molecular-distinct diseases using modern large-scale genomic approaches is therefore useful for refining clinical practice and improving intervention strategies. Given that microRNA expression profiling has provided a powerful way to dissect hidden genetic heterogeneity for complex diseases, the aim of the study was to develop a bioinformatics approach that identifies microRNA features leading to the hidden subtyping of complex clinical phenotypes. The basic strategy of the proposed method was to identify optimal miRNA clusters by iteratively partitioning the sample and feature space using the two-ways super-paramagnetic clustering technique. We evaluated the obtained optimal miRNA cluster by determining the consistency of co-expression and the chromosome location among the within-cluster microRNAs, and concluded that the optimal miRNA cluster could lead to a natural partition of disease samples. We applied the proposed method to a publicly available microarray dataset of breast cancer patients that have notoriously heterogeneous phenotypes. We obtained a feature subset of 13 microRNAs that could classify the 71 breast cancer patients into five subtypes with significantly different five-year overall survival rates (45%, 82.4%, 70.6%, 100% and 60% respectively; p = 0.008). By building a multivariate Cox proportional-hazards prediction model for the feature subset, we identified has-miR-146b as one of the most significant predictor (p = 0.045; hazard ratios = 0.39). The proposed algorithm is a promising computational strategy for dissecting hidden genetic heterogeneity for complex diseases, and will be of value for improving cancer diagnosis and treatment.