Indexed on: 08 Nov '16Published on: 08 Nov '16Published in: arXiv - Statistics - Methodology
Graphical models are ubiquitous tools to describe the interdependence between variables measured simultaneously such as large-scale gene or protein expression data. Gaussian graphical models (GGMs) are well-established tools for probabilistic exploration of dependence structures using precision matrices and they are generated under a multivariate normal joint distribution. However, they suffer from several shortcomings since they are based on Gaussian distribution assumptions. In this article, we propose a Bayesian quantile based approach for sparse estimation of graphs. We demonstrate that the resulting graph estimation is robust to outliers and applicable under general distributional assumptions. Furthermore, we develop efficient variational Bayes approximations to scale the methods for large data sets. Our methods are applied to a novel cancer proteomics data dataset wherein multiple proteomic antibodies are simultaneously assessed on tumor samples using reverse-phase protein arrays (RPPA) technology. The measured protein expressions have a wide dynamic range resulting in significant non-Gaussian behaviors. Our analysis reveals several biologically relevant protein-protein connections which are missed by existing approaches.