A novel pathway-based approach improves lung cancer risk prediction using germline genetic variations.

ABSTRACT

Although genome-wide association studies (GWAS) have identified many genetic variants that are strongly associated with lung cancer, these variants have low penetrance and are poor predictors of lung cancer in individuals. We sought to increase the predictive value of germline variants by considering their cumulative effects in the context of biologic pathways.

For individuals in the Environment and Genetics in Lung Cancer Etiology study (1,815 cases/1,971 controls), we computed pathway-level susceptibility effects as the sum of relevant single-nucleotide polymorphism (SNP) variant alleles weighted by their log-additive effects from a separate lung cancer GWAS meta-analysis (7,766 cases/37,482 controls). Logistic regression models based on age, sex, smoking, genetic variants, and principal components of pathway effects and pathway-smoking interactions were trained and optimized in cross-validation, then tested on an independent dataset (556 cases/830 controls). We assessed prediction performance using the area under the receiver operating characteristic curve (AUC).

Compared with typical binomial prediction models that use epidemiologic predictors alone (AUC = 0.607) or epidemiologic predictors plus top GWAS variants (AUC = 0.617), our pathway-based, smoking-interactive multinomial model significantly improved prediction performance in external validation (AUC = 0.656, P < 0.0001).

Our biologically informed approach showed a larger increase in AUC over its non-genetic counterpart models than previous approaches that incorporate variants. This model is the first of its kind to evaluate lung cancer prediction using subtype-stratified genetic effects organized into pathways and interacted with smoking. We propose pathway-exposure interactions as a potentially powerful new contributor to risk inference.
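
The two quantities the abstract relies on, the pathway-level susceptibility effect (a sum of variant-allele counts weighted by log-additive effects) and the AUC, can be illustrated with a minimal Python sketch. This is not the authors' code: the SNP identifiers, pathway names, effect sizes, and genotypes below are made-up toy values, the function names are hypothetical, and skipping SNPs that lack a genotype or an effect estimate is an assumption of this sketch. The principal-component step and the logistic regression models are not reproduced here.

```python
from typing import Dict, Sequence


def pathway_score(allele_counts: Dict[str, int],
                  log_effects: Dict[str, float],
                  pathway_snps: Sequence[str]) -> float:
    """Sum of variant-allele counts (0, 1 or 2) weighted by per-SNP log-additive effects.

    SNPs missing from either dictionary are skipped (an assumption of this sketch).
    """
    return sum(allele_counts[snp] * log_effects[snp]
               for snp in pathway_snps
               if snp in allele_counts and snp in log_effects)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based AUC: the probability that a randomly chosen case scores higher
    than a randomly chosen control, with ties counted as 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Toy inputs (hypothetical): log odds ratios as they might come from an external GWAS.
log_effects = {"snp_a": 0.10, "snp_b": -0.05, "snp_c": 0.20}
pathways = {"pathway_1": ["snp_a", "snp_b"], "pathway_2": ["snp_c"]}

# Variant-allele counts per individual; labels: 1 = case, 0 = control.
genotypes = [
    {"snp_a": 2, "snp_b": 0, "snp_c": 1},
    {"snp_a": 1, "snp_b": 1, "snp_c": 2},
    {"snp_a": 0, "snp_b": 2, "snp_c": 0},
    {"snp_a": 1, "snp_b": 0, "snp_c": 0},
]
labels = [1, 1, 0, 0]

# One score per individual for a single pathway, then its discrimination on the toy data.
scores_p1 = [pathway_score(g, log_effects, pathways["pathway_1"]) for g in genotypes]
auc_p1 = auc(scores_p1, labels)
print(scores_p1, auc_p1)
```

In a full analysis, each individual's scores across all pathways would form the feature matrix that is reduced to principal components and passed, together with age, sex, smoking, and pathway-smoking interaction terms, to the regression model; the pairwise AUC above is quadratic in sample size and is meant only to make the definition concrete.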