Rare variant association testing for next-generation sequencing data via hierarchical clustering.

Research paper by Ioanna I Tachmazidou, Andrew A Morris, Eleftheria E Zeggini

Indexed on: 01 Jan '12
Published on: 01 Jan '12
Published in: Human heredity


A proportion of the genetic susceptibility to complex diseases is thought to be due to low-frequency and rare variants. Next-generation sequencing in large populations facilitates the detection of rare variant associations with disease risk. To achieve adequate power to detect association at low-frequency and rare variants, locus-specific statistical methods are being developed that combine information across variants within a functional unit and test for association with this enriched signal through so-called burden tests.

We propose a hierarchical clustering approach and a similarity kernel-based association test for continuous phenotypes. This method clusters individuals into groups, within which samples are assumed to be genetically similar, and subsequently tests the group effects among the different clusters.

The power of this approach is comparable to that of collapsing methods when causal variants have the same direction of effect, but it is significantly higher than that of burden tests when both protective and risk variants are present in the region of interest. Overall, we observe that the Sequence Kernel Association Test (SKAT) is the most powerful approach under the allelic architectures considered.

In our overall comparison, we find that the analytical framework within which SKAT operates yields higher power and controls type I error appropriately.
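The contrast between a burden score and a cluster-then-test strategy can be illustrated with a small toy sketch. This is not the authors' implementation: the genotypes and phenotypes are invented for illustration, Hamming distance with average linkage is an assumed choice of clustering, and a one-way ANOVA F statistic stands in for the similarity kernel-based test described in the abstract. All function names are hypothetical.

```python
from itertools import combinations

def hamming(a, b):
    """Number of variant sites at which two genotype vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def agglomerative_clusters(genotypes, k):
    """Average-linkage agglomerative clustering, stopped at k clusters.
    Returns a list of clusters, each a list of individual indices."""
    clusters = [[i] for i in range(len(genotypes))]

    def avg_dist(pair):
        a, b = pair
        total = sum(hamming(genotypes[i], genotypes[j])
                    for i in clusters[a] for j in clusters[b])
        return total / (len(clusters[a]) * len(clusters[b]))

    while len(clusters) > k:
        # combinations() yields a < b, so deleting index b is safe.
        a, b = min(combinations(range(len(clusters)), 2), key=avg_dist)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def anova_f(groups):
    """One-way ANOVA F statistic for a list of phenotype groups
    (a stand-in for the kernel-based test, not the paper's statistic)."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return (between / (k - 1)) / (within / (n - k))

# Toy data (invented): variant 0 raises the trait, variant 1 lowers it.
noise = [0.1, -0.1, 0.2, -0.2]
genotypes = [[1, 0]] * 4 + [[0, 1]] * 4 + [[0, 0]] * 4
phenotype = ([1 + e for e in noise] + [-1 + e for e in noise]
             + [0 + e for e in noise])

# Burden-style score: count of rare alleles carried, ignoring direction.
burden = [sum(g) for g in genotypes]
r_burden = pearson(burden, phenotype)

# Cluster individuals by genotype similarity, then test group effects.
clusters = agglomerative_clusters(genotypes, k=3)
f_clusters = anova_f([[phenotype[i] for i in c] for c in clusters])

print("burden correlation:", r_burden)
print("cluster F statistic:", f_clusters)
```

In this constructed example the risk and protective effects cancel in the burden score, so its correlation with the phenotype is essentially zero, while the between-cluster F statistic is large because the clusters separate the two carrier groups. The number of clusters (here 3) is fixed by hand; how to choose it in practice is outside the scope of this sketch.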