An adaptive analysis of covariance using tree-structured regression

Research paper by G. L. Gadbury, H. K. Iyer, H. T. Schreuder

Indexed on: 01 Mar '02
Published on: 01 Mar '02
Published in: Journal of Agricultural, Biological, and Environmental Statistics


In this article, we propose an adaptive procedure for testing for the effect of a factor of interest in the presence of one or more confounding variables in observational studies. It is especially relevant for applications where the factor of interest has a suspected causal relationship with a response. The procedure is not tied to linear modeling or normal distribution theory, so it offers an alternative to traditional methods that rely on those assumptions. It is suitable for applications where the factor of interest is categorical and the response is continuous; confounding variables may be continuous or categorical. The method comprises two steps performed in sequence. First, the confounding variables alone (i.e., without the factor of interest) are used to group observations into subsets, chosen so that within each subset little or no variation in the response remains attributable to the confounding variables. Second, we test for the effect of the factor of interest within each subset of observations. We propose implementing the first step with tree-structured regression and carrying out the second step with a nonparametric permutation procedure. The proposed method is illustrated through an analysis of a U.S. Department of Agriculture (USDA) Forest Service data set and an air pollution data set.
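The two-step idea described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify the tree-growing rules or the test statistic, so the greedy variance-reduction splitter, the absolute difference in means, the two-level factor, the continuous-only confounders, the synthetic data, and all names (`grow_leaves`, `perm_pvalue`, `rows`) are assumptions made for illustration.

```python
import random

def sse(ys):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def grow_leaves(rows, idx, depth=0, max_depth=2, min_leaf=8):
    """Step 1 (assumed form): split on confounders only, ignoring the factor.

    rows: list of dicts with "x" (tuple of continuous confounders),
          "g" (factor level, 0 or 1) and "y" (continuous response).
    Returns a list of leaves, each a list of row indices.
    """
    best = None
    if depth < max_depth:
        parent = sse([rows[i]["y"] for i in idx])
        for j in range(len(rows[idx[0]]["x"])):
            for t in sorted({rows[i]["x"][j] for i in idx}):
                left = [i for i in idx if rows[i]["x"][j] <= t]
                right = [i for i in idx if rows[i]["x"][j] > t]
                if len(left) < min_leaf or len(right) < min_leaf:
                    continue
                gain = (parent - sse([rows[i]["y"] for i in left])
                        - sse([rows[i]["y"] for i in right]))
                if best is None or gain > best[0]:
                    best = (gain, left, right)
    if best is None or best[0] <= 0:
        return [idx]  # no admissible split: this node is a leaf
    return (grow_leaves(rows, best[1], depth + 1, max_depth, min_leaf)
            + grow_leaves(rows, best[2], depth + 1, max_depth, min_leaf))

def perm_pvalue(ys, gs, n_perm=2000, rng=None):
    """Step 2 (assumed statistic): permutation test within one subset.

    Uses the absolute difference in mean response between factor levels 0 and 1.
    Returns None when the subset contains only one factor level.
    """
    if len(set(gs)) < 2:
        return None
    rng = rng or random.Random(0)

    def stat(labels):
        a = [y for y, g in zip(ys, labels) if g == 0]
        b = [y for y, g in zip(ys, labels) if g == 1]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = stat(gs)
    labels = list(gs)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # reassign factor labels within the subset
        if stat(labels) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Synthetic data (invented for illustration): the confounder x shifts the
# response and also changes how likely each factor level is.
data_rng = random.Random(1)
rows = []
for _ in range(200):
    x = data_rng.random()
    g = 1 if data_rng.random() < (0.7 if x > 0.5 else 0.3) else 0
    y = 5.0 * (x > 0.5) + 2.0 * g + data_rng.gauss(0, 1)
    rows.append({"x": (x,), "g": g, "y": y})

leaves = grow_leaves(rows, list(range(len(rows))), max_depth=1)
pvals = [perm_pvalue([rows[i]["y"] for i in leaf],
                     [rows[i]["g"] for i in leaf]) for leaf in leaves]
for leaf, p in zip(leaves, pvals):
    print(f"subset size={len(leaf)}  permutation p-value={p}")
```

Permuting the factor labels only within a subset keeps the confounder distribution of each subset fixed, which is the point of grouping first. How the per-subset results are combined into an overall conclusion is not stated in the abstract, so the sketch simply reports one p-value per subset.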