Indexed on: 06 Apr '18 | Published on: 06 Apr '18 | Published in: arXiv - Mathematics - Statistics
High-dimensional linear regression with interaction effects is broadly applied in research fields such as bioinformatics and social science. In this paper, we first investigate the minimax rate of convergence for regression estimation in high-dimensional sparse linear models with two-way interactions. We derive matching upper and lower bounds under three types of heredity conditions: strong heredity, weak heredity and no heredity. The results show that: (i) a stronger heredity condition may or may not drastically improve the minimax rate of convergence; in fact, in some situations the minimax rates of convergence are the same under all three heredity conditions; (ii) the minimax rate of convergence is determined by the maximum of the total price of estimating the main effects and that of estimating the interaction effects, which goes beyond purely comparing the order of the number of non-zero main effects $r_1$ and the number of non-zero interaction effects $r_2$; (iii) under any of the three heredity conditions, the estimation of the interaction terms may be the dominant part in determining the rate of convergence, for two different reasons: 1) there exist more interaction terms than main effect terms, or 2) a large ambient dimension makes it more challenging to estimate even a small number of interaction terms. Second, we construct an adaptive estimator that achieves the minimax rate of convergence regardless of the true heredity condition and the sparsity indices $r_1, r_2$.
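
To make the three heredity conditions concrete, the following is a minimal sketch that counts how many two-way interaction terms are allowed to be non-zero under each condition. It assumes the usual conventions, which the abstract does not spell out: strong heredity permits an interaction only when both parent main effects are non-zero, weak heredity when at least one is, and no heredity places no restriction. The function name `candidate_interactions` and the symbol `p` for the ambient number of main effects are illustrative assumptions, not notation from the paper, and the counts are plain combinatorics rather than the paper's minimax rates.

```python
from math import comb


def candidate_interactions(p: int, r1: int, heredity: str) -> int:
    """Count pairs (j, k), j < k, whose interaction may be non-zero.

    p        -- total number of main effects (ambient dimension; assumed name)
    r1       -- number of non-zero main effects
    heredity -- "strong", "weak", or "none" (assumed standard definitions)
    """
    if not 0 <= r1 <= p:
        raise ValueError("r1 must satisfy 0 <= r1 <= p")
    if heredity == "strong":
        # Both parents must be among the r1 non-zero main effects.
        return comb(r1, 2)
    if heredity == "weak":
        # At least one parent is non-zero: all pairs minus pairs with no active parent.
        return comb(p, 2) - comb(p - r1, 2)
    if heredity == "none":
        # Any pair of variables may interact.
        return comb(p, 2)
    raise ValueError(f"unknown heredity condition: {heredity!r}")


if __name__ == "__main__":
    for h in ("strong", "weak", "none"):
        print(h, candidate_interactions(1000, 5, h))
```

With `p = 1000` and `r1 = 5`, the candidate set grows from 10 pairs under strong heredity to 4985 under weak heredity and 499500 under no heredity, which gives some intuition for why a large ambient dimension can make even a few interaction terms costly to estimate.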