Indexed on: 31 Jan '14Published on: 31 Jan '14Published in: Statistics - Methodology
We introduce an algorithm which, in the context of nonlinear regression on vector-valued explanatory variables, chooses those combinations of vector components that provide best prediction. The algorithm devotes particular attention to components that might be of relatively little predictive value by themselves, and so might be ignored by more conventional methodology for model choice, but which, in combination with other difficult-to-find components, can be particularly beneficial for prediction. Additionally the algorithm avoids choosing vector components that become redundant once appropriate combinations of other, more relevant components are selected. It is suitable for very high dimensional problems, where it keeps computational labour in check by using a novel sequential argument, and also for more conventional prediction problems, where dimension is relatively low. We explore properties of the algorithm using both theoretical and numerical arguments.