The Dantzig selector and sparsity oracle inequalities

Research paper by Vladimir Koltchinskii

Indexed on: 04 Sep '09
Published on: 04 Sep '09
Published in: Mathematics - Statistics


Let \[Y_j=f_*(X_j)+\xi_j,\qquad j=1,\dots,n,\] where $X,X_1,\dots,X_n$ are i.i.d. random variables in a measurable space $(S,\mathcal{A})$ with distribution $\Pi$ and $\xi,\xi_1,\dots,\xi_n$ are i.i.d. random variables with ${\mathbb{E}}\xi=0$, independent of $(X_1,\dots,X_n).$ Given a dictionary $h_1,\dots,h_N:S\mapsto{\mathbb{R}},$ let $f_{\lambda}:=\sum_{j=1}^N\lambda_jh_j$, $\lambda=(\lambda_1,\dots,\lambda_N)\in{\mathbb{R}}^N.$ Given $\varepsilon>0,$ define \[\hat{\Lambda}_{\varepsilon}:=\Biggl\{\lambda\in{\mathbb{R}}^N:\max_{1\leq k\leq N}\Biggl|n^{-1}\sum_{j=1}^n\bigl(f_{\lambda}(X_j)-Y_j\bigr)h_k(X_j)\Biggr|\leq\varepsilon \Biggr\}\] and \[\hat{\lambda}:=\hat{\lambda}^{\varepsilon}\in \operatorname*{Arg\,min}_{\lambda\in\hat{\Lambda}_{\varepsilon}}\|\lambda\|_{\ell_1}.\] In the case where $f_*:=f_{\lambda^*},\lambda^*\in {\mathbb{R}}^N,$ Candes and Tao [Ann. Statist. 35 (2007) 2313--2351] suggested using $\hat{\lambda}$ as an estimator of $\lambda^*.$ They called this estimator ``the Dantzig selector''. We study the properties of $f_{\hat{\lambda}}$ as an estimator of $f_*$ for regression models with random design, extending some of the results of Candes and Tao (and providing alternative proofs of these results).
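
The definition of $\hat{\lambda}$ above is an $\ell_1$ minimization over a set cut out by $2N$ linear inequalities, so it can be recast as a linear program by writing $\lambda=u-v$ with $u,v\geq 0$. The following Python sketch illustrates that reformulation; it is not taken from the paper. It assumes the dictionary has already been evaluated on the sample into an $n\times N$ array `H` with `H[j, k]` equal to $h_k(X_j)$, and the function name `dantzig_selector` is a hypothetical choice made for this illustration. The solver is SciPy's general-purpose `linprog`, used here for convenience only.

```python
import numpy as np
from scipy.optimize import linprog


def dantzig_selector(H, Y, eps):
    """Sketch: minimize ||lam||_1 subject to
    max_k | n^{-1} sum_j (f_lam(X_j) - Y_j) h_k(X_j) | <= eps.

    H   : (n, N) array, H[j, k] = h_k(X_j)  (assumed precomputed)
    Y   : (n,) array of responses
    eps : positive tolerance (the epsilon in the definition above)
    """
    n, N = H.shape
    G = H.T @ H / n          # empirical Gram matrix of the dictionary
    b = H.T @ Y / n          # empirical correlations of h_k with Y

    # Split lam = u - v with u, v >= 0; then ||lam||_1 <= sum(u) + sum(v),
    # with equality at an optimal solution of the linear program.
    c = np.ones(2 * N)

    # |G (u - v) - b| <= eps componentwise, written as two inequality blocks.
    A_ub = np.vstack([np.hstack([G, -G]),
                      np.hstack([-G, G])])
    b_ub = np.concatenate([eps + b, eps - b])

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:N] - res.x[N:]
```

The constraint set is never empty, since any least-squares solution makes the left-hand side of the constraint vanish, so the program is feasible for every $\varepsilon>0$. Calling `dantzig_selector(H, Y, eps)` returns a vector that plays the role of $\hat{\lambda}$, and `H @ lam_hat` gives the values of $f_{\hat{\lambda}}$ at the sample points.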