G

Garvesh Raskutti

Statistical and Applied Mathematical Sciences Institute

ORCID: 0000-0002-0522-5423

Publishes on Statistical Methods and Inference, Sparse and Compressive Sensing Techniques, Bayesian Modeling and Causal Inference. 100 papers and 3.8k citations.

100Publications
3.8kTotal Citations

Is this you? Claim your profile.

Add your photo, update your bio, and get notified when your ranking changes.

Top publicationsby citations

High-dimensional covariance estimation by minimizing ℓ1-penalized log-determinant divergence
Pradeep Ravikumar, Martin J. Wainwright, Garvesh Raskutti et al.|Electronic Journal of Statistics|2011
Cited by 791Open Access

Given i.i.d. observations of a random vector X∈ℝp, we study the problem of estimating both its covariance matrix Σ*, and its inverse covariance or concentration matrix Θ*=(Σ*)−1. When X is multivariate Gaussian, the non-zero structure of Θ* is specified by the graph of an associated Gaussian Markov random field; and a popular estimator for such sparse Θ* is the ℓ1-regularized Gaussian MLE. This estimator is sensible even for for non-Gaussian X, since it corresponds to minimizing an ℓ1-penalized log-determinant Bregman divergence. We analyze its performance under high-dimensional scaling, in which the number of nodes in the graph p, the number of edges s, and the maximum node degree d, are allowed to grow as a function of the sample size n. In addition to the parameters (p,s,d), our analysis identifies other key quantities that control rates: (a) the ℓ∞-operator norm of the true covariance matrix Σ*; and (b) the ℓ∞ operator norm of the sub-matrix Γ*SS, where S indexes the graph edges, and Γ*=(Θ*)−1⊗(Θ*)−1; and (c) a mutual incoherence or irrepresentability measure on the matrix Γ* and (d) the rate of decay 1/f(n,δ) on the probabilities {|Σ̂nij−Σ*ij|>δ}, where Σ̂n is the sample covariance based on n samples. Our first result establishes consistency of our estimate Θ̂ in the elementwise maximum-norm. This in turn allows us to derive convergence rates in Frobenius and spectral norms, with improvements upon existing results for graphs with maximum node degrees $d=o(\sqrt{s})$. In our second result, we show that with probability converging to one, the estimate Θ̂ correctly specifies the zero pattern of the concentration matrix Θ*. We illustrate our theoretical results via simulations for various graphs and problem parameters, showing good correspondences between the theoretical predictions and behavior in simulations.

Minimax Rates of Estimation for High-Dimensional Linear Regression Over $\ell_q$-Balls
Garvesh Raskutti, Martin J. Wainwright, Bin Yu|IEEE Transactions on Information Theory|2011
Cited by 485

Consider the high-dimensional linear regression model <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">y</i> = <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">X</i> β <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">*</sup> + <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">w</i> , where <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">y</i> ∈ \BBR <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">n</i> is an observation vector, <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">X</i> ∈ \BBR <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">n</i> × <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">d</i> is a design matrix with <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">d</i> >; <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">n</i> , β <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">*</sup> ∈ \BBR <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">d</i> is an unknown regression vector, and <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">w</i> ~ <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">N</i> (0, σ <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">I</i> ) is additive Gaussian noise. This paper studies the minimax rates of convergence for estimating β <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">*</sup> in either <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">l</i> <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sub> -loss and <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">l</i> <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sub> -prediction loss, assuming that β <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">*</sup> belongs to an <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">lq</i> -ball \BBB <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">q</i> ( <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Rq</i> ) for some <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">q</i> ∈ [0,1]. It is shown that under suitable regularity conditions on the design matrix <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">X</i> , the minimax optimal rate in <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">l</i> <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sub> -loss and <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">l</i> <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sub> -prediction loss scales as Θ( <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Rq</i> ([(log <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">d</i> )/( <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">n</i> )]) <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1-q</sup> / <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sub> ). The analysis in this paper reveals that conditions on the design matrix <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">X</i> enter into the rates for <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">l</i> <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sub> -error and <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">l</i> <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sub> -prediction error in complementary ways in the upper and lower bounds. Our proofs of the lower bounds are information theoretic in nature, based on Fano's inequality and results on the metric entropy of the balls \BBB <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">q</i> ( <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Rq</i> ), whereas our proofs of the upper bounds are constructive, involving direct analysis of least squares over <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">lq</i> -balls. For the special case <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">q</i> =0, corresponding to models with an exact sparsity constraint, our results show that although computationally efficient <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">l</i> <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1</sub> -based methods can achieve the minimax rates up to constant factors, they require slightly stronger assumptions on the design matrix <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">X</i> than optimal algorithms involving least-squares over the <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">l</i> <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">0</sub> -ball.

Restricted Eigenvalue Properties for Correlated Gaussian Designs
Cited by 405

Methods based onℓ1-relaxation, such as basis pursuit and the Lasso, are very popular for sparse regression in high dimensions. The conditions for success of these methods are now well-understood: (1) exact recovery in the noiseless setting is possible if and only if the design matrix X satisfies the restricted nullspace property, and (2) the squaredℓ2-error of a Lasso estimate decays at the minimax k log p n optimal rate, where k is the sparsity of the p-dimensional regression problem with additive Gaussian noise, whenever the design satisfies a restricted eigenvalue condition. The key issue is thus to determine when the design matrix X satisfies these desirable properties. Thus far, there have been numerous results showing that the restricted isometry property, which implies both the restricted nullspace and eigenvalue conditions, is satisfied when all entries of X are independent and identically distributed (i.i.d.), or the rows are unitary. This paper proves directly that the restricted nullspace and eigenvalue conditions hold with high probability for quite general classes of Gaussian matrices for which the predictors may be highly dependent, and hence restricted isometry conditions can be violated with high probability. In this way, our results extend the attractive theoretical guarantees onℓ1-relaxations to a much broader class of problems than the case of completely independent or unitary designs.

The Stealth Media? Groups and Targets behind Divisive Issue Campaigns on Facebook
Young Mie Kim, Jordan Hsu, David Neiman et al.|Political Communication|2018
Cited by 266

In light of the foreign interference in the 2016 U.S. elections, the present research asks the question of whether the digital media has become the stealth media for anonymous political campaigns. By utilizing a user-based, real-time, digital ad tracking tool, the present research reverse engineers and tracks the groups (Study 1) and the targets (Study 2) of divisive issue campaigns based on 5 million paid ads on Facebook exposed to 9,519 individuals between September 28, 2016, and November 8, 2016. The findings reveal groups that did not file reports to the Federal Election Commission (FEC)—nonprofits, astroturf/movement groups, and unidentifiable “suspicious” groups, including foreign entities—ran most of the divisive issue campaigns. One out of six suspicious groups later turned out to be Russian groups. The volume of ads sponsored by non-FEC groups was 4 times larger than that of FEC groups. Divisive issue campaigns clearly targeted battleground states, including Pennsylvania and Wisconsin where traditional Democratic strongholds supported Donald Trump by a razor-thin margin. The present research asserts that media ecology, the technological features and capacity of digital media, as well as regulatory loopholes created by Citizens United v. FEC and the FEC’s disclaimer exemption for digital platforms contribute to the prevalence of anonymous groups’ divisive issue campaigns on digital media. The present research offers insight relevant for regulatory policy discussion and discusses the normative implications of the findings for the functioning of democracy.

Minimax-optimal rates for sparse additive models over kernel classes via convex programming
Cited by 220

Sparse additive models are families of d-variate functions with the additive decomposition f ∗ = ∑ j∈S f ∗ j, where S is an unknown subset of cardinality s ≪ d. In this paper, we consider the case where each univariate component function f ∗ j lies in a reproducing kernel Hilbert space (RKHS), and analyze a method for estimating the unknown function f ∗ based on kernels combined with ℓ1-type convex regularization. Working within a high-dimensional framework that allows both the dimension d and sparsity s to increase with n, we derive convergence rates in the L2 (P) and L2 (Pn) norms over the classF d,s,H of sparse additive models with each univariate function f ∗ j in the unit ball of a univariate RKHS with bounded kernel function. We complement our upper bounds by deriving minimax lower bounds on the L2 (P) error, thereby showing the optimality of our method. Thus, we obtain optimal minimax rates for many interesting classes of sparse additive models, including polynomials, splines, and Sobolev classes. We also show that if, in contrast to our univariate conditions, the d-variate function class is assumed to be globally bounded, then much faster estimation rates are possible for any sparsity s=Ω ( √ n), showing that global boundedness is a significant restriction in the high-dimensional setting.