Total Variation Classes Beyond 1d: Minimax Rates, and the Limitations of Linear Smoothers

Sadhanala, Veeranjaneyulu, Wang, Yu-Xiang, Tibshirani, Ryan J.

Neural Information Processing Systems 

We consider the problem of estimating a function defined over $n$ locations on a $d$-dimensional grid (having all side lengths equal to $n^{1/d}$). Total variation (TV) denoising, also known as the fused lasso, is seen to be minimax rate optimal over the class of functions with bounded variation. Several simpler estimators exist, such as Laplacian smoothing and Laplacian eigenmaps. A natural question is: can these simpler estimators perform just as well? We prove that they cannot: these estimators, and more broadly all estimators given by linear transformations of the input data, are suboptimal over the class of functions with bounded variation.
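To make the contrast concrete, here is a minimal sketch (not from the paper) of Laplacian smoothing on a 2d grid, the canonical linear smoother discussed above: it solves $\min_\theta \|y - \theta\|_2^2 + \lambda\, \theta^\top L \theta$, whose solution $\hat\theta = (I + \lambda L)^{-1} y$ is a fixed linear transformation of the data $y$. The function names and the choice of SciPy sparse routines are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.sparse import identity, kron, diags
from scipy.sparse.linalg import spsolve

def grid_laplacian_1d(m):
    # Graph Laplacian of a path graph on m nodes (second differences)
    return diags(
        [[-1.0] * (m - 1), [1.0] + [2.0] * (m - 2) + [1.0], [-1.0] * (m - 1)],
        [-1, 0, 1],
    )

def grid_laplacian_2d(m):
    # Laplacian of the m x m grid graph, built as a Kronecker sum
    L1 = grid_laplacian_1d(m)
    I = identity(m)
    return kron(L1, I) + kron(I, L1)

def laplacian_smooth(y, lam):
    # Linear smoother: theta_hat = (I + lam * L)^{-1} y,
    # where y is a flattened m x m grid signal (n = m^2)
    n = y.size
    m = int(round(np.sqrt(n)))
    L = grid_laplacian_2d(m)
    return spsolve((identity(n) + lam * L).tocsc(), y)

# Example: noisy piecewise-constant image on a 16 x 16 grid
rng = np.random.default_rng(0)
m = 16
truth = np.zeros((m, m))
truth[: m // 2, :] = 1.0  # a single sharp edge
y = (truth + 0.5 * rng.standard_normal((m, m))).ravel()
theta_hat = laplacian_smooth(y, lam=5.0)
```

Because $\hat\theta$ depends on $y$ only through the fixed matrix $(I + \lambda L)^{-1}$, no choice of $\lambda$ lets this estimator adapt to sharp edges the way TV denoising can; this is the kind of linear smoother the paper proves suboptimal.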