Fourier analysis perspective for sufficient dimension reduction problem

Takhanov, Rustem

arXiv.org Machine Learning 

A theory of sufficient dimension reduction (SDR) is developed from an optimizational perspective. In our formulation of the problem, instead of dealing with raw data, we assume that our ground truth includes a mapping ${\mathbf f}: {\mathbb R}^n\rightarrow {\mathbb R}^m$ and a probability distribution function $p$ over ${\mathbb R}^n$, both given analytically. We formulate SDR as a problem of finding a function ${\mathbf g}: {\mathbb R}^k\rightarrow {\mathbb R}^m$ and a matrix $P\in {\mathbb R}^{k\times n}$ such that ${\mathbb E}_{{\mathbf x}\sim p({\mathbf x})} \left|{\mathbf f}({\mathbf x}) - {\mathbf g}(P{\mathbf x})\right|^2$ is minimal. It turns out that the latter problem allows a reformulation in the dual space, i.e. instead of searching for ${\mathbf g}(P{\mathbf x})$ we suggest searching for its Fourier transform. First, we characterize all tempered distributions that can serve as the Fourier transform of such functions. The reformulation in the dual space can be interpreted as a problem of finding a $k$-dimensional linear subspace $S$ and a tempered distribution ${\mathbf t}$ supported in $S$ such that ${\mathbf t}$ is "close" in a certain sense to the Fourier transform of ${\mathbf f}$. Instead of optimizing over generalized functions with a $k$-dimensional support, we suggest minimizing over ordinary functions but with an additional term $R$ that penalizes a strong distortion of the support from any $k$-dimensional linear subspace. For a specific case of $R$, we develop an algorithm that can be formulated for functions given in the initial form as well as for their Fourier transforms. Eventually, we report results of numerical experiments with a discretized version of the latter algorithm.