AITopics | Learning in High Dimensional Spaces

Learning in High Dimensional Spaces

High-dimensional spaces frequently occur in mathematics and the sciences. They may be parameter spaces or configuration spaces such as in Lagrangian or Hamiltonian mechanics; these are abstract spaces, independent of the physical space we live in. (Wikipedia)

What is the relationship between Curse of Dimensionality and isotropic neighborhoods?

#artificialintelligence, Aug-9-2020, 19:55:33 GMT

The problem that Hastie, Tibshirani and Friedman are talking about here is that the number of fixed-size neighborhoods goes up exponentially with the dimension. If you're trying to get some intuition for how isotropic neighborhoods are affected by the curse of dimensionality, think about approximating ball-shaped (isotropic) neighborhoods with cube-shaped neighborhoods. Suppose we have a $d$-dimensional unit cube $[0, 1]^d$ that we want to divide up into cube-shaped neighborhoods. If I want neighborhoods of side length $\delta = 0.1$, in one dimension this requires $10^1 = 10$ neighborhoods. In two dimensions, this requires $10^2 = 100$ neighborhoods.
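
The counting argument above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `num_neighborhoods` is hypothetical, and it assumes that the side length divides the unit interval evenly, so that the count is $(1/\delta)^d$.

```python
def num_neighborhoods(side_length: float, dim: int) -> int:
    """Number of cubes with the given side length needed to tile [0, 1]^dim."""
    # Cells along one axis; assumes 1 / side_length is (numerically) an integer.
    per_axis = round(1 / side_length)
    return per_axis ** dim


# The count grows exponentially in the dimension for a fixed side length.
for d in (1, 2, 3, 10):
    print(d, num_neighborhoods(0.1, d))
```

With a fixed number of data points, most of these neighborhoods must be empty once the dimension is moderately large, which is the intuition the answer is pointing at.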

artificial intelligence, machine learning, neighborhood, (4 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.64)

The Lasso with general Gaussian designs with applications to hypothesis testing

Celentano, Michael, Montanari, Andrea, Wei, Yuting

arXiv.org Machine Learning, Jul-27-2020

The Lasso is a method for high-dimensional regression, which is now commonly used when the number of covariates $p$ is of the same order or larger than the number of observations $n$. Classical asymptotic normality theory does not apply to this model for two fundamental reasons: (1) the regularized risk is non-smooth; (2) the distance between the estimator $\bf \widehat{\theta}$ and the true parameter vector $\bf \theta^\star$ cannot be neglected. As a consequence, standard perturbative arguments that are the traditional basis for asymptotic normality fail. On the other hand, the Lasso estimator can be precisely characterized in the regime in which both $n$ and $p$ are large, while $n/p$ is of order one. This characterization was first obtained in the case of standard Gaussian designs, and subsequently generalized to other high-dimensional estimation procedures. Here we extend the same characterization to Gaussian correlated designs with non-singular covariance structure. This characterization is expressed in terms of a simpler "fixed design" model. We establish non-asymptotic bounds on the distance between distributions of various quantities in the two models, which hold uniformly over signals $\bf \theta^\star$ in a suitable sparsity class, and over values of the regularization parameter. As applications, we study the distribution of the debiased Lasso, and show that a degrees-of-freedom correction is necessary for computing valid confidence intervals.

artificial intelligence, confidence interval, machine learning, (20 more...)

arXiv.org Machine Learning

2007.13716

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.45)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.34)

Overcoming the Curse of Dimensionality in Density Estimation with Mixed Sobolev GANs

Ding, Liang, Tuo, Rui, Shahrampour, Shahin

arXiv.org Machine Learning, Jun-5-2020

We propose a novel GAN framework for non-parametric density estimation with high-dimensional data. This framework is based on a novel density estimator, called the hyperbolic cross density estimator, which enjoys nice convergence properties in the mixed Sobolev spaces. As modifications of the usual Sobolev spaces, the mixed Sobolev spaces are more suitable for describing high-dimensional density functions. We prove that, unlike other existing approaches, the proposed GAN framework does not suffer from the curse of dimensionality and can achieve the optimal convergence rate of $O_p(n^{-1/2})$, with $n$ data points in an arbitrary fixed dimension. We also study the universality of GANs in terms of the existence of ReLU networks that can approximate the density functions in the mixed Sobolev spaces up to any accuracy level.

artificial intelligence, estimator, machine learning, (17 more...)

arXiv.org Machine Learning

2006.03696

Country:

North America > United States > Texas > Brazos County > College Station (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.61)

Rdimtools: An R package for Dimension Reduction and Intrinsic Dimension Estimation

You, Kisung

arXiv.org Machine Learning, May-22-2020

Discovering patterns in complex high-dimensional data is a long-standing problem. Dimension Reduction (DR) and Intrinsic Dimension Estimation (IDE) are two fundamental thematic programs that facilitate geometric understanding of the data. We present Rdimtools, an R package that supports 133 DR and 17 IDE algorithms, a breadth that makes multifaceted scrutiny of the data in one place easier. Rdimtools is distributed under the MIT license and is accessible from CRAN, GitHub, and its package website, all of which provide installation instructions, self-contained examples, and API documentation.

algorithm, artificial intelligence, machine learning, (13 more...)

arXiv.org Machine Learning

2005.11107

Country:

North America > United States > New York (0.04)
North America > United States > Indiana > St. Joseph County > Notre Dame (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Fractional norms and quasinorms do not help to overcome the curse of dimensionality

Mirkes, Evgeny M., Allohibi, Jeza, Gorban, Alexander N.

arXiv.org Machine Learning, Apr-29-2020

The curse of dimensionality causes well-known and widely discussed problems for machine learning methods. There is a hypothesis that using the Manhattan distance and even fractional quasinorms lp (for p less than 1) can help to overcome the curse of dimensionality in classification problems. In this study, we systematically test this hypothesis. We confirm that fractional quasinorms have a greater relative contrast or coefficient of variation than the Euclidean norm l2, but we also demonstrate that distance concentration shows qualitatively the same behaviour for all tested norms and quasinorms, and that the difference between them decays as the dimension tends to infinity. Estimation of classification quality for kNN based on different norms and quasinorms shows that a greater relative contrast does not mean better classifier performance, and that the worst performance for different databases was shown by different norms (quasinorms). A systematic comparison shows that the difference in performance of kNN based on lp for p = 2, 1, and 0.5 is statistically insignificant.
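
Distance concentration is easy to observe numerically. The sketch below is not the authors' experimental protocol: it assumes points drawn uniformly from the unit cube, measures distances from the origin, and takes relative contrast to mean $(D_{max} - D_{min}) / D_{min}$. The function names and the sample size are hypothetical choices made for illustration.

```python
import random


def lp_distance(x, y, p):
    """Minkowski l_p distance; for p < 1 this is only a quasinorm."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def relative_contrast(dim, p, n_points=200, seed=0):
    """(D_max - D_min) / D_min over distances from the origin to random points."""
    rng = random.Random(seed)  # fixed seed so every p sees the same points
    origin = [0.0] * dim
    dists = [
        lp_distance([rng.random() for _ in range(dim)], origin, p)
        for _ in range(n_points)
    ]
    return (max(dists) - min(dists)) / min(dists)


# Compare p = 0.5, 1, 2 as the dimension grows.
for dim in (2, 20, 200):
    print(dim, [round(relative_contrast(dim, p), 3) for p in (0.5, 1, 2)])
```

Under these assumptions the contrast should shrink with the dimension for every p, which is the qualitative behaviour the abstract describes; the toy run says nothing about classifier performance.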

archive, database, dimensionality, (16 more...)

arXiv.org Machine Learning

doi: 10.3390/e22101105

2004.1423

Country:

Europe > Russia (0.14)
Asia > Russia (0.14)
Europe > United Kingdom > England > Leicestershire > Leicester (0.05)
(9 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.90)

Large-scale optimal transport map estimation using projection pursuit

Meng, Cheng, Ke, Yuan, Zhang, Jingyi, Zhang, Mengrui, Zhong, Wenxuan, Ma, Ping

Neural Information Processing Systems, Mar-18-2020, 23:48:15 GMT

This paper studies the estimation of large-scale optimal transport maps (OTM), which is a well-known challenging problem owing to the curse of dimensionality. Existing literature approximates the large-scale OTM by a series of one-dimensional OTM problems through iterative random projection. Such methods, however, suffer from slow or no convergence in practice due to the nature of randomly selected projection directions. Instead, we propose an estimation method for large-scale OTM that combines the ideas of projection pursuit regression and sufficient dimension reduction. The proposed method, named projection pursuit Monge map (PPMM), adaptively selects the "most informative" projection direction in each iteration.
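
The one-dimensional subproblem mentioned above has a simple solution, which is why projection-based methods reduce to it. The sketch below is not PPMM itself. It only shows the one-dimensional building block, assuming two samples of equal size and a convex cost such as squared distance, in which case the optimal map is the monotone rearrangement (the i-th smallest source point goes to the i-th smallest target point). The function name `ot_map_1d` is hypothetical.

```python
def ot_map_1d(source, target):
    """Map each source point to a target point by monotone rearrangement."""
    if len(source) != len(target):
        raise ValueError("this sketch assumes samples of equal size")
    # Indices of the source points in increasing order of value.
    order_src = sorted(range(len(source)), key=lambda i: source[i])
    sorted_tgt = sorted(target)
    mapped = [0.0] * len(source)
    for rank, i in enumerate(order_src):
        mapped[i] = sorted_tgt[rank]  # rank-th smallest source -> rank-th smallest target
    return mapped


print(ot_map_1d([3.0, 1.0, 2.0], [10.0, 30.0, 20.0]))
```

A projection-based scheme would apply this map to the data projected onto a chosen direction and then repeat with a new direction; how that direction is chosen is where the methods described above differ.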

large-scale optimal transport map estimation, projection direction, projection pursuit, (4 more...)

Neural Information Processing Systems

Genre: Research Report (0.44)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.69)

Unsupervised Kernel Dimension Reduction

Wang, Meihong, Sha, Fei, Jordan, Michael I.

Neural Information Processing Systems, Feb-15-2020, 03:43:28 GMT

We apply the framework of kernel dimension reduction, originally designed for supervised problems, to unsupervised dimensionality reduction. In this framework, kernel-based measures of independence are used to derive low-dimensional representations that maximally capture information in covariates in order to predict responses. We extend this idea and develop similarly motivated measures for unsupervised problems where covariates and responses are the same. Our empirical studies show that the resulting compact representation yields meaningful and appealing visualization and clustering of data. Furthermore, when used in conjunction with supervised learners for classification, our methods lead to lower classification errors than state-of-the-art methods, especially when embedding data in spaces of very few dimensions.

unsupervised kernel dimension reduction

Neural Information Processing Systems

Genre: Research Report (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.67)

Feature-aware Label Space Dimension Reduction for Multi-label Classification

Chen, Yao-nan, Lin, Hsuan-tien

Neural Information Processing Systems, Feb-14-2020, 22:58:09 GMT

Label space dimension reduction (LSDR) is an efficient and effective paradigm for multi-label classification with many classes. Existing approaches to LSDR, such as compressive sensing and principal label space transformation, exploit only the label part of the dataset, but not the feature part. In this paper, we propose a novel approach to LSDR that considers both the label and the feature parts. The approach, called conditional principal label space transformation, is based on minimizing an upper bound of the popular Hamming loss. The minimization step of the approach can be carried out efficiently by a simple use of singular value decomposition.

feature-aware label space dimension reduction, multi-label classification, principal label space transformation, (3 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.66)

Curse of Dimensionality on Randomized Smoothing for Certifiable Robustness

Kumar, Aounon, Levine, Alexander, Goldstein, Tom, Feizi, Soheil

arXiv.org Machine Learning, Feb-8-2020

Randomized smoothing, using just a simple isotropic Gaussian distribution, has been shown to produce good robustness guarantees against $\ell_2$-norm bounded adversaries. In this work, we show that extending the smoothing technique to defend against other attack models can be challenging, especially in the high-dimensional regime. In particular, for a vast class of i.i.d. smoothing distributions, we prove that the largest $\ell_p$-radius that can be certified decreases as $O(1/d^{\frac{1}{2} - \frac{1}{p}})$ with dimension $d$ for $p > 2$. Notably, for $p \geq 2$, this dependence on $d$ is no better than that of the $\ell_p$-radius that can be certified using isotropic Gaussian smoothing, essentially putting a matching lower bound on the robustness radius. When restricted to generalized Gaussian smoothing, these two bounds can be shown to be within a constant factor of each other in an asymptotic sense, establishing that Gaussian smoothing provides the best possible results, up to a constant factor, when $p \geq 2$. We present experimental results on CIFAR to validate our theory. For other smoothing distributions, such as a uniform distribution within an $\ell_1$-norm or an $\ell_\infty$-norm ball, we show upper bounds of the form $O(1 / d)$ and $O(1 / d^{1 - \frac{1}{p}})$ respectively, which have an even worse dependence on $d$.
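
To get a feel for the rate quoted above, the following sketch evaluates the dimension dependence $d^{-(1/2 - 1/p)}$ with all constants dropped. It is only an illustration of the scaling, not a certified radius; the function name `certified_radius_scaling` is hypothetical, and $d = 3072$ is used because a CIFAR image has $3 \times 32 \times 32$ pixel values.

```python
import math


def certified_radius_scaling(d: int, p: float) -> float:
    """Evaluate d^-(1/2 - 1/p), the dimension dependence with constants dropped."""
    if p < 2:
        raise ValueError("the quoted rate concerns p >= 2 (p = 2 gives no decay)")
    # 1 / math.inf evaluates to 0.0, so p = infinity gives d^(-1/2).
    return d ** -(0.5 - 1.0 / p)


for p in (2, 4, math.inf):
    print(p, certified_radius_scaling(3072, p))
```

The exponent is zero at $p = 2$ and approaches $1/2$ as $p$ grows, so under this scaling the decay with dimension is steepest for $\ell_\infty$.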

certificate, gaussian, randomized smoothing, (14 more...)

arXiv.org Machine Learning

2002.03239

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
(6 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.42)

Overcoming Mode Collapse and the Curse of Dimensionality

#artificialintelligence, Feb-7-2020, 18:52:00 GMT

Machine Learning Lecture at Carnegie Mellon University (CMU) by Ke Li, Ph.D. Candidate at the University of California, Berkeley.

Abstract: In this talk, Li presents his team's work on overcoming two long-standing problems in machine learning and algorithms:

1. Mode collapse in generative adversarial nets (GANs). GANs are perhaps the most popular class of generative models in use today. Unfortunately, they suffer from the well-documented problem of mode collapse, which the many successive variants of GANs have failed to overcome. I will illustrate why mode collapse happens fundamentally and show a simple way to overcome it, which is the basis of a new method known as Implicit Maximum Likelihood Estimation (IMLE).

2. The curse of dimensionality. [...] It turns out that this problem is not insurmountable - I will explain how the curse of dimensionality arises and show a simple way to overcome it, which gives rise to a new family of algorithms known as Dynamic Continuous Indexing (DCI).

Bio: Ke Li is a recent Ph.D. graduate from UC Berkeley, where he was advised by Prof. Jitendra Malik, and will join Google as a Research Scientist and the Institute for Advanced Study (IAS) as a Member hosted by Prof. Sanjeev Arora.

algorithm, dimensionality, overcoming mode collapse, (5 more...)

#artificialintelligence

Country:

North America > United States > California > Alameda County > Berkeley (0.27)
North America > Canada > Ontario > Toronto (0.20)

Industry: Education (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.60)
