AITopics | Learning in High Dimensional Spaces

Collaborating Authors

Learning in High Dimensional Spaces

High-dimensional spaces frequently occur in mathematics and the sciences. They may be parameter spaces or configuration spaces such as in Lagrangian or Hamiltonian mechanics; these are abstract spaces, independent of the physical space we live in. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Co-regularized Multi-view Sparse Reconstruction Embedding for Dimension Reduction

Wang, Huibing, Peng, Jinjia, Fu, Xianping

arXiv.org Artificial IntelligenceApr-1-2019

With the development of information technology, we have witnessed an age of data explosion which produces a large variety of data filled with redundant information. Because dimension reduction is an essential tool which embeds high-dimensional data into a lower-dimensional subspace to avoid redundant information, it has attracted interests from researchers all over the world. However, facing with features from multiple views, it's difficult for most dimension reduction methods to fully comprehended multi-view features and integrate compatible and complementary information from these features to construct low-dimensional subspace directly. Furthermore, most multi-view dimension reduction methods cannot handle features from nonlinear spaces with high dimensions. Therefore, how to construct a multi-view dimension reduction methods which can deal with multi-view features from high-dimensional nonlinear space is of vital importance but challenging. In order to address this problem, we proposed a novel method named Co-regularized Multi-view Sparse Reconstruction Embedding (CMSRE) in this paper. By exploiting correlations of sparse reconstruction from multiple views, CMSRE is able to learn local sparse structures of nonlinear manifolds from multiple views and constructs significative low-dimensional representations for them. Due to the proposed co-regularized scheme, correlations of sparse reconstructions from multiple views are preserved by CMSRE as much as possible. Furthermore, sparse representation produces more meaningful correlations between features from each single view, which helps CMSRE to gain better performances. Various evaluations based on the applications of document classification, face recognition and image retrieval can demonstrate the effectiveness of the proposed approach on multi-view dimension reduction.

artificial intelligence, low-dimensional representation, machine learning, (14 more...)

arXiv.org Artificial Intelligence

1904.08499

Country:

Asia > China > Liaoning Province > Dalian (0.04)
Europe > Portugal > Braga > Braga (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (1.00)

Add feedback

Dimension reduction as an optimization problem over a set of generalized functions

Takhanov, Rustem

arXiv.org Machine LearningMar-12-2019

Classical dimension reduction problem can be loosely formulated as a problem of finding a $k$-dimensional affine subspace of ${\mathbb R}^n$ onto which data points ${\mathbf x}_1,\cdots, {\mathbf x}_N$ can be projected without loss of valuable information. We reformulate this problem in the language of tempered distributions, i.e. as a problem of approximating an empirical probability density function $p_{\rm{emp}}({\mathbf x}) = \frac{1}{N} \sum_{i=1}^N \delta^n (\bold{x} - \bold{x}_i)$, where $\delta^n$ is an $n$-dimensional Dirac delta function, by another tempered distribution $q({\mathbf x})$ whose density is supported in some $k$-dimensional subspace. Thus, our problem is reduced to the minimization of a certain loss function $I(q)$ measuring the distance from $q$ to $p_{\rm{emp}}$ over a pertinent set of generalized functions, denoted $\mathcal{G}_k$. Another classical problem of data analysis is the sufficient dimension reduction problem. We show that it can be reduced to the following problem: given a function $f: {\mathbb R}^n\rightarrow {\mathbb R}$ and a probability density function $p({\mathbf x})$, find a function of the form $g({\mathbf w}^T_1{\mathbf x}, \cdots, {\mathbf w}^T_k{\mathbf x})$ that minimizes the loss ${\mathbb E}_{{\mathbf x}\sim p} |f({\mathbf x})-g({\mathbf w}^T_1{\mathbf x}, \cdots, {\mathbf w}^T_k{\mathbf x})|^2$. We first show that search spaces of the latter two problems are in one-to-one correspondence which is defined by the Fourier transform. We introduce a nonnegative penalty function $R(f)$ and a set of ordinary functions $\Omega_\epsilon = \{f| R(f)\leq \epsilon\}$ in such a way that $\Omega_\epsilon$ `approximates' the space $\mathcal{G}_k$ when $\epsilon \rightarrow 0$. Then we present an algorithm for minimization of $I(f)+\lambda R(f)$, based on the idea of two-step iterative computation.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1903.05083

Country:

North America > United States > New York (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Asia > Kazakhstan > Akmola Region > Astana (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.82)

Add feedback

Exploring the Curse of Dimensionality - Part II. - Dr. Juan Camilo Orduz

#artificialintelligenceJan-1-2019, 15:28:36 GMT

I continue exploring the curse of dimensionality. Following the analysis form Part I., I want to discuss another consequence of sparse sampling in high dimensions: sample points are close to an edge of the sample. This post is based on The Elements of Statistical Learning, Section 2.5, which I encourage to read! Consider $N$ data points uniformly distributed in a $p$-dimensional unit ball centered at the origin. Suppose we consider a nearest-neighbor estimate at the origin.

artificial intelligence, geq alpha, machine learning, (14 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.61)

Add feedback

Exploring the Curse of Dimensionality - Part I. - Dr. Juan Camilo Orduz

#artificialintelligenceJan-1-2019, 15:28:34 GMT

In this post I want to present the notion of curse of dimensionality following a suggested excercise (Chapter 4 - Ex. 4) of the book An Introduction to Statistical Learning, writen by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani. When the number of features $p$ is large, there tends to be a deterioration in the performance of KNN and other local approaches that perform prediction using only observations that are near the test observation for which a prediction must be made. This phenomenon is known as the curse of dimensionality, and it ties into the fact that non-parametric approaches often perform poorly when $p$ is large. We will now investigate this curse. Let us prepare the notebook.

artificial intelligence, juan camilo orduz, machine learning, (8 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.57)

Add feedback

Escaping the Curse of Dimensionality in Similarity Learning: Efficient Frank-Wolfe Algorithm and Generalization Bounds

Liu, Kuan, Bellet, Aurélien

arXiv.org Machine LearningOct-31-2018

High-dimensional and sparse data are commonly encountered in many applications of machine learning, such as computer vision, bioinformatics, text mining and behavioral targeting. To classify, cluster or rank data points, it is important to be able to compute semantically meaningful similarities between them. However, defining an appropriate similarity measure for a given task is often difficult as only a small and unknown subset of all features are actually relevant. For instance, in drug discovery studies, chemical compounds are typically represented by a large number of sparse features describing their 2D and 3D properties, and only a few of them play in role in determining whether the compound will bind to a particular target receptor (Leach and Gillet, 2007). In text classification and clustering, a document is often represented as a sparse bag of words, and only a small subset of the dictionary is generally useful to discriminate between documents about different topics. Another example is targeted advertising, where ads are selected based on fine-grained user history (Chen et al., 2009). Similarity and metric learning (Bellet et al., 2015) offers principled approaches to construct a taskspecific similarity measure by learning it from weakly supervised data, and has been used in many application domains. The main theme in these methods is to learn the parameters of a similarity (or distance) function such that it agrees with task-specific similarity judgments (e.g., of the form "data point x should

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Machine Learning

1807.07789

Country: North America > United States (1.00)

Genre: Research Report (0.63)

Industry:

Government > Military (0.93)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.64)
(2 more...)

Add feedback

The unreasonable effectiveness of small neural ensembles in high-dimensional brain

Gorban, A. N., Makarov, V. A., Tyukin, I. Y.

arXiv.org Artificial IntelligenceSep-20-2018

Despite the widely-spread consensus on the brain complexity, sprouts of the single neuron revolution emerged in neuroscience in the 1970s. They brought many unexpected discoveries, including grandmother or concept cells and sparse coding of information in the brain. In machine learning for a long time, the famous curse of dimensionality seemed to be an unsolvable problem. Nevertheless, the idea of the blessing of dimensionality becomes gradually more and more popular. Ensembles of non-interacting or weakly interacting simple units prove to be an effective tool for solving essentially multidimensional problems. This approach is especially useful for one-shot (non-iterative) correction of errors in large legacy artificial intelligence systems. These simplicity revolutions in the era of complexity have deep fundamental reasons grounded in geometry of multidimensional data spaces. To explore and understand these reasons we revisit the background ideas of statistical physics. In the course of the 20th century they were developed into the concentration of measure theory. New stochastic separation theorems reveal the fine structure of the data clouds. We review and analyse biological, physical, and mathematical problems at the core of the fundamental question: how can high-dimensional brain organise reliable and fast learning in high-dimensional world of data by simple tools? Two critical applications are reviewed to exemplify the approach: one-shot correction of errors in intellectual systems and emergence of static and associative memories in ensembles of single neurons.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Artificial Intelligence

1809.07656

Country: Europe > United Kingdom > England (0.46)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(4 more...)

Add feedback

Fourier analysis perspective for sufficient dimension reduction problem

Takhanov, Rustem

arXiv.org Machine LearningAug-19-2018

A theory of sufficient dimension reduction (SDR) is developed from an optimizational perspective. In our formulation of the problem, instead of dealing with raw data, we assume that our ground truth includes a mapping ${\mathbf f}: {\mathbb R}^n\rightarrow {\mathbb R}^m$ and a probability distribution function $p$ over ${\mathbb R}^n$, both given analytically. We formulate SDR as a problem of finding a function ${\mathbf g}: {\mathbb R}^k\rightarrow {\mathbb R}^m$ and a matrix $P\in {\mathbb R}^{k\times n}$ such that ${\mathbb E}_{{\mathbf x}\sim p({\mathbf x})} \left|{\mathbf f}({\mathbf x}) - {\mathbf g}(P{\mathbf x})\right|^2$ is minimal. It turns out that the latter problem allows a reformulation in the dual space, i.e. instead of searching for ${\mathbf g}(P{\mathbf x})$ we suggest searching for its Fourier transform. First, we characterize all tempered distributions that can serve as the Fourier transform of such functions. The reformulation in the dual space can be interpreted as a problem of finding a $k$-dimensional linear subspace $S$ and a tempered distribution ${\mathbf t}$ supported in $S$ such that ${\mathbf t}$ is "close" in a certain sense to the Fourier transform of ${\mathbf f}$. Instead of optimizing over generalized functions with a $k$-dimensional support, we suggest minimizing over ordinary functions but with an additional term $R$ that penalizes a strong distortion of the support from any $k$-dimensional linear subspace. For a specific case of $R$, we develop an algorithm that can be formulated for functions given in the initial form as well as for their Fourier transforms. Eventually, we report results of numerical experiments with a discretized version of the latter algorithm.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1808.06191

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

High-dimensional estimation via sum-of-squares proofs

Raghavendra, Prasad, Schramm, Tselil, Steurer, David

arXiv.org Machine LearningJul-30-2018

Estimation is the computational task of recovering a hidden parameter $x$ associated with a distribution $D_x$, given a measurement $y$ sampled from the distribution. High dimensional estimation problems arise naturally in statistics, machine learning, and complexity theory. Many high dimensional estimation problems can be formulated as systems of polynomial equations and inequalities, and thus give rise to natural probability distributions over polynomial systems. Sum-of-squares proofs provide a powerful framework to reason about polynomial systems, and further there exist efficient algorithms to search for low-degree sum-of-squares proofs. Understanding and characterizing the power of sum-of-squares proofs for estimation problems has been a subject of intense study in recent years. On one hand, there is a growing body of work utilizing sum-of-squares proofs for recovering solutions to polynomial systems when the system is feasible. On the other hand, a general technique referred to as pseudocalibration has been developed towards showing lower bounds on the degree of sum-of-squares proofs. Finally, the existence of sum-of-squares refutations of a polynomial system has been shown to be intimately connected to the existence of spectral algorithms. In this article we survey these developments.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

1807.11419

Country:

North America > United States > New York (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
(8 more...)

Genre: Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.80)

Add feedback

Overlapping Sliced Inverse Regression for Dimension Reduction

Zhang, Ning, Yu, Zhou, Wu, Qiang

arXiv.org Machine LearningJun-23-2018

Sliced inverse regression (SIR) is a pioneer tool for supervised dimension reduction. It identifies the effective dimension reduction space, the subspace of significant factors with intrinsic lower dimensionality. In this paper, we propose to refine the SIR algorithm through an overlapping slicing scheme. The new algorithm, called overlapping sliced inverse regression (OSIR), is able to estimate the effective dimension reduction space and determine the number of effective factors more accurately. We show that such overlapping procedure has the potential to identify the information contained in the derivatives of the inverse regression curve, which helps to explain the superiority of OSIR. We also prove that OSIR algorithm is $\sqrt n $-consistent and verify its effectiveness by simulations and real applications.

artificial intelligence, inverse regression, machine learning, (12 more...)

arXiv.org Machine Learning

1806.08911

Country:

North America > United States > Tennessee > Rutherford County > Murfreesboro (0.04)
North America > United States > Virginia > Alexandria County > Alexandria (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre:

Research Report (0.50)
Personal (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (1.00)

Add feedback

Privately Learning High-Dimensional Distributions

Kamath, Gautam, Li, Jerry, Singhal, Vikrant, Ullman, Jonathan

arXiv.org Machine LearningMay-1-2018

The sample complexity of both our algorithms approaches the sample complexity of non-private learners up to a small multiplicative factor and an additional additive term that is lower order for a wide range of parameters, showing that privacy comes essentially for free for these problems. Our algorithms use a novel technical approach to reducing the sensitivity of the estimation procedure that we call recursive private preconditioning and may find additional applications.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

1805.00216

Country: North America > United States (0.46)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.40)

Add feedback