Simple, Efficient, and Neural Algorithms for Sparse Coding

arXiv.org Machine Learning

Sparse coding is a basic task in many fields including signal processing, neuroscience and machine learning where the goal is to learn a basis that enables a sparse representation of a given set of data, if one exists. Its standard formulation is as a non-convex optimization problem which is solved in practice by heuristics based on alternating minimization. Recent work has resulted in several algorithms for sparse coding with provable guarantees, but somewhat surprisingly these are outperformed by the simple alternating minimization heuristics. Here we give a general framework for understanding alternating minimization which we leverage to analyze existing heuristics and to design new ones also with provable guarantees. Some of these algorithms seem implementable on simple neural architectures, which was the original motivation of Olshausen and Field (1997a) in introducing sparse coding. We also give the first efficient algorithm for sparse coding that works almost up to the information theoretic limit for sparse recovery on incoherent dictionaries. All previous algorithms that approached or surpassed this limit run in time exponential in some natural parameter. Finally, our algorithms improve upon the sample complexity of existing approaches. We believe that our analysis framework will have applications in other settings where simple iterative algorithms are used.
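
To make the alternating-minimization heuristic concrete, here is a minimal sketch of the generic scheme the abstract refers to: alternate between decoding sparse codes for a fixed dictionary (here by simple thresholding of correlations) and updating the dictionary by a gradient step on the reconstruction error. The thresholding decoder, step sizes, and normalization are illustrative assumptions, not the specific provably convergent variant analyzed in the paper.

```python
import numpy as np

def alternating_minimization_sparse_coding(Y, K, n_iter=50, thresh=0.5, lr=0.1, seed=0):
    """Generic alternating minimization for sparse coding (illustrative sketch).

    Y : (d, n) data matrix, columns are samples.
    K : number of dictionary atoms.
    Alternates a hard-thresholding decoding step with a gradient step on the
    dictionary; this mirrors the simple heuristics the abstract refers to,
    not the paper's exact method.
    """
    rng = np.random.default_rng(seed)
    d, n = Y.shape
    A = rng.standard_normal((d, K))
    A /= np.linalg.norm(A, axis=0, keepdims=True)          # unit-norm atoms

    for _ in range(n_iter):
        # Decoding step: estimate sparse codes by thresholding correlations.
        X = A.T @ Y
        X[np.abs(X) < thresh] = 0.0

        # Dictionary update: gradient step on the reconstruction error.
        R = Y - A @ X                                       # residual
        A += lr * R @ X.T / n
        A /= np.linalg.norm(A, axis=0, keepdims=True)       # renormalize atoms

    return A, X
```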


Bayesian Optimization of Text Representations

arXiv.org Machine Learning

When applying machine learning to problems in NLP, there are many choices to make about how to represent input texts. These choices can have a big effect on performance, but they are often uninteresting to researchers or practitioners who simply need a module that performs well. We propose an approach to optimizing over this space of choices, formulating the problem as global optimization. We apply a sequential model-based optimization technique and show that our method makes standard linear models competitive with more sophisticated, expensive state-of-the-art methods based on latent variable models or neural networks on various topic classification and sentiment analysis problems. Our approach is a first step towards black-box NLP systems that work with raw text and do not require manual tuning.
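
As a rough illustration of sequential model-based optimization over representation choices, the sketch below searches a small, hypothetical space of text-representation options using a toy surrogate (a similarity-weighted average of observed scores). The search-space names and the synthetic `evaluate` function are placeholders for the real step of training a linear classifier and measuring cross-validated accuracy; the paper's actual surrogate model is more sophisticated.

```python
import itertools
import random

# Hypothetical search space over text-representation choices
# (names are illustrative, not the paper's exact parameterization).
SPACE = {
    "ngram_max": [1, 2, 3],
    "use_tfidf": [True, False],
    "lowercase": [True, False],
    "min_df":    [1, 2, 5],
}

def evaluate(config):
    """Stand-in for the expensive step: train a linear classifier with this
    representation and return cross-validated accuracy. Replaced here by a
    synthetic score so the sketch runs on its own."""
    score = 0.6
    score += 0.05 * config["ngram_max"]
    score += 0.04 * config["use_tfidf"]
    score -= 0.01 * config["min_df"]
    return score + random.gauss(0, 0.01)

def smbo(n_init=5, n_iter=15, seed=0):
    """Tiny sequential model-based optimization loop: fit a simple surrogate
    (here, a similarity-weighted average of observed scores) and evaluate the
    unseen configuration the surrogate rates highest."""
    random.seed(seed)
    all_configs = [dict(zip(SPACE, vals)) for vals in itertools.product(*SPACE.values())]
    observed = {}

    for cfg in random.sample(all_configs, n_init):      # initial random evaluations
        observed[tuple(cfg.items())] = evaluate(cfg)

    for _ in range(n_iter):
        def surrogate(cfg):
            # Score a candidate by how well similar observed configs performed.
            sims = [(sum(dict(key)[k] == cfg[k] for k in SPACE), val)
                    for key, val in observed.items()]
            total = sum(s for s, _ in sims)
            return sum(s * v for s, v in sims) / max(total, 1)

        candidates = [c for c in all_configs if tuple(c.items()) not in observed]
        if not candidates:
            break
        nxt = max(candidates, key=surrogate)
        observed[tuple(nxt.items())] = evaluate(nxt)

    best = max(observed, key=observed.get)
    return dict(best), observed[best]

if __name__ == "__main__":
    print(smbo())
```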


A review of mean-shift algorithms for clustering

arXiv.org Machine Learning

A natural way to characterize the cluster structure of a dataset is by finding regions containing a high density of data. This can be done in a nonparametric way with a kernel density estimate, whose modes and hence clusters can be found using mean-shift algorithms. We describe the theory and practice behind clustering based on kernel density estimates and mean-shift algorithms. We discuss the blurring and non-blurring versions of mean-shift; theoretical results about mean-shift algorithms and Gaussian mixtures; relations with scale-space theory, spectral clustering and other algorithms; extensions to tracking, to manifold and graph data, and to manifold denoising; K-modes and Laplacian K-modes algorithms; acceleration strategies for large datasets; and applications to image segmentation, manifold denoising and multivalued regression.
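
For concreteness, here is a minimal non-blurring mean-shift with a Gaussian kernel, one of the algorithm families the review covers: each point is repeatedly moved to the kernel-weighted mean of the (fixed) dataset until it reaches a mode, and points sharing a mode form a cluster. The bandwidth, iteration budget, and mode-merging tolerance are illustrative choices.

```python
import numpy as np

def mean_shift(X, bandwidth=1.0, n_iter=50, tol=1e-5):
    """Non-blurring mean-shift with a Gaussian kernel (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    modes = X.copy()
    for _ in range(n_iter):
        shifted = np.empty_like(modes)
        for i, m in enumerate(modes):
            # Kernel weights of all data points relative to the current mode estimate.
            w = np.exp(-np.sum((X - m) ** 2, axis=1) / (2 * bandwidth ** 2))
            shifted[i] = (w[:, None] * X).sum(axis=0) / w.sum()
        converged = np.max(np.abs(shifted - modes)) < tol
        modes = shifted
        if converged:
            break

    # Group points whose modes coincide (up to a tolerance) into clusters.
    labels = np.zeros(len(X), dtype=int)
    found = []
    for i, m in enumerate(modes):
        for j, f in enumerate(found):
            if np.linalg.norm(m - f) < bandwidth / 10:
                labels[i] = j
                break
        else:
            found.append(m)
            labels[i] = len(found) - 1
    return labels, modes
```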


Signal inference with unknown response: Calibration-uncertainty renormalized estimator

arXiv.org Machine Learning

The calibration of a measurement device is crucial for every scientific experiment where a signal has to be inferred from data. We present CURE, the calibration-uncertainty renormalized estimator, which reconstructs a signal and simultaneously the instrument's calibration from the same data without knowing the exact calibration, only its covariance structure. The idea of CURE, developed in the framework of information field theory, is to start with an assumed calibration and successively include more and more of the calibration uncertainty in the signal inference equations, absorbing the resulting corrections into renormalized signal (and calibration) solutions. The signal inference and calibration problem thereby turns into solving a single system of ordinary differential equations and can be identified with common resummation techniques used in field theories. We verify CURE by applying it to a simplistic toy example and compare it against existing self-calibration schemes, Wiener filter solutions, and Markov chain Monte Carlo sampling. We conclude that the method keeps up in accuracy with the best self-calibration methods and serves as a non-iterative alternative to them.


A Hebbian/Anti-Hebbian Neural Network for Linear Subspace Learning: A Derivation from Multidimensional Scaling of Streaming Data

arXiv.org Machine Learning

Neural network models of early sensory processing typically reduce the dimensionality of streaming input data. Such networks learn the principal subspace, in the sense of principal component analysis (PCA), by adjusting synaptic weights according to activity-dependent learning rules. When derived from a principled cost function, these rules are nonlocal and hence biologically implausible. At the same time, biologically plausible local rules have been postulated rather than derived from a principled cost function. Here, to bridge this gap, we derive a biologically plausible network for subspace learning on streaming data by minimizing a principled cost function. In a departure from previous work, where cost was quantified by the representation, or reconstruction, error, we adopt a multidimensional scaling (MDS) cost function for streaming data. The resulting algorithm relies only on biologically plausible Hebbian and anti-Hebbian local learning rules. In a stochastic setting, synaptic weights converge to a stationary state that projects the input data onto the principal subspace. If the data are generated by a nonstationary distribution, the network can track the principal subspace. Thus, our result is a step towards an algorithmic theory of neural computation.
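
A minimal online sketch in the spirit of such Hebbian/anti-Hebbian subspace networks is shown below: for each input, the output settles through lateral inhibition, the feedforward weights are updated with a Hebbian rule, and the lateral weights with an anti-Hebbian rule. The dynamics, learning rates, and normalization here are illustrative assumptions, not the exact updates derived from the MDS cost function in the paper.

```python
import numpy as np

def hebbian_antihebbian_subspace(X, k, eta=0.01, n_inner=50, seed=0):
    """Sketch of an online Hebbian/anti-Hebbian subspace learner.

    X : (n_samples, d) streaming data, one row per time step
        (assumed roughly unit-scale so the recurrent dynamics stay stable).
    k : dimensionality of the learned subspace.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = 0.1 * rng.standard_normal((k, d))   # feedforward weights (Hebbian)
    M = np.zeros((k, k))                    # lateral weights (anti-Hebbian)

    for x in X:
        # Neural dynamics: settle the output via recurrent lateral inhibition.
        y = np.zeros(k)
        for _ in range(n_inner):
            y = W @ x - M @ y
        # Local plasticity: Hebbian feedforward, anti-Hebbian lateral updates.
        W += eta * (np.outer(y, x) - W)
        M += eta * (np.outer(y, y) - M)
        np.fill_diagonal(M, 0.0)            # no self-inhibition

    return W, M
```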


A Hebbian/Anti-Hebbian Network Derived from Online Non-Negative Matrix Factorization Can Cluster and Discover Sparse Features

arXiv.org Machine Learning

Despite our extensive knowledge of the biophysical properties of neurons, there is no commonly accepted algorithmic theory of neuronal function. Here we explore the hypothesis that single-layer neuronal networks perform online symmetric nonnegative matrix factorization (SNMF) of the similarity matrix of the streamed data. Starting from the SNMF cost function, we derive an online algorithm that can be implemented by a biologically plausible network with local learning rules. We demonstrate that such a network performs soft clustering of the data as well as sparse feature discovery. The derived algorithm replicates many known aspects of sensory anatomy and of the biophysical properties of neurons, including the unipolar nature of neuronal activity and synaptic weights, local synaptic plasticity rules, and the dependence of learning rate on cumulative neuronal activity. Thus, we take a step towards an algorithmic theory of neuronal function, which should facilitate large-scale neural circuit simulations and biologically inspired artificial intelligence.
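
Building on the previous sketch, a rectified (nonnegative) variant hints at how unipolar activities and nonnegative weights lead to soft clustering, with the rows of the feedforward weight matrix behaving like cluster centroids. Again the constants and update rules are illustrative, not the algorithm derived from the SNMF cost function in the paper.

```python
import numpy as np

def online_snmf_clustering(X, k, eta=0.01, n_inner=50, seed=0):
    """Sketch of a single-layer network with rectified (nonnegative) outputs
    and local Hebbian/anti-Hebbian plasticity, in the spirit of online
    symmetric NMF of the data similarity matrix."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = np.abs(0.1 * rng.standard_normal((k, d)))   # nonnegative feedforward weights
    M = np.zeros((k, k))                            # nonnegative lateral inhibition

    for x in X:
        y = np.zeros(k)
        for _ in range(n_inner):
            y = np.maximum(0.0, W @ x - M @ y)      # rectified (unipolar) dynamics
        W = np.maximum(0.0, W + eta * (np.outer(y, x) - W))
        M = np.maximum(0.0, M + eta * (np.outer(y, y) - M))
        np.fill_diagonal(M, 0.0)

    return W, M
```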


Agents Vote for the Environment: Designing Energy-Efficient Architecture

AAAI Conferences

Saving energy is a major concern. Hence, it is fundamental to design and construct buildings that are energy-efficient. It is known that the early stage of architectural design has a significant impact on this matter. However, it is complex to create designs that are optimally energy-efficient and at the same time balance other essential criteria such as economics, space, and safety. One state-of-the-art approach is to create parametric designs and use a genetic algorithm to optimize across different objectives. We further improve this method by aggregating the solutions of multiple agents. We evaluate diverse teams, composed of different agents, and uniform teams, composed of multiple copies of a single agent. We test our approach across three design cases of increasing complexity and show that the diverse team provides a significantly larger percentage of optimal solutions than single agents.
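
The aggregation step can be pictured with a simple voting rule: each agent (for instance, a genetic-algorithm run with different settings) ranks candidate parametric designs, and the team selects the winner of the vote. The Borda count and the design names below are illustrative; the paper's actual voting scheme may differ.

```python
from collections import defaultdict

def borda_aggregate(rankings):
    """Aggregate several agents' preferences over candidate designs with a
    Borda count (one simple voting rule). Each ranking lists candidate ids
    from best to worst."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for pos, candidate in enumerate(ranking):
            scores[candidate] += n - 1 - pos       # best candidate gets n-1 points
    return max(scores, key=scores.get), dict(scores)

# Example: three agents rank four hypothetical parametric designs by
# energy efficiency; the team adopts the Borda winner.
rankings = [
    ["design_B", "design_A", "design_C", "design_D"],
    ["design_A", "design_B", "design_D", "design_C"],
    ["design_B", "design_C", "design_A", "design_D"],
]
winner, scores = borda_aggregate(rankings)
print(winner, scores)
```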


Phase Transitions in Sparse PCA

arXiv.org Machine Learning

We study optimal estimation for sparse principal component analysis when the number of non-zero elements is small but on the same order as the dimension of the data. We employ the approximate message passing (AMP) algorithm and its state evolution to analyze the information-theoretically minimal mean-squared error and the error achieved by AMP in the limit of large system sizes. For the special case of rank one and sufficiently large density of non-zeros, Deshpande and Montanari [1] proved that AMP is asymptotically optimal. We show that both for low density and for large rank the problem undergoes a series of phase transitions, suggesting the existence of a region of parameters where estimation is information-theoretically possible but AMP (and presumably every other polynomial algorithm) fails. The analysis of the large-rank limit is particularly instructive.


Block stochastic gradient iteration for convex and nonconvex optimization

arXiv.org Machine Learning

The stochastic gradient (SG) method can minimize an objective function composed of a large number of differentiable functions, or solve a stochastic optimization problem, to moderate accuracy. The block coordinate descent/update (BCD) method, on the other hand, handles problems with multiple blocks of variables by updating them one at a time; when the blocks of variables are easier to update individually than together, BCD has a lower per-iteration cost. This paper introduces a method that combines the features of SG and BCD for problems with many components in the objective and with multiple (blocks of) variables. Specifically, a block stochastic gradient (BSG) method is proposed for solving both convex and nonconvex programs. At each iteration, BSG approximates the gradient of the differentiable part of the objective by randomly sampling a small set of data or sampling a few functions from the sum term in the objective, and then, using those samples, it updates all the blocks of variables in either a deterministic or a randomly shuffled order. Its convergence for both the convex and nonconvex cases is established in different senses. In the convex case, the proposed method has the same order of convergence rate as the SG method. In the nonconvex case, its convergence is established in terms of the expected violation of a first-order optimality condition. The proposed method was numerically tested on problems including stochastic least squares and logistic regression, which are convex, as well as low-rank tensor recovery and bilinear logistic regression, which are nonconvex.
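
The BSG iteration described above can be sketched in a few lines: sample a small batch, then sweep over the variable blocks in a (possibly shuffled) order, updating each with the sampled gradient. The interface, step size, and the least-squares example below are illustrative assumptions, not the paper's exact algorithm or convergence conditions.

```python
import numpy as np

def block_stochastic_gradient(grad_fn, blocks, n_samples, lr=0.01,
                              batch_size=32, n_epochs=10, shuffle_blocks=True,
                              seed=0):
    """Generic block stochastic gradient (BSG) loop (illustrative sketch).

    grad_fn(blocks, idx, b) must return the gradient of the sampled part of
    the objective with respect to block b, given the current `blocks` (a list
    of numpy arrays) and the sample indices `idx`.
    """
    rng = np.random.default_rng(seed)
    for _ in range(n_epochs):
        for _ in range(n_samples // batch_size):
            idx = rng.choice(n_samples, size=batch_size, replace=False)
            order = rng.permutation(len(blocks)) if shuffle_blocks else range(len(blocks))
            for b in order:
                blocks[b] -= lr * grad_fn(blocks, idx, b)   # per-block update
    return blocks

# Tiny usage example: stochastic least squares y ~ A @ w1 + B @ w2 with two blocks.
rng = np.random.default_rng(1)
A, B = rng.standard_normal((200, 5)), rng.standard_normal((200, 3))
y = A @ rng.standard_normal(5) + B @ rng.standard_normal(3)

def grad_fn(blocks, idx, b):
    w1, w2 = blocks
    resid = A[idx] @ w1 + B[idx] @ w2 - y[idx]
    return (A[idx].T @ resid if b == 0 else B[idx].T @ resid) / len(idx)

w1, w2 = block_stochastic_gradient(grad_fn, [np.zeros(5), np.zeros(3)],
                                   n_samples=200, lr=0.1, n_epochs=20)
```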


Constructive sparse trigonometric approximation for functions with small mixed smoothness

arXiv.org Machine Learning

The paper gives a constructive method, based on greedy algorithms, that for classes of functions with small mixed smoothness achieves the best possible (in the sense of order) error of $m$-term approximation with respect to the trigonometric system.
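
As a toy illustration of $m$-term approximation with respect to the trigonometric system (not the paper's constructive greedy method for mixed-smoothness classes), one can simply keep the $m$ Fourier coefficients of largest magnitude:

```python
import numpy as np

def m_term_trig_approximation(f_samples, m):
    """Keep the m Fourier coefficients of largest magnitude and reconstruct
    the corresponding m-term trigonometric approximant (toy sketch)."""
    n = len(f_samples)
    coeffs = np.fft.fft(f_samples) / n
    keep = np.argsort(np.abs(coeffs))[::-1][:m]      # indices of the m largest coefficients
    sparse = np.zeros_like(coeffs)
    sparse[keep] = coeffs[keep]
    return np.real(np.fft.ifft(sparse) * n)

# Example: approximate a sampled square wave with its 6 dominant harmonics
# (6 = three conjugate pairs, so the approximant stays real).
x = np.linspace(0, 2 * np.pi, 256, endpoint=False)
f = np.sign(np.sin(x))
approx = m_term_trig_approximation(f, m=6)
print(np.max(np.abs(f - approx)))                    # sup-norm error of the 6-term approximant
```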