Sinkhorn-Based Associative Memory Retrieval Using Spherical Hellinger-Kantorovich Dynamics
Mustafi, Aratrika, Mukherjee, Soumya
We propose a dense associative memory for empirical measures (weighted point clouds). Stored patterns and queries are finitely supported probability measures, and retrieval is defined by minimizing a Hopfield-style log-sum-exp energy built from the debiased Sinkhorn divergence. We derive retrieval dynamics as a spherical Hellinger-Kantorovich (SHK) gradient flow, which updates both support locations and weights. Discretizing the flow yields a deterministic algorithm that uses Sinkhorn potentials to compute barycentric transport steps and a multiplicative simplex reweighting. Under local separation and PL-type conditions we prove basin invariance, geometric convergence to a local minimizer, and a bound showing the minimizer remains close to the corresponding stored pattern. Under a random pattern model, we further show that these Sinkhorn basins are disjoint with high probability, implying exponential capacity in the ambient dimension. Experiments on synthetic Gaussian point-cloud memories demonstrate robust recovery from perturbed queries versus a Euclidean Hopfield-type baseline.
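The discretized flow is straightforward to prototype. Below is a minimal numpy sketch of one retrieval step toward a single stored pattern, not the authors' implementation: it assumes a squared-Euclidean ground cost, uses plain entropic Sinkhorn potentials rather than the debiased divergence, and the step sizes `step` and `tau` are illustrative.

```python
import numpy as np
from scipy.special import logsumexp

def sinkhorn_potentials(a, b, C, eps=0.1, n_iter=200):
    """Log-domain Sinkhorn iterations for entropic OT between weight
    vectors a, b with cost matrix C; returns dual potentials (f, g)."""
    f, g = np.zeros_like(a), np.zeros_like(b)
    for _ in range(n_iter):
        f = -eps * logsumexp((g[None, :] - C) / eps, b=b[None, :], axis=1)
        g = -eps * logsumexp((f[:, None] - C) / eps, b=a[:, None], axis=0)
    return f, g

def retrieval_step(x, a, y, b, eps=0.1, step=0.5, tau=0.5):
    """One discretized SHK-style update of the query (x, a) toward the
    pattern (y, b): a barycentric transport step for the support points
    plus a multiplicative reweighting on the simplex."""
    C = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)  # squared-Euclidean cost
    f, g = sinkhorn_potentials(a, b, C, eps)
    P = a[:, None] * b[None, :] * np.exp((f[:, None] + g[None, :] - C) / eps)
    bary = P @ y / P.sum(axis=1, keepdims=True)   # barycentric projection of each point
    x_new = (1 - step) * x + step * bary          # move support toward the pattern
    w = a * np.exp(-tau * (f - a @ f))            # multiplicative (Hellinger) step
    return x_new, w / w.sum()                     # renormalize weights onto the simplex
```

In the full method, the transport and reweighting directions would be aggregated over all stored patterns with softmax weights derived from the log-sum-exp energy; the sketch above shows only the single-pattern building block.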
- North America > United States > Pennsylvania (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Artificial Intelligence > Natural Language (0.67)
- Information Technology > Artificial Intelligence > Systems & Languages > Programming Languages (0.61)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.61)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Behind the scenes at #ICML2024
This year's International Conference on Machine Learning (ICML) took place in Vienna, Austria, from 21-27 July 2024. The organisers introduced a new feature in the form of "behind the scenes" chats with members of the conference committee. Hosted by Amin Karbasi, this series takes a look at how decisions are made at ICML, and at other interesting AI-related topics. On day two, Amin spoke to workshop co-chair Andrew Gordon Wilson. Day three saw two chats.
Gradient Descent: Second Order Momentum and Saturating Error
Batch gradient descent, $\Delta w(t) = -\eta\, \partial E/\partial w(t)$, converges to a minimum of quadratic form with a time constant no better than $\tfrac{1}{4}\,\lambda_{\max}/\lambda_{\min}$, where $\lambda_{\min}$ and $\lambda_{\max}$ are the minimum and maximum eigenvalues of the Hessian matrix of $E$ with respect to $w$. It was recently shown that adding a momentum term, $\Delta w(t) = -\eta\, \partial E/\partial w(t) + \alpha\, \Delta w(t-1)$, improves this to $\sqrt{\lambda_{\max}/\lambda_{\min}}$, although only in the batch case. Here we show that second-order momentum, $\Delta w(t) = -\eta\, \partial E/\partial w(t) + \alpha\, \Delta w(t-1) + \beta\, \Delta w(t-2)$, can lower this no further. We then regard gradient descent with momentum as a dynamic system and explore a nonquadratic error surface, showing that saturation of the error accounts for a variety of effects observed in simulations and justifies some popular heuristics.
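As a hedged numerical illustration of the rate gap quantified above (not taken from the paper), the snippet below runs batch gradient descent with and without a first-order momentum term on a two-dimensional quadratic with $\lambda_{\max}/\lambda_{\min} = 100$, using the standard optimal tunings for each method:

```python
import numpy as np

# Quadratic error surface E(w) = 0.5 * w^T H w with condition number 100.
lmin, lmax = 1.0, 100.0
H = np.diag([lmin, lmax])

def run(eta, alpha, steps=500):
    """Iterate Delta w(t) = -eta * dE/dw(t) + alpha * Delta w(t-1)."""
    w, dw = np.ones(2), np.zeros(2)
    for _ in range(steps):
        dw = -eta * (H @ w) + alpha * dw
        w = w + dw
    return np.linalg.norm(w)

# Plain gradient descent: contraction ~ (kappa - 1)/(kappa + 1) per step.
print("plain:   ", run(eta=2 / (lmax + lmin), alpha=0.0))
# Heavy-ball momentum: contraction ~ (sqrt(kappa) - 1)/(sqrt(kappa) + 1).
kappa = lmax / lmin
print("momentum:", run(eta=4 / (np.sqrt(lmax) + np.sqrt(lmin)) ** 2,
                       alpha=((np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)) ** 2))
```

After 500 steps the momentum run is many orders of magnitude closer to the minimum, matching the $\lambda_{\max}/\lambda_{\min}$ versus $\sqrt{\lambda_{\max}/\lambda_{\min}}$ time-constant gap.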
Coronavirus doctor's diary: How gardening could help in the fight against obesity
Being overweight puts you at greater risk of serious illness or death from Covid-19, experts say - and now new anti-obesity strategies have been launched around the UK. In Bradford, community schemes to promote healthy lifestyles offer a novel approach to the problem. Dr John Wright of the city's Royal Infirmary explains why radical thinking is necessary. Our complete concentration on Covid-19 has concealed another global pandemic that has been more insidious but much more harmful: obesity. Early in the pandemic, we spotted common patterns in our sickest Covid-19 patients - they were more likely to have diabetes and heart disease and, in particular, to be obese.
- Europe > United Kingdom (0.16)
- North America > United States (0.05)
- Africa > Sub-Saharan Africa (0.05)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
- Health & Medicine > Therapeutic Area > Immunology (1.00)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.41)
What's next in IT automation: 6 trends to watch
We've recently covered the factors fueling IT automation, the current trends to watch as adoption grows, and helpful tips for those organizations just beginning to automate certain processes. Oh, and we also shared expert advice on how to make the case for automation in your company, as well as keys for long-term success. We asked a range of experts to share a peek into the not-so-distant future of automation. Here are six trends they advise IT leaders to monitor closely. Expect that to change, and for machine learning to play a significant role in the next waves of IT automation.
Information Theoretic Limits on Learning Stochastic Differential Equations
Bento, José, Ibrahimi, Morteza, Montanari, Andrea
Consider the problem of learning the drift coefficient of a stochastic differential equation from a sample path. In this paper, we assume that the drift is parametrized by a high-dimensional vector, and we address the question of how long the system needs to be observed in order to learn this parameter vector. We prove a general lower bound on this time complexity by using a characterization of mutual information as a time integral of conditional variance, due to Kadota, Zakai, and Ziv. This general lower bound is applied to specific classes of linear and non-linear stochastic differential equations. In the linear case, the problem under consideration is that of learning a matrix of interaction coefficients. We evaluate our lower bound for ensembles of sparse and dense random matrices. The resulting estimates match the qualitative behavior of upper bounds achieved by computationally efficient procedures.
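For context, the characterization the abstract invokes can be written as follows (a hedged paraphrase in our own notation, for a diffusion $dX_t = b(X_t;\theta)\,dt + dB_t$ with unit noise and a prior on $\theta$):

$$ I\big(\theta;\, X_0^T\big) \;=\; \frac{1}{2}\int_0^T \mathbb{E}\Big[\big\| b(X_t;\theta) - \mathbb{E}\big[b(X_t;\theta)\mid X_0^t\big]\big\|^2\Big]\, dt, $$

so a lower bound on the observation time $T$ follows from an upper bound on the conditional variance of the drift along the path.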
Concept-based Recommendations for Internet Advertisement
Ignatov, Dmitry I., Kuznetsov, Sergei O.
The problem of detecting terms that may interest an advertiser is considered. If a company has already bought some advertising terms that describe certain services, it is reasonable to find out which terms competing companies have bought; some of these can be recommended to the company as future advertising terms. The goal of this work is to propose better interpretable recommendations based on Formal Concept Analysis (FCA) and association rules.
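As a toy illustration of the principle (terms bought by companies with overlapping purchase sets are candidate recommendations), here is a hedged Python sketch; the data and the `min_overlap` threshold are invented, and the actual method operates on formal concepts and association rules rather than this simple overlap heuristic:

```python
from collections import Counter

# Hypothetical company -> purchased advertising terms context.
purchases = {
    "acme":    {"car hire", "cheap flights", "hotel deals"},
    "globex":  {"car hire", "cheap flights", "travel insurance"},
    "initech": {"hotel deals", "city breaks"},
}

def recommend(company, min_overlap=2):
    """Recommend terms held by competitors sharing >= min_overlap terms."""
    own = purchases[company]
    scores = Counter()
    for other, terms in purchases.items():
        if other != company and len(own & terms) >= min_overlap:
            scores.update(terms - own)  # competitor's terms we lack
    return scores.most_common()

print(recommend("acme"))  # -> [('travel insurance', 1)]
```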
- North America > United States > New York (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- (4 more...)
- Marketing (0.66)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.47)
Gradient Descent: Second Order Momentum and Saturating Error
We then regard gradient descent with momentum as a dynamic system and explore a nonquadratic error surface, showing that saturation of the error accounts for a variety of effects observed in simulations and justifies some popular heuristics.

1 INTRODUCTION

Gradient descent is the bread-and-butter optimization technique in neural networks. Some people build special-purpose hardware to accelerate gradient descent optimization of backpropagation networks. Understanding the dynamics of gradient descent on such surfaces is therefore of great practical value. Here we briefly review the known results on the convergence of batch gradient descent; show that second-order momentum does not give any speedup; simulate a real network and observe some effects not predicted by theory; and account for these effects by analyzing gradient descent with momentum on a saturating error surface.