AITopics | logm

Collaborating Authors

logm

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Overfitting and Generalizing with (PAC) Bayesian Prediction in Noisy Binary Classification

Zhu, Xiaohan, Ohannessian, Mesrob I., Srebro, Nathan

arXiv.org Machine LearningMar-25-2026

We consider a PAC-Bayes type learning rule for binary classification, balancing the training error of a randomized ''posterior'' predictor with its KL divergence to a pre-specified ''prior''. This can be seen as an extension of a modified two-part-code Minimum Description Length (MDL) learning rule, to continuous priors and randomized predictions. With a balancing parameter of $λ=1$ this learning rule recovers an (empirical) Bayes posterior and a modified variant recovers the profile posterior, linking with standard Bayesian prediction (up to the treatment of the single-parameter noise level). However, from a risk-minimization prediction perspective, this Bayesian predictor overfits and can lead to non-vanishing excess loss in the agnostic case. Instead a choice of $λ\gg 1$, which can be seen as using a sample-size-dependent-prior, ensures uniformly vanishing excess loss even in the agnostic case. We precisely characterize the effect of under-regularizing (and over-regularizing) as a function of the balance parameter $λ$, understanding the regimes in which this under-regularization is tempered or catastrophic. This work extends previous work by Zhu and Srebro [2025] that considered only discrete priors to PAC Bayes type learning rules and, through their rigorous Bayesian interpretation, to Bayesian prediction more generally.

artificial intelligence, log 1, machine learning, (19 more...)

arXiv.org Machine Learning

2603.22644

Country: North America > United States (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Sinkhorn Based Associative Memory Retrieval Using Spherical Hellinger Kantorovich Dynamics

Mustafi, Aratrika, Mukherjee, Soumya

arXiv.org Machine LearningMar-24-2026

We propose a dense associative memory for empirical measures (weighted point clouds). Stored patterns and queries are finitely supported probability measures, and retrieval is defined by minimizing a Hopfield-style log-sum-exp energy built from the debiased Sinkhorn divergence. We derive retrieval dynamics as a spherical Hellinger Kantorovich (SHK) gradient flow, which updates both support locations and weights. Discretizing the flow yields a deterministic algorithm that uses Sinkhorn potentials to compute barycentric transport steps and a multiplicative simplex reweighting. Under local separation and PL-type conditions we prove basin invariance, geometric convergence to a local minimizer, and a bound showing the minimizer remains close to the corresponding stored pattern. Under a random pattern model, we further show that these Sinkhorn basins are disjoint with high probability, implying exponential capacity in the ambient dimension. Experiments on synthetic Gaussian point-cloud memories demonstrate robust recovery from perturbed queries versus a Euclidean Hopfield-type baseline.

amin, machine learning, natural language, (15 more...)

arXiv.org Machine Learning

2603.20656

Country:

North America > United States > Pennsylvania (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Natural Language (0.67)
Information Technology > Artificial Intelligence > Systems & Languages > Programming Languages (0.61)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.61)

Add feedback

OT4P: UnlockingEffectiveOrthogonalGroupPathfor PermutationRelaxation

Neural Information Processing SystemsFeb-18-2026, 06:42:24 GMT

This transformation naturally implements a parameterization for the relaxation of permutation matrices, allowing for gradient-based optimization of problems involving permutations.

artificial intelligence, machine learning, optimization problem, (17 more...)

Neural Information Processing Systems

Country: Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback

LearningSocialWelfareFunctions

Neural Information Processing SystemsFeb-12-2026, 11:46:45 GMT

Consider a standard decision making setting that includes a set of possible actions (decisions or policies), and a set of individuals who assign utilities to the actions.

artificial intelligence, logm, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.46)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

943d6dca1884955e645d8997ae2fa938-Paper-Conference.pdf

Neural Information Processing SystemsFeb-10-2026, 20:28:41 GMT

For this reason, it is important to design algorithms that are able to maintain a stable and high quality solution and that at the same time can process updates efficiently.

algorithm, artificial intelligence, niceclustering, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.47)

Add feedback

SupplementaryMaterial MatrixCompletionwithHierarchical GraphSideInformation

Neural Information Processing SystemsFeb-8-2026, 17:25:57 GMT

This implies that M(δ) = T(δ), i.e., the constraint(13) made in T(δ) does not lose any generality in matrix representation. One technical distinction relative to the previous works [2,3] arises from the fact that in our setting, the hamming distances(dx1(`),dx2(`),dx3(`)) defined w.r.t. We focus on the family of rating matrices{Mhci: c T`}. First, we present the following lemma that guarantees the existence of two subsets of users with certainproperties. The proof of this case follows the same structure as that of the grouping-limited regime. It is shown that the groups within each cluster are recovered with a vanishing fraction of errors if Ig = ω(1/n).

artificial intelligence, log 1, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.66)
Information Technology > Artificial Intelligence > Machine Learning (0.46)

Add feedback

44d20d542f3f4e3b7097e5e3f78f99f1-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-8-2026, 15:44:27 GMT

Generative networks have experienced great empirical successes in distribution learning.

artificial intelligence, arxivpreprintarxiv, machine learning, (19 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

211ab571cc9f3802afa6ffff52ae3e5b-Paper-Conference.pdf

Neural Information Processing SystemsFeb-7-2026, 21:04:39 GMT

In addition, the underlying signalxisassumed to lie in the range of anL-Lipschitz continuous generativemodel with boundedkdimensionalinputs.Weproposeatwo-stepapproach,forwhichthefirststepplays the role ofspectral initialization and the second step refines the estimated vector produced by the first step iteratively. We show that both steps enjoy a statistical rate oforder p (klogL) (logm)/mundersuitable conditions.

artificial intelligence, conf, machine learning, (16 more...)

Neural Information Processing Systems

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback

Supplementary File for " Stochastic Gradient Descent in Correlated Settings: AStudy on Gaussian Processes "

Neural Information Processing SystemsFeb-7-2026, 16:55:32 GMT

The supplementary file is organized as follows: Section 1 restates the assumptions and main theorems on the convergence of parameter iterates and the full gradient; Section 2 is devoted to the proofs of the two main theorems, while Section 3 includes the proofs of supporting lemmas; Section 4 includes additional figures from the numerical study. Under Assumptions 1.1 to 1.3, when m > C for some constant C > 0, we have the following results under two corresponding conditions on sl(m): First we present the following lemma, showing that the loss function has a property similar from strong convexity. For the first case discussed in Lemma 2.1, define eg(θ(k)) = (g(θ(k)))2, and for the second case define eg(θ(k)) = g(θ(k)). Therefore, combining Lemma 2.1, Lemma 2.2 and (7) leads to the following conclusion. Apply(15)inLemma 2.3 with = 12, then for any 0<α<1, with probability at least 1 2m α, we have A11 1 Under this case, we can still apply (15) in Lemma 2.3.

artificial intelligence, machine learning, probability, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.86)

Add feedback

Supplementaryfor NeuralMethodsforPoint-wiseDependencyEstimation

Neural Information Processing SystemsFeb-7-2026, 06:59:28 GMT

Four approaches are discussed: Variational Bounds of Mutual Information, Density Matching, ProbabilisticClassifier,andDensity-RatioFitting. Proposition3(IJS and its neural estimation, restating Jensen-Shannon bound with f-GAN objective [22]). We adopt the "concatenate critic" design [20, 22, 23] for our neural network parametrized function. NotethatProbabilistic Classifier method applies sigmoid function to the outputs to ensure probabilistic outputs. To proceed, it suffices if we could provide an upper bound forPrS(|lS(θk)| ε/2).

artificial intelligence, epx, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback