AITopics | Neural Information Processing Systems

Hard Negative Mixing for Contrastive Learning

Neural Information Processing Systems, May-23-2025, 04:21:47 GMT

The uniformity experiment is based on Wang and Isola [53]. We follow the same definitions of the losses/metrics as presented in the paper. We set α = 2 and t = 2. All features were L2-normalized, as the metrics are defined on the hypersphere.

B.1 Proxy task: Effect of MLP and Stronger Augmentation

Following our discussion in Section 3, we wanted to verify that the hardness of the proxy task for MoCo [19] is directly correlated to the difficulty of the transformation set, i.e. proxy task hardness can be modulated via the positive pair.
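As a rough illustration of the alignment and uniformity metrics of Wang and Isola referred to above, the following sketch computes both on L2-normalized features with α = 2 and t = 2. The random features, the noisy "positives" and all function names are assumptions made for this example; they are not taken from the paper's code.

```python
import math
import random


def l2_normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def alignment(xs, ys, alpha=2):
    """Mean of ||x - y||^alpha over positive pairs (x, y)."""
    return sum(sq_dist(x, y) ** (alpha / 2) for x, y in zip(xs, ys)) / len(xs)


def uniformity(xs, t=2):
    """Log of the mean Gaussian potential exp(-t ||x - y||^2) over distinct pairs."""
    vals = [
        math.exp(-t * sq_dist(xs[i], xs[j]))
        for i in range(len(xs))
        for j in range(i + 1, len(xs))
    ]
    return math.log(sum(vals) / len(vals))


rng = random.Random(0)
# Hypothetical features: random directions on the unit sphere in 8 dimensions.
feats = [l2_normalize([rng.gauss(0, 1) for _ in range(8)]) for _ in range(64)]
# Hypothetical positives: slightly perturbed copies of each feature.
positives = [l2_normalize([c + 0.1 * rng.gauss(0, 1) for c in f]) for f in feats]
# A fully collapsed representation: every feature is the same point.
collapsed = [l2_normalize([1.0] * 8) for _ in range(64)]

align_val = alignment(feats, positives)
unif_val = uniformity(feats)
unif_collapsed = uniformity(collapsed)
```

Lower values are better for both metrics. The collapsed representation gives a uniformity of zero, the worst possible value, while well-spread features give a negative value.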

artificial intelligence, learning, machine learning, (17 more...)

Country:

North America > Canada (0.14)
Europe > France (0.14)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Constrained Sampling with Primal-Dual Langevin Monte Carlo

Neural Information Processing Systems, May-23-2025, 04:16:11 GMT

This work considers the problem of sampling from a probability distribution known up to a normalization constant while satisfying a set of statistical constraints specified by the expected values of general nonlinear functions. This problem finds applications in, e.g., Bayesian inference, where it can constrain moments to evaluate counterfactual scenarios or enforce desiderata such as prediction fairness. Methods developed to handle support constraints, such as those based on mirror maps, barriers, and penalties, are not suited for this task. This work therefore relies on gradient descent-ascent dynamics in Wasserstein space to put forward a discrete-time primal-dual Langevin Monte Carlo algorithm (PD-LMC) that simultaneously constrains the target distribution and samples from it. We analyze the convergence of PD-LMC under standard assumptions on the target distribution and constraints, namely (strong) convexity and log-Sobolev inequalities. To do so, we bring classical optimization arguments for saddle-point algorithms to the geometry of Wasserstein space. We illustrate the relevance and effectiveness of PD-LMC in several applications.
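The primal-dual idea described above can be sketched in one dimension: a Langevin step on the sample (descent on the Lagrangian) alternates with a projected ascent step on a multiplier. This is a minimal sketch under stated assumptions, not the authors' implementation: the target (a standard Gaussian), the constraint E[1 - x] ≤ 0, the shared step size, and the names `pd_lmc`, `grad_f`, `g`, `grad_g` are all chosen here for illustration.

```python
import math
import random


def pd_lmc(grad_f, g, grad_g, step=0.01, n_steps=100_000, burn_in=10_000, seed=0):
    """Sketch of a primal-dual Langevin sampler for one constraint E[g(x)] <= 0.

    grad_f: gradient of the negative log-density of the unconstrained target.
    g, grad_g: constraint function and its gradient.
    Returns the post-burn-in samples and the final multiplier.
    """
    rng = random.Random(seed)
    x, lam = 0.0, 0.0
    samples = []
    for k in range(n_steps):
        if k >= burn_in:
            samples.append(x)
        # Primal step: Langevin update on the Lagrangian f(x) + lam * g(x).
        noise = rng.gauss(0.0, 1.0)
        x_new = x - step * (grad_f(x) + lam * grad_g(x)) + math.sqrt(2 * step) * noise
        # Dual step: projected gradient ascent using the current sample.
        lam = max(0.0, lam + step * g(x))
        x = x_new
    return samples, lam


# Assumed toy problem: target N(0, 1), constrained so that the mean is at least 1.
samples, lam = pd_lmc(
    grad_f=lambda x: x,        # f(x) = x^2 / 2
    g=lambda x: 1.0 - x,       # E[1 - x] <= 0  <=>  E[x] >= 1
    grad_g=lambda x: -1.0,
)
mean_x = sum(samples) / len(samples)
```

Without the constraint the sample mean would hover near 0. With the multiplier active, the running average of the samples is pushed up to roughly satisfy the constraint.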

artificial intelligence, bayesian inference, machine learning, (16 more...)

Country:

Europe (0.67)
North America > United States (0.27)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Banking & Finance (0.67)
Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Forecasting Human Trajectory from Scene History (Ziyan Wu, Terrence Chen)

Neural Information Processing Systems, May-23-2025, 03:32:55 GMT

Predicting the future trajectory of a person remains a challenging problem, due to the randomness and subjectivity of human movement. However, the movement patterns of humans in a constrained scenario typically conform to a limited number of regularities, because of scenario restrictions (e.g., floor plan, roads, and obstacles) and person-person or person-object interactivity. Thus, an individual person in this scenario should follow one of these regularities as well. In other words, a person's subsequent trajectory has likely been traveled by others. Based on this hypothesis, we propose to forecast a person's future trajectory by learning from the implicit scene regularities. We call these regularities, inherently derived from the past dynamics of the people and the environment in the scene, scene history.

artificial intelligence, machine learning, trajectory, (15 more...)

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

Q: Question-Asking LLMs and a Benchmark for Reliable Interactive Clinical Reasoning

Neural Information Processing Systems, May-23-2025, 03:19:43 GMT

Users typically engage with LLMs interactively, yet most existing benchmarks evaluate them in a static, single-turn format, posing reliability concerns in interactive scenarios. We identify a key obstacle towards reliability: LLMs are trained to answer any question, even with incomplete context or insufficient knowledge.

information, large language model, machine learning, (20 more...)

Country:

Asia > Singapore (0.14)
North America > United States (0.14)
Asia > China (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Education (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

A Generalised Jensen Inequality

Neural Information Processing Systems, May-23-2025, 02:43:50 GMT

In Section 4, we require a version of Jensen's inequality generalised to (possibly) infinite-dimensional vector spaces, because our random variable takes values in H R. Note that this square norm function is indeed convex, since, for any $t \in [0, 1]$ and any pair $f, g \in H$ …

Suppose $T$ is a real Hausdorff locally convex (possibly infinite-dimensional) linear topological space, and let $C$ be a closed convex subset of $T$. Suppose $(\Omega, \mathcal{F}, P)$ is a probability space, and $V: \Omega \to T$ a Pettis-integrable random variable such that $V(\Omega) \subseteq C$. Let $f: C \to [-\infty, \infty)$ be a convex, lower semi-continuous extended-real-valued function such that $\mathbb{E}$ …

We will actually apply generalised Jensen's inequality with conditional expectations, so we need the following theorem. Suppose $T$ is a real Hausdorff locally convex (possibly infinite-dimensional) linear topological space, and let $C$ be a closed convex subset of $T$. Suppose $(\Omega, \mathcal{F}, P)$ is a probability space, and $V: \Omega \to T$ a Pettis-integrable random variable such that $V(\Omega) \subseteq C$. Let $f: C \to [-\infty, \infty)$ be a convex, lower semi-continuous extended-real-valued function such that $\mathbb{E}$ …

Here, (*) and (**) use the properties of conditional expectation of vector-valued random variables given in [12, pp. 45-46, Properties 43 and 40 respectively]. The right-hand side is clearly $\mathcal{E}$-measurable, since we have a linear operator on an $\mathcal{E}$-measurable random variable. Now take the supremum of the right-hand side over $Q$. Then (5) tells us that $\mathbb{E}[f(V) \mid \mathcal{E}] \geq f(\mathbb{E}[V \mid \mathcal{E}])$, as required.
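The inequality can be checked numerically in a finite-dimensional stand-in for the spaces above. The sketch below takes f to be the squared Euclidean norm on vectors of length 5 and compares E[f(V)] with f(E[V]), both overall and within groups that play the role of a conditioning sigma-algebra. The data, the grouping and the helper names are assumptions made for this illustration only.

```python
import random


def sq_norm(v):
    """Squared Euclidean norm, a convex function of v."""
    return sum(c * c for c in v)


def mean_vec(vs):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(vs)
    return [sum(v[i] for v in vs) / n for i in range(len(vs[0]))]


rng = random.Random(0)
# Hypothetical samples of a vector-valued random variable V.
samples = [[rng.gauss(1.0, 2.0) for _ in range(5)] for _ in range(300)]
# Hypothetical conditioning: three groups generating a finite sigma-algebra.
labels = [i % 3 for i in range(300)]

# Unconditional Jensen: E[f(V)] >= f(E[V]).
lhs = sum(sq_norm(v) for v in samples) / len(samples)
rhs = sq_norm(mean_vec(samples))

# Conditional Jensen: the same comparison within each group.
cond = {}
for lab in sorted(set(labels)):
    group = [v for v, l in zip(samples, labels) if l == lab]
    cond[lab] = (sum(sq_norm(v) for v in group) / len(group), sq_norm(mean_vec(group)))
```

For the squared norm the gap `lhs - rhs` is the total variance of the coordinates, which is why the inequality is strict whenever V is not almost surely constant.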

artificial intelligence, assumption, machine learning, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos

Neural Information Processing Systems, May-23-2025, 02:43:26 GMT

Pretraining on noisy, internet-scale datasets has been heavily studied as a technique for training models with broad, general capabilities for text, images, and other modalities.

arxiv preprint arxiv, machine learning, reinforcement learning, (16 more...)

Country: North America (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Materials (0.93)
Leisure & Entertainment > Games > Computer Games (0.49)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Implicit Regularization in Deep Learning May Not Be Explainable by Norms

Neural Information Processing Systems, May-23-2025, 02:30:24 GMT

Mathematically characterizing the implicit regularization induced by gradient-based optimization is a longstanding pursuit in the theory of deep learning. A widespread hope is that a characterization based on minimization of norms may apply, and a standard test-bed for studying this prospect is matrix factorization (matrix completion via linear neural networks). It is an open question whether norms can explain the implicit regularization in matrix factorization. The current paper resolves this open question in the negative, by proving that there exist natural matrix factorization problems on which the implicit regularization drives all norms (and quasi-norms) towards infinity. Our results suggest that, rather than perceiving the implicit regularization via norms, a potentially more useful interpretation is minimization of rank. We demonstrate empirically that this interpretation extends to a certain class of non-linear neural networks, and hypothesize that it may be key to explaining generalization in deep learning.
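To make the matrix factorization setting concrete, the sketch below runs gradient descent on a depth-2 factorization W = W2 W1 for a 2×2 completion problem in which three entries are observed and one is left free. The specific observations, the scaled-identity initialization, the step size and all names are assumptions picked for illustration, not a reproduction of the paper's experiments.

```python
# Observed entries (row, col) -> value; entry (0, 0) is unobserved.
OBSERVED = {(0, 1): 1.0, (1, 0): 1.0, (1, 1): 0.0}


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def loss_and_grad(w):
    """Squared loss on the observed entries and its gradient w.r.t. the product w."""
    loss, grad = 0.0, [[0.0, 0.0], [0.0, 0.0]]
    for (i, j), target in OBSERVED.items():
        residual = w[i][j] - target
        loss += 0.5 * residual * residual
        grad[i][j] = residual
    return loss, grad


def run(init_scale=0.1, lr=0.02, n_steps=5000):
    """Gradient descent on both factors; returns (loss, unobserved entry) per step."""
    w1 = [[init_scale, 0.0], [0.0, init_scale]]
    w2 = [[init_scale, 0.0], [0.0, init_scale]]
    history = []
    for _ in range(n_steps):
        w = matmul(w2, w1)
        loss, g = loss_and_grad(w)
        history.append((loss, w[0][0]))
        g2 = matmul(g, transpose(w1))   # gradient w.r.t. w2
        g1 = matmul(transpose(w2), g)   # gradient w.r.t. w1
        w1 = [[w1[i][j] - lr * g1[i][j] for j in range(2)] for i in range(2)]
        w2 = [[w2[i][j] - lr * g2[i][j] for j in range(2)] for i in range(2)]
    return history


history = run()
initial_loss, initial_free = history[0]
final_loss, final_free = history[-1]
```

With this symmetric initialization the product stays positive semidefinite, so fitting the off-diagonal ones while pushing the bottom-right entry to zero forces the unobserved entry to keep growing. Tracking `history` shows the loss shrinking while that entry increases, which is the kind of behaviour the abstract describes.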

artificial intelligence, deep learning, machine learning, (17 more...)

Country: North America > Canada (0.14)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

f21e255f89e0f258accbe4e984eef486-AuthorFeedback.pdf

Neural Information Processing Systems, May-23-2025, 02:30:03 GMT

We thank reviewers for their time and effort! Miscellaneous () Thank you for the positive feedback! Miscellaneous () Thank you for the feedback and support! By this they refute the prospect of norms being implicitly minimized on every convex objective. To our knowledge, very few have endorsed this far-reaching prospect.

artificial intelligence, machine learning, matrix factorization, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Regret in Online Recommendation Systems

Neural Information Processing Systems, May-23-2025, 02:18:20 GMT

This paper proposes a theoretical analysis of recommendation systems in an online setting, where items are sequentially recommended to users over time. In each round, a user, randomly picked from a population of m users, requests a recommendation. The decision-maker observes the user and selects an item from a catalogue of n items. Importantly, an item cannot be recommended twice to the same user. The probabilities that a user likes each item are unknown. The performance of the recommendation algorithm is captured through its regret, considering as a reference an Oracle algorithm aware of these probabilities. We investigate various structural assumptions on these probabilities: we derive for each structure regret lower bounds, and devise algorithms achieving these limits. Interestingly, our analysis reveals the relative weights of the different components of regret: the component due to the constraint of not presenting the same item twice to the same user, that due to learning the chances users like items, and finally that arising when learning the underlying structure.
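The online setting described above, including the no-repetition constraint and the Oracle reference, can be simulated in a few lines. This is only a sketch under simplifying assumptions: all users share the same like-probabilities, the learner is a naive optimistic heuristic rather than one of the paper's algorithms, and the names `simulate`, `probs` and `seen` are hypothetical.

```python
import random


def simulate(n_users=5, n_items=120, horizon=100, seed=0):
    """Simulate online recommendation where an item is never shown twice to a user.

    Returns the cumulative expected regret against an Oracle that knows the
    like-probabilities, and the per-user sets of items shown by the learner.
    """
    if horizon > n_items:
        raise ValueError("horizon must not exceed n_items, or a user could run out of items")
    rng = random.Random(seed)
    # Assumption: like-probabilities depend on the item only, not on the user.
    probs = [rng.random() for _ in range(n_items)]
    oracle_order = sorted(range(n_items), key=lambda i: -probs[i])
    oracle_count = [0] * n_users          # how many items the Oracle has shown each user
    seen = [set() for _ in range(n_users)]
    likes = [0] * n_items
    pulls = [0] * n_items
    regret = 0.0
    for _ in range(horizon):
        user = rng.randrange(n_users)
        # Oracle: best item (by true probability) not yet shown to this user.
        best = oracle_order[oracle_count[user]]
        oracle_count[user] += 1
        # Learner: highest optimistic empirical mean among items new to this user.
        candidates = [i for i in range(n_items) if i not in seen[user]]
        item = max(candidates, key=lambda i: (likes[i] + 1) / (pulls[i] + 1))
        seen[user].add(item)
        liked = rng.random() < probs[item]
        pulls[item] += 1
        likes[item] += int(liked)
        regret += probs[best] - probs[item]
    return regret, seen


regret, seen = simulate()
```

Because the Oracle serves each user the items in decreasing order of probability, its cumulative expected reward on any user is at least that of any learner showing the same number of distinct items, so the regret computed this way is non-negative.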

algorithm, artificial intelligence, no-repetition constraint, (11 more...)

Country: