MediQ: Question-Asking LLMs and a Benchmark for Reliable Interactive Clinical Reasoning

Neural Information Processing Systems

Users typically engage with LLMs interactively, yet most existing benchmarks evaluate them in a static, single-turn format, which raises reliability concerns in interactive scenarios. We identify a key obstacle to reliability: LLMs are trained to answer any question, even with incomplete context or insufficient knowledge.
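A minimal sketch (hypothetical names, not the benchmark's actual API) of the interactive pattern such an evaluation targets: at each turn the model either commits to an answer or asks a follow-up question to gather the missing context.

```python
# Minimal sketch of an interactive clinical QA loop in which the model must
# decide whether it has enough context to answer or should ask first.
# All names here are hypothetical stand-ins, not the benchmark's API.
from typing import Callable

class Patient:
    """Holds the full case; reveals facts only when asked."""
    def __init__(self, initial_info: str, hidden_facts: dict[str, str]):
        self.initial_info = initial_info
        self.hidden_facts = hidden_facts

    def answer(self, question: str) -> str:
        # Toy lookup: return the first hidden fact whose key appears in the question.
        for key, fact in self.hidden_facts.items():
            if key in question.lower():
                return fact
        return "The patient does not know."

def interactive_diagnosis(llm: Callable[[str], str], patient: Patient,
                          max_turns: int = 5) -> str:
    context = patient.initial_info
    for _ in range(max_turns):
        decision = llm(
            f"Context so far:\n{context}\n\n"
            "If you can answer reliably, reply 'ANSWER: <diagnosis>'. "
            "Otherwise reply 'ASK: <one follow-up question>'."
        )
        if decision.startswith("ANSWER:"):
            return decision.removeprefix("ANSWER:").strip()
        question = decision.removeprefix("ASK:").strip()
        context += f"\nQ: {question}\nA: {patient.answer(question)}"
    # Out of turns: force a best-effort answer from the gathered context.
    return llm(f"Context:\n{context}\nGive your final diagnosis.")
```

Any function mapping a prompt string to a completion string can be passed as `llm`, so the loop itself stays model-agnostic.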


A Generalised Jensen Inequality

Neural Information Processing Systems

In Section 4, we require a version of Jensen's inequality generalised to (possibly) infinite-dimensional vector spaces, because our random variable takes values in $\mathcal{H}_R$. Note that the square norm function is indeed convex, since, for any $t \in [0, 1]$ and any pair $f, g \in \mathcal{H}$, $\|tf + (1-t)g\|^2 \le (t\|f\| + (1-t)\|g\|)^2 \le t\|f\|^2 + (1-t)\|g\|^2$.

Theorem (generalised Jensen inequality). Suppose $T$ is a real Hausdorff locally convex (possibly infinite-dimensional) linear topological space, and let $C$ be a closed convex subset of $T$. Suppose $(\Omega, \mathcal{F}, P)$ is a probability space, and $V \colon \Omega \to T$ a Pettis-integrable random variable such that $V(\Omega) \subseteq C$. Let $f \colon C \to (-\infty, \infty]$ be a convex, lower semi-continuous extended-real-valued function such that $E[f(V)]$ exists. Then $E[f(V)] \ge f(E[V])$.

We will actually apply generalised Jensen's inequality with conditional expectations, so we need the following theorem.

Theorem (conditional version). Suppose $T$ is a real Hausdorff locally convex (possibly infinite-dimensional) linear topological space, and let $C$ be a closed convex subset of $T$. Suppose $(\Omega, \mathcal{F}, P)$ is a probability space with sub-$\sigma$-algebra $\mathcal{E} \subseteq \mathcal{F}$, and $V \colon \Omega \to T$ a Pettis-integrable random variable such that $V(\Omega) \subseteq C$. Let $f \colon C \to (-\infty, \infty]$ be a convex, lower semi-continuous extended-real-valued function such that $E[f(V)]$ exists. Then $E[f(V) \mid \mathcal{E}] \ge f(E[V \mid \mathcal{E}])$ almost surely.

Here, (*) and (**) use the properties of conditional expectation of vector-valued random variables given in [12, pp. 45-46, Properties 43 and 40 respectively]. The right-hand side is clearly $\mathcal{E}$-measurable, since it is a linear operator applied to an $\mathcal{E}$-measurable random variable. Now take the supremum of the right-hand side over $Q$. Then (5) tells us that $E[f(V) \mid \mathcal{E}] \ge f(E[V \mid \mathcal{E}])$, as required.
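As a reading aid, here is the chain of steps the proof fragment refers to, under the standard representation of a proper lower semi-continuous convex $f$ as the supremum of the continuous affine functionals $Q \le f$; this is our reconstruction of the elided display (5), not the paper's verbatim derivation.

```latex
% For every continuous affine functional Q with Q <= f on C:
\[
  E[f(V) \mid \mathcal{E}]
    \;\overset{(*)}{\geq}\; E[Q(V) \mid \mathcal{E}]
    \;\overset{(**)}{=}\; Q\bigl(E[V \mid \mathcal{E}]\bigr)
    \qquad \text{a.s.} \tag{5}
\]
% Taking the supremum over all such Q on the right-hand side:
\[
  E[f(V) \mid \mathcal{E}]
    \;\geq\; \sup_{Q \le f} Q\bigl(E[V \mid \mathcal{E}]\bigr)
    \;=\; f\bigl(E[V \mid \mathcal{E}]\bigr).
\]
```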


Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos

Neural Information Processing Systems

Pretraining on noisy, internet-scale datasets has been heavily studied as a technique for training models with broad, general capabilities for text, images, and other modalities.


Implicit Regularization in Deep Learning May Not Be Explainable by Norms

Neural Information Processing Systems

Mathematically characterizing the implicit regularization induced by gradient-based optimization is a longstanding pursuit in the theory of deep learning. A widespread hope is that a characterization based on minimization of norms may apply, and a standard test-bed for studying this prospect is matrix factorization (matrix completion via linear neural networks). It is an open question whether norms can explain the implicit regularization in matrix factorization. The current paper resolves this open question in the negative, by proving that there exist natural matrix factorization problems on which the implicit regularization drives all norms (and quasi-norms) towards infinity. Our results suggest that, rather than perceiving the implicit regularization via norms, a potentially more useful interpretation is minimization of rank. We demonstrate empirically that this interpretation extends to a certain class of non-linear neural networks, and hypothesize that it may be key to explaining generalization in deep learning.
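A minimal sketch (ours, not the paper's code; entries and hyperparameters are illustrative) of the test-bed the abstract describes: matrix completion by gradient descent on a deep linear network W = W3 W2 W1, tracking the unobserved entry, the nuclear norm, and a rank proxy along training.

```python
# Sketch of the matrix-factorization test-bed: complete a 2x2 matrix with
# observed entries (1,2)=1, (2,1)=1, (2,2)=0 via a deep linear network
# trained by gradient descent from small initialization. As training fits
# the observed entries, one can watch norms grow while the product matrix
# approaches rank one (nuclear/spectral ratio -> 1).
import numpy as np

rng = np.random.default_rng(0)
depth, dim, lr, steps = 3, 2, 0.05, 20000
Ws = [rng.normal(scale=0.1, size=(dim, dim)) for _ in range(depth)]
observed = {(0, 1): 1.0, (1, 0): 1.0, (1, 1): 0.0}  # entry (0,0) is unobserved

def product(factors):
    P = np.eye(dim)
    for W in factors:
        P = W @ P
    return P  # = Ws[-1] @ ... @ Ws[0]

for step in range(steps):
    P = product(Ws)
    # Gradient of 0.5 * sum of squared errors on observed entries w.r.t. P.
    G = np.zeros_like(P)
    for (i, j), v in observed.items():
        G[i, j] = P[i, j] - v
    # Back-propagate through the matrix product; compute all grads first.
    grads = []
    for k in range(depth):
        left = product(Ws[k + 1:])   # factors applied after W_k
        right = product(Ws[:k])      # factors applied before W_k
        grads.append(left.T @ G @ right.T)
    for k in range(depth):
        Ws[k] -= lr * grads[k]
    if step % 5000 == 0:
        s = np.linalg.svd(P, compute_uv=False)
        print(f"step {step:5d}  |unobserved entry|={abs(P[0, 0]):8.3f}  "
              f"nuclear norm={s.sum():8.3f}  nuclear/spectral={s.sum()/s.max():.3f}")
```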


f21e255f89e0f258accbe4e984eef486-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewers for their time and effort! Thank you for the positive feedback! Thank you for the feedback and support! By this they refute the prospect of norms being implicitly minimized on every convex objective. To our knowledge, very few have endorsed this far-reaching prospect.


Regret in Online Recommendation Systems

Neural Information Processing Systems

This paper proposes a theoretical analysis of recommendation systems in an online setting, where items are sequentially recommended to users over time. In each round, a user, randomly picked from a population of m users, requests a recommendation. The decision-maker observes the user and selects an item from a catalogue of n items. Importantly, an item cannot be recommended twice to the same user. The probabilities that a user likes each item are unknown. The performance of the recommendation algorithm is captured through its regret, considering as a reference an Oracle algorithm aware of these probabilities. We investigate various structural assumptions on these probabilities: for each structure, we derive regret lower bounds and devise algorithms achieving these limits. Interestingly, our analysis reveals the relative weights of the different components of regret: the component due to the constraint of not presenting the same item twice to the same user, the component due to learning the probabilities that users like items, and finally the component arising from learning the underlying structure.
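A minimal sketch (ours; the greedy empirical-mean policy is a naive baseline, not the paper's regret-optimal algorithms) of the interaction protocol the analysis covers: a random user arrives, the decision-maker picks an item it has never shown to that user, and a binary "like" is observed.

```python
# Toy version of the online recommendation protocol: m users, n items,
# unknown like-probabilities p[u, i], and a no-repeat constraint per user.
import numpy as np

rng = np.random.default_rng(0)
m, n, T = 50, 20, 5000
p = rng.uniform(0.1, 0.9, size=(m, n))       # unknown to the learner

shown = [set() for _ in range(m)]            # items already shown to each user
clicks = np.zeros(n)                         # per-item like counts
pulls = np.zeros(n)                          # per-item recommendation counts
reward = oracle = 0.0

for t in range(T):
    u = rng.integers(m)                      # a random user requests a recommendation
    available = [i for i in range(n) if i not in shown[u]]
    if not available:                        # user has exhausted the catalogue
        continue
    # Optimistic empirical means (never-pulled items get estimate 1.0).
    means = [(clicks[i] / pulls[i]) if pulls[i] else 1.0 for i in available]
    i = available[int(np.argmax(means))]
    shown[u].add(i)
    like = rng.random() < p[u, i]
    clicks[i] += like
    pulls[i] += 1
    reward += like
    # Crude per-round proxy for the Oracle benchmark: the best item still
    # allowed for this user under the same no-repeat history.
    oracle += max(p[u, j] for j in available)

print(f"regret of the naive policy over {T} rounds: {oracle - reward:.1f}")
```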


List-Decodable Sparse Mean Estimation

Neural Information Processing Systems

In this paper, we consider the setting where the underlying distribution D is Gaussian with a k-sparse mean. Our main contribution is the first polynomial-time algorithm that enjoys sample complexity O(poly(k, log d)), i.e., poly-logarithmic in the dimension d. One of our core algorithmic ingredients is the use of low-degree sparse polynomials to filter outliers, which may find further applications.
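A toy illustration (ours; far simpler and weaker than the paper's algorithm) of the filtering flavor: score each sample by a degree-2 polynomial supported on the few most suspicious coordinates, then trim extreme scores. For readability the toy uses a majority of inliers; the paper's list-decodable setting has only an alpha < 1/2 inlier fraction and outputs a short list of candidate means.

```python
# Toy sparse-score outlier filter for estimating a k-sparse Gaussian mean.
import numpy as np

rng = np.random.default_rng(1)
d, k, n_in, n_out = 500, 5, 700, 300
mu = np.zeros(d); mu[:k] = 3.0                  # k-sparse true mean
inliers = rng.normal(size=(n_in, d)) + mu
outliers = rng.normal(size=(n_out, d))
outliers[:, rng.choice(d, size=k, replace=False)] += 8.0  # corrupted direction
X = np.vstack([inliers, outliers])

for _ in range(15):
    dev = X - np.median(X, axis=0)              # deviations from a robust-ish center
    suspect = np.argsort(dev.var(axis=0))[-k:]  # coordinates with inflated variance
    scores = (dev[:, suspect] ** 2).sum(axis=1) # degree-2 score, sparse support
    X = X[scores <= np.quantile(scores, 0.9)]   # drop the top 10% of scores

est = X.mean(axis=0)
print("estimation error on the true support:", np.linalg.norm(est[:k] - mu[:k]))
```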


Generative Forests

Neural Information Processing Systems

We focus on generative AI for a type of data that still represents one of the most prevalent forms of data: tabular data. Our paper introduces two key contributions: a powerful new class of forest-based models fit for such tasks, and a simple training algorithm with strong convergence guarantees in a boosting model that parallels that of the original weak / strong supervised learning setting. This algorithm can be implemented with a few tweaks to the most popular scheme for decision tree induction.
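A minimal sketch (ours, hypothetical) of the kind of object involved: a single generative tree over tabular data that partitions rows with axis-aligned splits and samples new rows from the empirical distribution in a leaf. A generative forest would combine many such trees, and the paper's boosting-style training is substantially richer than the random splits used here.

```python
# Tiny generative tree for tabular data: fit axis-aligned splits, then sample
# new rows by descending to a leaf (weighted by its data mass) and drawing
# feature values from the empirical data in that leaf.
import numpy as np

rng = np.random.default_rng(0)

class Leaf:
    def __init__(self, rows): self.rows = rows
    def size(self): return len(self.rows)
    def sample(self):
        # Draw each column independently from the leaf's empirical values.
        return np.array([rng.choice(self.rows[:, j])
                         for j in range(self.rows.shape[1])])

class Node:
    def __init__(self, j, t, left, right):
        self.j, self.t, self.left, self.right = j, t, left, right
    def size(self): return self.left.size() + self.right.size()
    def sample(self):
        # Descend randomly, proportionally to how much data each side holds.
        go_left = rng.random() < self.left.size() / self.size()
        return (self.left if go_left else self.right).sample()

def fit(rows, depth=3, min_rows=20):
    if depth == 0 or len(rows) < min_rows:
        return Leaf(rows)
    j = rng.integers(rows.shape[1])             # random split feature
    t = np.median(rows[:, j])                   # median threshold
    mask = rows[:, j] <= t
    if mask.all() or not mask.any():
        return Leaf(rows)
    return Node(j, t, fit(rows[mask], depth - 1, min_rows),
                fit(rows[~mask], depth - 1, min_rows))

data = rng.normal(size=(1000, 4)) @ rng.normal(size=(4, 4))  # correlated toy table
tree = fit(data)
print("synthetic row:", tree.sample())
```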


Polynomial time guarantees for the Burer-Monteiro method

Neural Information Processing Systems

The Burer-Monteiro method is one of the most widely used techniques for solving large-scale semidefinite programs (SDPs).
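A minimal sketch (ours; illustrative, not the paper's analyzed algorithm or parameters) of the method on the max-cut SDP relaxation: replace the PSD variable X = Y Yᵀ with a thin factor Y whose rows have unit norm, and run projected gradient descent on the nonconvex factorized problem.

```python
# Burer-Monteiro sketch on the max-cut SDP relaxation:
#   max (1/4) * sum_{ij} W_ij (1 - X_ij)   s.t.  X PSD,  X_ii = 1.
# Factorize X = Y @ Y.T with unit-norm rows of Y (rank p << n), so the PSD
# and diagonal constraints hold by construction, and descend on trace(W X).
import numpy as np

rng = np.random.default_rng(0)
n, p, lr, steps = 40, 8, 0.005, 3000

# Random weighted graph (symmetric, zero diagonal).
W = rng.random((n, n)); W = np.triu(W, 1); W = W + W.T

Y = rng.normal(size=(n, p))
Y /= np.linalg.norm(Y, axis=1, keepdims=True)      # feasible: diag(Y Y^T) = 1

for _ in range(steps):
    grad = 2 * W @ Y                               # gradient of trace(W Y Y^T)
    Y -= lr * grad                                 # minimize trace(W X)
    Y /= np.linalg.norm(Y, axis=1, keepdims=True)  # project rows to the sphere

X = Y @ Y.T
sdp_value = 0.25 * (W.sum() - np.trace(W @ X))     # max-cut SDP objective at X
print(f"factorized SDP objective: {sdp_value:.3f}")
```

The storage drops from the n-by-n matrix X to the n-by-p factor Y, which is what makes the approach attractive at large scale.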