
A Graphical Terminology

Neural Information Processing Systems

We refer the reader to Peters et al. (2017) for more detailed graphical terminology. We base our proof mostly on Kirsch (2019). The first statement follows directly from the first theorem in Haviland (1936). Without loss of generality, we reorder the variables according to a reversed topological ordering. This follows directly from Lemma 1. Lemma 4: recall that condition 2) in Causal de Finetti states that ∀i, ∀n ∈ ℕ: X. The first equality holds by well-definedness. The fourth equality follows from well-definedness.


Curiosity-Driven LLM-as-a-judge for Personalized Creative Judgment

Kumar, Vanya Bannihatti, Goyal, Divyanshu, Eppa, Akhil, Bhandari, Neel

arXiv.org Artificial Intelligence

We build on the Torrance Test of Creative Thinking (TTCW) benchmark introduced in Chakrabarty et al. (2024). Rigorous, standardized evaluation has repeatedly catalyzed progress in machine learning, from ImageNet (Russakovsky et al., 2015) to GLUE (Wang et al., 2019), driving leaps in computer vision and natural language processing, respectively. The same effect is evident in objective math reasoning, with benchmarks like GSM8K (Cobbe et al., 2021) together with RL-trained reasoning models such as OpenAI's o1 (OpenAI et al., 2024) and DeepSeek-R1 (DeepSeek-AI). Large Language Models (LLMs) used as judges, however, prefer their own generations, making them unreliable, as shown in Chakrabarty et al. (2024) and in Table 12 and Table 2. Specifically, when the model is "surprised" by an expert's explanation, this signals a mismatch between the LLM's prior belief and the expert's assessment. The intuition behind predicting the annotator is that the model can learn which annotator caused the belief shift, allowing it to calibrate the curiosity signal for each annotator individually, thereby improving personalization. In our experiments, we establish a baseline using an SFT model that predicts annotators' binary labels; more details about the results can be found in Fig. 4. Figure 1 gives an overview of the architecture during training and Figure 2 during inference for Curiosity-Driven LLM-as-a-judge, comparing (a) a baseline without explanations and (b) a baseline using explanations on the TTCW dataset (Chakrabarty et al., 2024), which is based on the Torrance Test of Creative Thinking (Torrance, 1966) but adapted for LLMs. All the distinct dimensions in the TTCW dataset are listed in Appendix A.1.
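The per-annotator calibration idea can be sketched as follows. Using negative log-likelihood as the surprise signal and z-scoring it within each annotator are our assumptions for illustration, not necessarily the paper's exact formulation:

```python
import math
from collections import defaultdict

def per_annotator_curiosity(records):
    """records: list of (annotator_id, surprise) pairs, where surprise is
    e.g. the negative log-likelihood the judge model assigns to that
    annotator's explanation. Returns each record's surprise z-scored
    within its annotator, so the curiosity signal is calibrated per
    annotator rather than pooled across all of them."""
    by_ann = defaultdict(list)
    for ann, s in records:
        by_ann[ann].append(s)
    stats = {}
    for ann, vals in by_ann.items():
        mu = sum(vals) / len(vals)
        sd = math.sqrt(sum((v - mu) ** 2 for v in vals) / len(vals))
        stats[ann] = (mu, sd if sd > 0 else 1.0)  # guard constant annotators
    return [(ann, (s - stats[ann][0]) / stats[ann][1]) for ann, s in records]
```

A surprise of +1 then means "one standard deviation more surprising than usual *for this annotator*", which is what lets the signal personalize.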


Intra-Cluster Mixup: An Effective Data Augmentation Technique for Complementary-Label Learning

Mai, Tan-Ha, Lin, Hsuan-Tien

arXiv.org Artificial Intelligence

In this paper, we investigate the challenges of complementary-label learning (CLL), a specialized form of weakly-supervised learning (WSL) where models are trained with labels indicating classes to which instances do not belong, rather than standard ordinary labels. This alternative supervision is appealing because collecting complementary labels is generally cheaper and less labor-intensive. Although most existing research in CLL emphasizes the development of novel loss functions, the potential of data augmentation in this domain remains largely underexplored. In this work, we uncover that the widely-used Mixup data augmentation technique is ineffective when directly applied to CLL. Through in-depth analysis, we identify that the complementary-label noise generated by Mixup negatively impacts the performance of CLL models. We then propose an improved technique called Intra-Cluster Mixup (ICM), which only synthesizes augmented data from nearby examples, to mitigate the noise effect. ICM carries the benefits of encouraging complementary label sharing of nearby examples, and leads to substantial performance improvements across synthetic and real-world labeled datasets. In particular, our wide spectrum of experimental results on both balanced and imbalanced CLL settings justifies the potential of ICM in allying with state-of-the-art CLL algorithms, achieving significant accuracy increases of 30% and 10% on MNIST and CIFAR datasets, respectively.
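A minimal sketch of the Intra-Cluster Mixup idea, assuming cluster assignments (e.g. from k-means on features) are given and that a mixed pair keeps the complementary label of its dominant component; both choices are illustrative and may differ from the paper's exact procedure:

```python
import numpy as np

def intra_cluster_mixup(X, CL, clusters, alpha=1.0, rng=None):
    """Mix each example only with another example from the SAME cluster,
    so the mixed pair is likely to share complementary labels, reducing
    the complementary-label noise that plain Mixup introduces.

    X: (n, d) features; CL: (n,) complementary labels;
    clusters: (n,) cluster assignments."""
    rng = np.random.default_rng(rng)
    n = len(X)
    lam = rng.beta(alpha, alpha, size=n)           # Mixup coefficients
    partners = np.empty(n, dtype=int)
    for i in range(n):
        same = np.flatnonzero(clusters == clusters[i])
        partners[i] = rng.choice(same)             # partner from own cluster
    X_mix = lam[:, None] * X + (1 - lam[:, None]) * X[partners]
    # Simple label rule for the sketch: keep the dominant component's label.
    CL_mix = np.where(lam >= 0.5, CL, CL[partners])
    return X_mix, CL_mix
```

Because partners come from the same cluster, every mixed point stays inside its cluster's convex hull, which is the geometric intuition behind the noise reduction.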


Empirical Validation of the Independent Chip Model

Kim, Juho

arXiv.org Artificial Intelligence

The independent chip model (ICM) forms a cornerstone of all modern poker tournament strategy. However, despite its prominence, the ICM's performance in the real world has not been sufficiently scrutinized, especially at a large scale. In this paper, we introduce our new dataset of poker tournaments, consisting of results of over ten thousand events. Then, using this dataset, we perform two experiments as part of a large-scale empirical validation of the ICM. First, we verify that the ICM performs more accurately than a baseline we propose. Second, we obtain empirical evidence of the ICM underestimating the performances of players with larger stacks while overestimating those who are short-stacked. Our contributions may be useful to future researchers developing new algorithms for estimating a player's value in poker tournaments.
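For readers unfamiliar with the model, the ICM's payout calculation can be sketched as follows. This brute-force enumeration is the standard textbook formulation, not code from the paper:

```python
from itertools import permutations

def icm_equities(stacks, payouts):
    """Independent Chip Model: a player finishes 1st with probability
    proportional to their stack; subsequent places are assigned by the
    same rule among the remaining players. Each player's equity is the
    expected payout over all finish orderings. O(n! * n) reference
    implementation, fine for the short-handed tables it is usually
    applied to."""
    n = len(stacks)
    total = float(sum(stacks))
    eq = [0.0] * n
    for order in permutations(range(n)):
        p, rem = 1.0, total
        for i in order:
            p *= stacks[i] / rem   # prob. i takes the next place
            rem -= stacks[i]
        for place, i in enumerate(order):
            if place < len(payouts):
                eq[i] += p * payouts[place]
    return eq
```

With two equal stacks and a winner-take-all payout, each player's ICM equity is exactly half the prize, matching the chip-proportional intuition.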



Interpretable Learning Dynamics in Unsupervised Reinforcement Learning

Pandey, Shashwat

arXiv.org Machine Learning

We present an interpretability framework for unsupervised reinforcement learning (URL) agents, aimed at understanding how intrinsic motivation shapes attention, behavior, and representation learning. We analyze five agents (DQN, RND, ICM, PPO, and a Transformer-RND variant) trained on procedurally generated environments, using Grad-CAM, Layer-wise Relevance Propagation (LRP), exploration metrics, and latent space clustering. To capture how agents perceive and adapt over time, we introduce two metrics: attention diversity, which measures the spatial breadth of focus, and attention change rate, which quantifies temporal shifts in attention. Our findings show that curiosity-driven agents display broader, more dynamic attention and exploratory behavior than their extrinsically motivated counterparts. Among them, Transformer-RND combines wide attention, high exploration coverage, and compact, structured latent representations. Our results highlight the influence of architectural inductive biases and training signals on internal agent dynamics. Beyond reward-centric evaluation, the proposed framework offers diagnostic tools to probe perception and abstraction in RL agents, enabling more interpretable and generalizable behavior.
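The two proposed metrics could be implemented along the following lines; the normalized-entropy reading of attention diversity and the mean-absolute-difference reading of attention change rate are our assumptions, not necessarily the paper's exact definitions:

```python
import numpy as np

def attention_diversity(attn):
    """Spatial breadth of focus: normalized entropy of a 2-D attention
    map (e.g. a Grad-CAM heatmap). 1.0 means uniform attention over all
    locations; 0.0 means all mass on a single location."""
    p = attn.ravel().astype(float)
    p = p / p.sum()
    logs = np.log(p, where=p > 0, out=np.zeros_like(p))
    return -(p * logs).sum() / np.log(p.size)

def attention_change_rate(prev, curr):
    """Temporal shift in attention: mean absolute difference between
    consecutive attention maps, each normalized to sum to 1."""
    a = prev.astype(float) / prev.sum()
    b = curr.astype(float) / curr.sum()
    return np.abs(a - b).mean()
```

Under this reading, a curiosity-driven agent with broad, shifting attention scores high on both metrics, while an agent fixated on one region scores near zero on both.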


On the Role of Priors in Bayesian Causal Learning

Geiger, Bernhard C., Kern, Roman

arXiv.org Machine Learning

In this work, we investigate causal learning of independent causal mechanisms from a Bayesian perspective. Confirming previous claims from the literature, we show in a didactically accessible manner that unlabeled data (i.e., cause realizations) do not improve the estimation of the parameters defining the mechanism. Furthermore, we observe the importance of choosing an appropriate prior for the cause and mechanism parameters, respectively. Specifically, we show that a factorized prior results in a factorized posterior, which resonates with Janzing and Schölkopf's definition of independent causal mechanisms via the Kolmogorov complexity of the involved distributions and with the concept of parameter independence of Heckerman et al. Impact Statement: Learning the effect from a given cause is an important problem in many engineering disciplines, specifically in the field of surrogate modeling, which aims to reduce the computational cost of numerical simulations. Causal learning, however, cannot make use of unlabeled data, i.e., cause realizations, if the mechanism that produces the effect is independent from the cause. In this work, we recover this well-known fact from a Bayesian perspective.
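The factorization argument can be made explicit. With cause parameters $\theta$, mechanism parameters $\psi$, and labeled data $\mathcal{D} = \{(x_n, y_n)\}_{n=1}^{N}$, a factorized prior $p(\theta, \psi) = p(\theta)\,p(\psi)$ gives (in our notation, which may differ from the paper's):

```latex
p(\theta, \psi \mid \mathcal{D})
  \;\propto\; p(\theta)\,p(\psi)\prod_{n=1}^{N} p(x_n \mid \theta)\,p(y_n \mid x_n, \psi)
  \;=\; \Big[p(\theta)\prod_{n=1}^{N} p(x_n \mid \theta)\Big]
        \Big[p(\psi)\prod_{n=1}^{N} p(y_n \mid x_n, \psi)\Big]
```

so the posterior factorizes as $p(\theta \mid \mathcal{D})\,p(\psi \mid \mathcal{D})$, and additional unlabeled cause realizations $x_n$ enter only the first bracket, leaving the mechanism posterior $p(\psi \mid \mathcal{D})$ unchanged.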


Inverse Flow and Consistency Models

Zhang, Yuchen, Zhou, Jian

arXiv.org Artificial Intelligence

Inverse generation problems, such as denoising without ground truth observations, are a critical challenge in many scientific inquiries and real-world applications. While recent advances in generative models like diffusion models, conditional flow matching, and consistency models have achieved impressive results by casting generation as a denoising problem, they cannot be directly used for inverse generation without access to clean data. Here we introduce Inverse Flow (IF), a novel framework that enables using these generative models for inverse generation problems, including denoising without ground truth. Inverse Flow can be flexibly applied to nearly any continuous noise distribution and allows complex dependencies. We propose two algorithms for learning Inverse Flows: Inverse Flow Matching (IFM) and Inverse Consistency Model (ICM). Notably, to derive the computationally efficient, simulation-free inverse consistency model objective, we generalize consistency training to arbitrary forward diffusion processes or conditional flows, which has applications beyond denoising. We demonstrate the effectiveness of IF on synthetic and real datasets, outperforming prior approaches while enabling noise distributions that previous methods cannot support. Finally, we showcase applications of our techniques to fluorescence microscopy and single-cell genomics data, highlighting IF's utility in scientific problems. Overall, this work expands the applications of powerful generative models to inverse generation problems.
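The consistency-training idea that the paper generalizes can be illustrated with the standard (non-inverse) objective it builds on: a network is trained so that two points on the same forward trajectory map to the same output. The additive-Gaussian forward process below is an assumption for this sketch, not the paper's inverse consistency model objective itself:

```python
import numpy as np

def consistency_training_loss(f, x, t, t_next, rng):
    """One consistency-training step for a simple additive-Gaussian
    forward process x_t = x + t * z: the model f(x, t) should give the
    same answer at adjacent noise levels t and t_next along the SAME
    trajectory (same noise draw z)."""
    z = rng.standard_normal(x.shape)
    x_t = x + t * z            # point at the higher noise level
    x_s = x + t_next * z       # point at the lower noise level, same z
    target = f(x_s, t_next)    # teacher output; a stop-gradient in practice
    return float(((f(x_t, t) - target) ** 2).mean())
```

A perfectly self-consistent model incurs zero loss on any trajectory, which is what makes the objective simulation-free: no integration of the full forward process is needed.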