AITopics | Education

Education

Improved Bayesian Regret Bounds for Thompson Sampling in Reinforcement Learning

Neural Information Processing Systems | Apr-27-2026, 01:40:14 GMT

In this paper, we prove the first Bayesian regret bounds for Thompson Sampling in reinforcement learning in a multitude of settings. We simplify the learning problem using a discrete set of surrogate environments, and present a refined analysis of the information ratio using posterior consistency. This leads to an upper bound of order $\widetilde{O}(H\sqrt{d_{l_1} T})$ in the time inhomogeneous reinforcement learning problem, where $H$ is the episode length and $d_{l_1}$ is the Kolmogorov $l_1$-dimension of the space of environments. We then find concrete bounds on $d_{l_1}$ in a variety of settings, such as tabular, linear and finite mixtures, and discuss how our results are either the first of their kind or improve the state-of-the-art.
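For readers unfamiliar with Thompson Sampling, the core loop can be sketched in a much simpler setting than the one the paper analyzes. The example below is an illustration only: it assumes a Bernoulli multi-armed bandit (roughly the single-step case) with independent Beta(1, 1) priors, not the episodic reinforcement learning problem or the surrogate-environment analysis from the abstract. The function name `thompson_bernoulli` and its parameters are hypothetical.

```python
import random


def thompson_bernoulli(true_means, rounds, seed=0):
    """Thompson Sampling on a Bernoulli bandit (illustrative assumption,
    not the paper's RL setting). Returns how often each arm was pulled."""
    rng = random.Random(seed)
    k = len(true_means)
    # Beta(1, 1) prior per arm: one pseudo-success and one pseudo-failure.
    successes = [1] * k
    failures = [1] * k
    pulls = [0] * k
    for _ in range(rounds):
        # Sample one plausible mean per arm from its current posterior.
        draws = [rng.betavariate(successes[i], failures[i]) for i in range(k)]
        # Act greedily with respect to the sampled means.
        arm = max(range(k), key=lambda i: draws[i])
        # Observe a Bernoulli reward from the (hidden) true mean.
        reward = 1 if rng.random() < true_means[arm] else 0
        # Conjugate posterior update for the pulled arm only.
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    print(thompson_bernoulli([0.2, 0.5, 0.8], rounds=2000))
```

As the posterior concentrates, the sampled means for weaker arms rarely exceed that of the best arm, so exploration fades on its own. Regret bounds such as the one quoted above quantify how quickly that happens.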

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Neural Information Processing Systems

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.34)

Industry: Education > Focused Education > Special Education (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Large Language Models Are Bad Dice Players: LLMs Struggle to Generate Random Numbers from Statistical Distributions

Zhao, Minda, Du, Yilun, Wang, Mengyu

arXiv.org Machine Learning | Apr-27-2026

As large language models (LLMs) transition from chat interfaces to integral components of stochastic pipelines and systems approaching general intelligence, the ability to faithfully sample from specified probability distributions has become a functional requirement rather than a theoretical curiosity. We present the first large-scale, statistically powered audit of native probabilistic sampling in frontier LLMs, benchmarking 11 models across 15 distributions. To disentangle failure modes, we employ a dual-protocol design: Batch Generation, where a model produces $N{=}1000$ samples within one response, and Independent Requests, comprising $N{=}1000$ stateless calls. We observe a sharp protocol asymmetry: batch generation achieves only modest statistical validity, with a 7% median pass rate, while independent requests collapse almost entirely, with 10 of 11 models passing none of the distributions. Beyond this asymmetry, we reveal that sampling fidelity degrades monotonically with distributional complexity and aggravates as the sampling horizon $N$ increases. Finally, we demonstrate how the propagation of these failures into downstream real-world application tasks introduces systematic biases: models fail to enforce uniform answer-position constraints in Multiple Choice Question generation and systematically violate demographic targets in attribute-constrained text-to-image prompt synthesis. These findings indicate that current LLMs lack a functional internal sampler, necessitating external tools for applications requiring statistical guarantees.
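The kind of check behind a "pass rate" can be sketched with a goodness-of-fit test. The example below is a minimal illustration, not the authors' evaluation protocol: it assumes a Pearson chi-square test against a uniform target over four answer positions, at significance level 0.05 (critical value 7.815 for 3 degrees of freedom). The function names and the sample data are hypothetical.

```python
from collections import Counter

# Chi-square critical value for 3 degrees of freedom at alpha = 0.05.
# Only valid for exactly four categories (an assumption of this sketch).
CHI2_CRIT_DF3_ALPHA05 = 7.815


def chi_square_uniform(samples, categories):
    """Pearson chi-square statistic of `samples` against a uniform target."""
    counts = Counter(samples)
    expected = len(samples) / len(categories)
    return sum((counts.get(c, 0) - expected) ** 2 / expected for c in categories)


def passes_uniformity(samples, categories=("A", "B", "C", "D")):
    """True if uniformity is not rejected at the 5% level (four categories)."""
    if len(categories) != 4:
        raise ValueError("critical value in this sketch assumes four categories")
    return chi_square_uniform(samples, categories) <= CHI2_CRIT_DF3_ALPHA05


if __name__ == "__main__":
    balanced = list("ABCD") * 250  # 250 samples per answer position
    skewed = ["A"] * 400 + ["B"] * 300 + ["C"] * 200 + ["D"] * 100
    print(passes_uniformity(balanced))  # statistic is 0.0
    print(passes_uniformity(skewed))    # statistic is 200.0
```

In a pipeline that needs statistical guarantees, a check like this would typically sit next to an external sampler (for example Python's `random` module) rather than relying on the model to draw the samples itself.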

large language model, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

2601.05414

Country: North America > United States (0.15)

Genre: Research Report > New Finding (0.68)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Adversarial Teacher-Student Representation Learning for Domain Generalization

Neural Information Processing Systems | Apr-26-2026, 17:58:14 GMT

Domain generalization (DG) aims to transfer the learning task from a single or multiple source domains to unseen target domains. To extract and leverage the information which exhibits sufficient generalization ability, we propose a simple yet effective approach of Adversarial Teacher-Student Representation Learning, with the goal of deriving the domain generalizable representations via generating and exploring out-of-source data distributions. Our proposed framework advances Teacher-Student learning in an adversarial learning manner, which alternates between knowledge-distillation based representation learning and novel-domain data augmentation.

artificial intelligence, generalization, machine learning, (15 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Industry:

Education (0.55)
Information Technology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

3a14ae9951e8153a8fc814b5f506b5b7-Paper-Conference.pdf

Neural Information Processing Systems | Apr-26-2026, 17:01:43 GMT

artificial intelligence, knowledge concept, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Industry: Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

College graduate who paid 6-figure fortune for his degree can't find a job

FOX News | Apr-26-2026, 14:54:02 GMT

artificial intelligence, lifestyle real estate tech science, social media, (8 more...)

FOX News

Country:

North America > United States (1.00)
Europe > United Kingdom > England (0.14)

Industry:

Media > News (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.49)
Education > Educational Setting > Higher Education (0.30)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (0.95)

Get Office 2024 & training courses for just 114

PCWorld | Apr-26-2026, 08:00:00 GMT

Get Microsoft Office 2024 Home & Business plus an 8-course training bundle for hundreds off. Many people use Microsoft Office every day, but not always to its full potential. This bundle pairs Microsoft Office 2024 Home & Business with an 8-course training program designed to close that gap. That includes topics like Excel formulas, workflow efficiency, and even how to integrate tools like ChatGPT into your daily work.

artificial intelligence, gaming laptop mobile monitor pc, machine learning, (10 more...)

PCWorld

Genre: Instructional Material > Course Syllabus & Notes (0.41)

Industry:

Education (0.93)
Information Technology > Security & Privacy (0.82)
Leisure & Entertainment > Games > Computer Games (0.62)

Technology:

Information Technology > Hardware (0.97)
Information Technology > Artificial Intelligence > Machine Learning (0.36)

Curriculum Design for Teaching via Demonstrations: Theory and Applications

Neural Information Processing Systems | Apr-26-2026, 00:24:52 GMT

We consider the problem of teaching via demonstrations in sequential decision-making settings. In particular, we study how to design a personalized curriculum over demonstrations to speed up the learner's convergence. We provide a unified curriculum strategy for two popular learner models: Maximum Causal Entropy Inverse Reinforcement Learning (MaxEnt-IRL) and Cross-Entropy Behavioral Cloning (CrossEnt-BC). Our unified strategy induces a ranking over demonstrations based on a notion of difficulty scores computed w.r.t. the teacher's optimal policy and the learner's current policy. Compared to the state of the art, our strategy doesn't require access to the learner's internal dynamics and still enjoys similar convergence guarantees under mild technical conditions. Furthermore, we adapt our curriculum strategy to the setting where no teacher agent is present using task-specific difficulty scores. Experiments on a synthetic car driving environment and navigation-based environments demonstrate the effectiveness of our curriculum strategy.

learner, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country: Europe > Germany (0.28)

Industry: