AITopics | Large Language Model

Collaborating Authors

Large Language Model

News Overviews Instructional Materials AI-Alerts Classics

People training new AI models admit they just get chatbots to do it

New ScientistJun-22-2026, 10:57:59 GMT

The next generation of AI models are meant to be trained by people paid to have conversations with them, but several of these workers have admitted to that they simply get chatbots to do it instead. People who are paid to train new AI models by supplying them with high-quality conversation and tests are cheating and using chatbots like ChatGPT to do the job instead, multiple whistleblowers have told . The seemingly widespread practice risks undermining the future of AI, as it could lead to the "collapse" of more advanced models. Most AI models operating today were trained on text and data scraped from the internet . But as models have scaled up, requiring yet more training data, AI firms have begun using workers who carry out conversations and tests with AI, in the hope that the resulting high-quality data can improve the power and usefulness of future large language models (LLMs). These workers are normally employed by third parties, rather than AI companies directly, and are often working without full-time contracts and for low pay.

artificial intelligence, large language model, natural language, (18 more...)

New Scientist

Industry:

Marketing (0.43)
Law (0.30)
Health & Medicine > Therapeutic Area (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Add feedback

AutoToM Scaling Model based Mental Inference via Automated Agent Modeling

Neural Information Processing SystemsJun-22-2026, 10:48:41 GMT

Theory of Mind (ToM), the ability to understand people's minds based on their behavior, is key to developing socially intelligent agents. Current approaches to ToM reasoning either rely on prompting Large Language Models (LLMs), which are prone to systematic errors, or use handcrafted, rigid agent models for model-based inference, which are more robust but fail to generalize across domains. In this work, we introduce AutoToM, an automated agent modeling method for scalable, robust, and interpretable mental inference. Given a ToM problem, AutoToM first proposes an initial agent model and then performs automated Bayesian inverse planning based on this model, leveraging an LLM backend.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
(2 more...)

Add feedback

SILENCER: From Discovery to Mitigation of Self-Bias in LLM-as-Benchmark-Generator

Neural Information Processing SystemsJun-22-2026, 10:47:46 GMT

LLM-as-Benchmark-Generator methods have been widely studied as a supplement to human annotators for scalable evaluation, while the potential biases within this paradigm remain underexplored. In this work, we systematically define and validate the phenomenon of inflated performance in models evaluated on their self-generated benchmarks, referred to as self-bias, and attribute it to sub-biases arising from question domain, language style, and wrong labels. On this basis, we propose SILENCER, a general framework that leverages the heterogeneity between multiple generators at both the sample and benchmark levels to neutralize bias and generate high-quality, self-bias-silenced benchmark. Experimental results across various settings demonstrate that SILENCER can suppress self-bias to near zero, significantly improve evaluation effectiveness of the generated benchmark (with an average improvement from 0.655 to 0.833 in Pearson correlation with high-quality human-annotated benchmark), while also exhibiting strong generalizability.

benchmark, large language model, machine learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Europe > Austria (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Learning to Watermark: ASelective Watermarking Framework for Large Language Models via Multi-Objective Optimization

Neural Information Processing SystemsJun-22-2026, 10:46:50 GMT

The rapid development of LLMs has raised concerns about their potential misuse, leading to various watermarking schemes that typically offer high detectability. However, existing watermarking techniques often face trade-off between watermark detectability and generated text quality. In this paper, we introduce Learning to Watermark (LTW), a novel selective watermarking framework that leverages multi-objective optimization to effectively balance these competing goals. LTW features a lightweight network that adaptively decides when to apply the watermark by analyzing sentence embeddings, token entropy, and current watermarking ratio. Training of the network involves two specifically constructed loss functions that guide the model toward Pareto-optimal solutions, thereby harmonizing watermark detectability and text quality. By integrating LTW with two baseline watermarking methods, our experimental evaluations demonstrate that LTW significantly enhances text quality without compromising detectability. Our selective watermarking approach offers a new perspective for designing watermarks for LLMs and a way to preserve high text quality for watermarks.

large language model, machine learning, natural language, (22 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.67)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Add feedback

Scaling Laws for Optimal Data Mixtures Mustafa Shukor Louis Bethune Dan Busbridge David Grangier Sorbonne University Apple Apple Apple Enrico Fini Alaaeldin El-Nouby Pierre Ablin Apple

Neural Information Processing SystemsJun-22-2026, 10:42:13 GMT

Large foundation models are typically trained on data from multiple domains, with the data mixture-the proportion of each domain used-playing a critical role in model performance. The standard approach to selecting this mixture relies on trial and error, which becomes impractical for large-scale pretraining. We propose a systematic method to determine the optimal data mixture for any target domain using scaling laws. Our approach accurately predicts the loss of a model of size N trained with D tokens and a specific domain weight vector h.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
(2 more...)

Add feedback

Preference Learning with Lie Detectors can Induce Honesty or Evasion

Neural Information Processing SystemsJun-22-2026, 10:41:17 GMT

As AI systems become more capable, deceptive behaviors can undermine evaluation and mislead users at deployment. Recent work has shown that lie detectors can accurately classify deceptive behavior, but they are not typically used in the training pipeline due to concerns around contamination and objective hacking. We examine these concerns by incorporating a lie detector into the labelling step of LLM post-training and evaluating whether the learned policy is genuinely more honest, or instead learns to fool the lie detector while remaining deceptive. Using DolusChat, a novel 65k-example dataset with paired truthful/deceptive responses, we identify three key factors that determine the honesty of learned policies: amount of exploration during preference learning, lie detector accuracy, and KL regularization strength. We find that preference learning with lie detectors and GRPO can lead to policies which evade lie detectors, with deception rates of over 85%. However, if the lie detector true positive rate (TPR) or KL regularization is sufficiently high, GRPO learns honest policies. In contrast, off-policy algorithms (DPO) consistently lead to deception rates under 25% for realistic TPRs. Our results illustrate a more complex picture than previously assumed: depending on the context, lie-detector-enhanced training can be a powerful tool for scalable oversight, or a counterproductive method encouraging undetectable misalignment.

detector, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Instructional Material (1.00)

Industry:

Transportation (1.00)
Telecommunications (1.00)
Law (1.00)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

UMA: AFamily of Universal Models for Atoms

Neural Information Processing SystemsJun-22-2026, 10:32:44 GMT

The ability to quickly and accurately compute properties from atomic simulations is critical for advancing a large number of applications in chemistry and materials science including drug discovery, energy storage, and semiconductor manufacturing. To address this need, we present a family of Universal Models for Atoms (UMA), designed to push the frontier of speed, accuracy, and generalization. UMA models are trained on half a billion unique 3D atomic structures (the largest training runs to date) by compiling data across multiple chemical domains, e.g.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)
Overview (0.67)

Industry:

Information Technology (0.87)
Health & Medicine > Pharmaceuticals & Biotechnology (0.66)
Energy > Energy Storage (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM

Neural Information Processing SystemsJun-22-2026, 10:27:57 GMT

Where does'A man is walking in a Locate the moment where "A man For the query'A man recommends narrow alley, with street noise and Determine the precise timestamp in wearing a white mask is speaking visiting local areas in Tokyo, filming the conversations in the background.

caption, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Oceania > Australia (0.28)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.24)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

bbe04d329d531e8814f1199098bd8fb6-Paper-Conference.pdf

Neural Information Processing SystemsJun-22-2026, 10:27:01 GMT

Visual generative and understanding models typically rely on distinct tokenizers to w process ork.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Add feedback

Tool Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task

Neural Information Processing SystemsJun-22-2026, 10:26:32 GMT

Video Question Answering (VideoQA) task serves as a critical playground for evaluating whether foundation models can effectively perceive, understand, and reason about dynamic real-world scenarios.

large language model, machine learning, question answering, (23 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)
(2 more...)

Add feedback