Would I Lie To You? Inference Time Alignment of Language Models using Direct Preference Heads
Pre-trained Language Models (LMs) exhibit strong zero-shot and in-context learning capabilities; however, their behaviors are often difficult to control. By utilizing Reinforcement Learning from Human Feedback (RLHF), it is possible to fine-tune unsupervised LMs to follow instructions and produce outputs that reflect human preferences. Despite its benefits, RLHF has been shown to potentially harm a language model's reasoning capabilities and introduce artifacts such as hallucinations, where the model may fabricate facts. To address this issue, we introduce Direct Preference Heads (DPH), a fine-tuning framework that enables LMs to learn human preference signals through an auxiliary reward head without directly affecting the output distribution of the language modeling head. We perform a theoretical analysis of our objective function and find strong ties to Conservative Direct Preference Optimization (cDPO). Finally, we evaluate our models on GLUE, RACE, and the GPT4All evaluation suite and demonstrate that our method produces models which achieve higher scores than those fine-tuned with Supervised Fine-Tuning (SFT) or Direct Preference Optimization (DPO) alone.
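Because the reward head scores sequences without altering the language modeling head, one natural inference-time use is best-of-N reranking. The Python sketch below illustrates that pattern under stated assumptions: a Hugging Face-style causal LM with a pad token configured, a hypothetical RewardHead module, and left padding for batch scoring. It is a sketch in the spirit of DPH, not the authors' implementation.

```python
import torch
import torch.nn as nn

class RewardHead(nn.Module):
    """Scalar reward read from the final hidden state of the last token."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, 1)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden); score the last position.
        return self.proj(hidden_states[:, -1, :]).squeeze(-1)

@torch.no_grad()
def best_of_n(model, tokenizer, reward_head, prompt: str, n: int = 8) -> str:
    """Sample n candidates from the unchanged LM head, then rerank by reward."""
    # Left padding keeps the last position equal to the last real token
    # (assumes tokenizer.pad_token is set).
    tokenizer.padding_side = "left"
    inputs = tokenizer(prompt, return_tensors="pt")
    seqs = model.generate(**inputs, do_sample=True,
                          num_return_sequences=n, max_new_tokens=256)
    texts = tokenizer.batch_decode(seqs, skip_special_tokens=True)
    # Re-encode the candidates and score their final hidden states.
    enc = tokenizer(texts, return_tensors="pt", padding=True)
    hidden = model(**enc, output_hidden_states=True).hidden_states[-1]
    rewards = reward_head(hidden)           # shape: (n,)
    return texts[int(rewards.argmax())]     # highest-reward candidate wins
```

Reranking leaves the sampling distribution untouched, which matches the stated goal of aligning outputs without modifying the language modeling head.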
Author Feedback
We are grateful to our three reviewers for their time and insightful comments. "a runtime comparison is needed to understand tradeoffs btw the different choices of projection/decoding sets" This is a great remark, which we will address. Moreover, quadratic time is still better than the CRF loss, which is intractable. Our provided implementation backs up our claim. We will also report precise runtime figures in the final version.
Supplementary Material of MMLU-Pro Benchmark
Yubo Wang, Xueguang Ma
To illustrate the dataset documentation and intended uses, we have completed the following fill-in of the Datasheets for Datasets:
2.1 Motivation
For what purpose was the dataset created? To provide a more challenging and robust benchmark for multi-task language understanding evaluation.
Who created the dataset (e.g., which team, research group) and on behalf of which entity (e.g., company, institution, organization)? TIGER Lab, University of Waterloo.
Who funded the creation of the dataset? University of Waterloo.
2.2 Composition
What do the instances that comprise the dataset represent (e.g., documents, photos, people, countries)?
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark
In the age of large-scale language models, benchmarks like the Massive Multitask Language Understanding (MMLU) benchmark have been pivotal in pushing the boundaries of what AI can achieve in language comprehension and reasoning across diverse domains. However, as models continue to improve, their performance on these benchmarks has begun to plateau, making it increasingly difficult to discern differences in model capabilities. This paper introduces MMLU-Pro, an enhanced dataset designed to extend the mostly knowledge-driven MMLU benchmark by integrating more challenging, reasoning-focused questions and expanding the choice set from four to ten options. Additionally, MMLU-Pro eliminates the trivial and noisy questions in MMLU. Our experimental results show that MMLU-Pro not only raises the challenge, causing a significant drop in accuracy of 16% to 33% compared to MMLU, but also demonstrates greater stability under varying prompts. With 24 different prompt styles tested, the sensitivity of model scores to prompt variations decreased from 4-5% on MMLU to just 2% on MMLU-Pro. Additionally, we found that models utilizing Chain of Thought (CoT) reasoning achieved better performance on MMLU-Pro than with direct answering, in stark contrast to the findings on the original MMLU, indicating that MMLU-Pro includes more complex reasoning questions. Our assessments confirm that MMLU-Pro is a more discriminative benchmark for tracking progress in the field.
Figure 1: Comparison between MMLU and MMLU-Pro: (Left) performance gap; (Center) accuracy distributions under 24 prompts, with taller, thinner profiles indicating more stability and shorter, wider profiles indicating greater fluctuation; (Right) performance using CoT vs. direct answering.
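To make the ten-option format concrete, here is a minimal Python sketch of scoring MMLU-Pro-style items with CoT prompting. The record format, prompt wording, and answer-extraction regex are illustrative assumptions, not the benchmark's official harness.

```python
import re
import string

OPTION_LABELS = string.ascii_uppercase[:10]  # "A".."J": ten options, not four

def format_question(question: str, options: list[str]) -> str:
    """Render one item with lettered options and a CoT cue."""
    lines = [question] + [f"{lab}. {opt}" for lab, opt in zip(OPTION_LABELS, options)]
    lines.append("Answer with the letter of the correct option. Let's think step by step.")
    return "\n".join(lines)

def extract_choice(completion: str) -> str | None:
    # For CoT outputs, take the last "answer is (X)"-style mention.
    matches = re.findall(r"answer is \(?([A-J])\)?", completion, flags=re.IGNORECASE)
    return matches[-1].upper() if matches else None

def accuracy(records: list[dict], generate) -> float:
    """records: [{'question': str, 'options': [str]*10, 'answer': 'A'..'J'}];
    generate: any callable mapping a prompt string to a model completion."""
    correct = 0
    for r in records:
        pred = extract_choice(generate(format_question(r["question"], r["options"])))
        correct += pred == r["answer"]
    return correct / len(records)
```

With ten options, random-guessing accuracy drops from 25% to 10%, which widens the usable score range between weak and strong models.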
Hierarchical Decision Making by Generating and Following Natural Language Instructions
Hengyuan Hu, Denis Yarats, Qucheng Gong, Yuandong Tian, Mike Lewis
We explore using natural language instructions as an expressive and compositional representation of complex actions for hierarchical decision making. Rather than directly selecting micro-actions, our agent first generates a plan in natural language, which is then executed by a separate model. We introduce a challenging real-time strategy game environment in which the actions of a large number of units must be coordinated across long time scales. We gather a dataset of 76 thousand pairs of instructions and executions from human play, and train instructor and executor models. Experiments show that models that generate intermediate plans in natural language significantly outperform models that directly imitate human actions. The compositional structure of language is conducive to learning generalizable action representations.
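The two-level decomposition described above can be summarized in a short Python sketch: an instructor maps the game state to a language plan, and an executor grounds that plan in micro-actions. The class and method names and the stub outputs are hypothetical; they stand in for the paper's trained models.

```python
from dataclasses import dataclass

@dataclass
class GameState:
    observation: dict  # e.g. unit positions, resources, visible enemies

class Instructor:
    """High-level policy: maps a game state to a natural-language plan."""
    def plan(self, state: GameState) -> str:
        # In the real system this would be produced by a model trained on
        # human instruction-execution pairs; a fixed string stands in here.
        return "build two archers and attack the nearest enemy base"

class Executor:
    """Low-level policy: maps (state, instruction) to micro-actions."""
    def act(self, state: GameState, instruction: str) -> list[str]:
        # Likewise a stub for a learned model conditioned on the instruction.
        return ["train archer", "train archer", "move units to enemy base"]

def step(instructor: Instructor, executor: Executor, state: GameState) -> list[str]:
    instruction = instructor.plan(state)     # plan in language first...
    return executor.act(state, instruction)  # ...then ground it in micro-actions
```

Keeping the interface between the two models in natural language is what gives the action representation its compositional, human-interpretable structure.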
Grokking of Implicit Reasoning in Transformers: A Mechanistic Journey to the Edge of Generalization
We study whether transformers can learn to implicitly reason over parametric knowledge, a skill that even the most capable language models struggle with. Focusing on two representative reasoning types, composition and comparison, we consistently find that transformers can learn implicit reasoning, but only through grokking, i.e., extended training far beyond overfitting. The levels of generalization also vary across reasoning types: when faced with out-of-distribution examples, transformers fail to systematically generalize for composition but succeed for comparison. We delve into the model's internals throughout training, conducting analytical experiments that reveal: 1) the mechanism behind grokking, such as the formation of the generalizing circuit and its relation to the relative efficiency of generalizing and memorizing circuits, and 2) the connection between systematicity and the configuration of the generalizing circuit. Our findings guide data and training setup to better induce implicit reasoning and suggest potential improvements to the transformer architecture, such as encouraging cross-layer knowledge sharing. Furthermore, we demonstrate that for a challenging reasoning task with a large search space, GPT-4-Turbo and Gemini-1.5-Pro based on non-parametric memory fail badly regardless of prompting styles or retrieval augmentation, while a fully grokked transformer can achieve near-perfect accuracy, showcasing the power of parametric memory for complex reasoning.
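A minimal sketch of the training regime the abstract refers to: keep optimizing long after training accuracy saturates and monitor held-out accuracy for the delayed jump that characterizes grokking. The model, data loaders, and hyperparameters below are placeholders, not the paper's setup.

```python
import torch

@torch.no_grad()
def evaluate(model, loader) -> float:
    """Fraction of correct predictions on a held-out split."""
    correct = total = 0
    for x, y in loader:
        correct += (model(x).argmax(-1) == y).sum().item()
        total += y.numel()
    return correct / total

def train_until_grokked(model, opt, train_loader, val_loader,
                        max_steps: int = 1_000_000, log_every: int = 1_000):
    loss_fn = torch.nn.CrossEntropyLoss()
    step = 0
    while step < max_steps:  # keep going long after train accuracy saturates
        for x, y in train_loader:
            loss = loss_fn(model(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
            step += 1
            if step % log_every == 0:
                # Held-out accuracy may sit near chance for a long stretch
                # after train accuracy hits ~100%, then jump abruptly.
                print(step, evaluate(model, val_loader))
            if step >= max_steps:
                break
```

The operative detail is the budget: "extended training far beyond overfitting" means the loop runs orders of magnitude past the point where early stopping would normally halt it.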
SyncTweedies: A General Generative Framework Based on Synchronized Diffusions
We introduce a general diffusion synchronization framework for generating diverse visual content, including ambiguous images, panorama images, 3D mesh textures, and 3D Gaussian splat textures, using a pretrained image diffusion model. We first present an analysis of various scenarios for synchronizing multiple diffusion processes through a canonical space. Based on the analysis, we introduce a synchronized diffusion method, SyncTweedies, which averages the outputs of Tweedie's formula while conducting denoising in multiple instance spaces. Compared to previous work that achieves synchronization through finetuning, SyncTweedies is a zero-shot method that does not require any finetuning, preserving the rich prior of diffusion models trained on Internet-scale image datasets without overfitting to specific domains. We verify that SyncTweedies offers the broadest applicability to diverse applications and superior performance compared to the previous state-of-the-art for each application. Our project page is at https://synctweedies.github.io.
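The core operation, averaging Tweedie's-formula outputs through a canonical space while denoising proceeds in multiple instance spaces, can be sketched as follows. This is a hedged illustration assuming DDPM-style noise schedules and a deterministic DDIM update; the eps_model interface and the to_canonical/from_canonical projection operators are placeholders, not the released code.

```python
import torch

def tweedie_x0(x_t, eps, alpha_bar_t):
    # Tweedie's formula for DDPM-style models:
    #   x0_hat = (x_t - sqrt(1 - alpha_bar_t) * eps_theta(x_t, t)) / sqrt(alpha_bar_t)
    return (x_t - (1 - alpha_bar_t).sqrt() * eps) / alpha_bar_t.sqrt()

def synced_denoise(eps_model, x_ts, to_canonical, from_canonical,
                   alpha_bars, timesteps):
    """x_ts: per-instance noisy latents; timesteps: descending schedule.
    to_canonical/from_canonical map instance space j <-> canonical space."""
    for i, t in enumerate(timesteps):
        a_t = alpha_bars[t]
        a_prev = alpha_bars[timesteps[i + 1]] if i + 1 < len(timesteps) \
            else torch.tensor(1.0)
        # 1) Per-instance Tweedie estimates of the clean sample.
        x0s = [tweedie_x0(x, eps_model(x, t), a_t) for x in x_ts]
        # 2) Synchronize: aggregate in canonical space, broadcast back.
        canon = torch.stack([to_canonical(j, x0)
                             for j, x0 in enumerate(x0s)]).mean(0)
        x0s = [from_canonical(j, canon) for j in range(len(x_ts))]
        # 3) Deterministic DDIM step per instance using the synced x0.
        x_ts = [a_prev.sqrt() * x0
                + (1 - a_prev).sqrt() * ((x - a_t.sqrt() * x0) / (1 - a_t).sqrt())
                for x, x0 in zip(x_ts, x0s)]
    return x_ts
```

Averaging the predicted clean samples, rather than the noisy latents or the noise estimates, is what distinguishes this synchronization point from the alternative scenarios the analysis compares.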