AITopics | ait

Collaborating Authors

ait

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

In-Context Fully Decentralized Cooperative Multi-Agent Reinforcement Learning

Neural Information Processing SystemsJun-22-2026, 22:27:55 GMT

In this paper, we consider fully decentralized cooperative multi-agent reinforcement learning, where each agent has access only to the states, its local actions, and the shared rewards. The absence of information about other agents' actions typically leads to the non-stationarity problem during per-agent value function updates, and the relative overgeneralization issue during value function estimation. However, existing works fail to address both issues simultaneously, as they lack the capability to model the agents' joint policy in a fully decentralized setting. To overcome this limitation, we propose a simple yet effective method named Return-Aware Context (RAC). RAC formalizes the dynamically changing task, as locally perceived by each agent, as a contextual Markov Decision Process (MDP), and addresses both nonstationarity and relative overgeneralization through return-aware context modeling. Specifically, the contextual MDP attributes the non-stationary local dynamics of each agent to switches between contexts, each corresponding to a distinct joint policy.

agent, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.84)

Add feedback

Optimal Dimension-Free Sampling for Regularized Classification

Alishahi, Meysam, Munteanu, Alexander, Omlor, Simon, Phillips, Jeff M.

arXiv.org Machine LearningMay-25-2026

We prove optimal sampling bounds achieving $(1\pm\varepsilon)$-relative error for a broad class of Lipschitz continuous classification loss functions under various regularization terms. This includes important functions such as logistic and sigmoid loss, hinge loss, and ReLU loss, as prominent and popular representative examples. In particular, we prove $k^2/\varepsilon^2$ upper and lower bounds for $\|\cdot\|_2/k$ regularization, and $k/\varepsilon^2$ upper and lower bounds for $\|\cdot\|_1/k$ regularization. For $\|\cdot\|_2^2/k$ regularization, the sampling complexity depends mainly on a bounded derivative property: if $|g'(x)|\leq g(x)$, and $g(0)>0$, and $g$ is monotonic or convex, then it admits linear in $k$ sampling complexity; otherwise the general bound is $k^2/\varepsilon^2$. However, if $g(0)=0$, our results indicate that no dimension-free bounds are possible, and even sublinear bounds are ruled out. All upper bounds are complemented by matching lower bounds up to polylogarithmic terms. Moreover, our work relies conceptually and algorithmically on simple uniform or (squared) norm sampling and hereby improves over recent cubic $k^3/\varepsilon^2$ sensitivity sampling bounds of (Alishahi and Phillips, ICML'24). This is achieved by refined arguments involving higher moment bounds and empirical process analyses to avoid overcounting that appears in the de-facto standard VC-dimension and sensitivity framework.

artificial intelligence, machine learning, theorem 3, (19 more...)

arXiv.org Machine Learning

2605.23726

Country: North America > United States (0.92)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.66)

Add feedback

Gone Fishing: Neural Active Learning with Fisher Embeddings

Neural Information Processing SystemsFeb-8-2026, 12:57:37 GMT

There is an increasing need for effective active learning algorithms that are compatible with deep neural networks.

artificial intelligence, bayesian inference, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Add feedback

Survey-to-Behavior: Downstream Alignment of Human Values in LLMs via Survey Questions

Nie, Shangrui, Mai, Florian, Kaczér, David, Welch, Charles, Zhao, Zhixue, Flek, Lucie

arXiv.org Artificial IntelligenceAug-18-2025

Large language models implicitly encode preferences over human values, yet steering them often requires large training data. In this work, we investigate a simple approach: Can we reliably modify a model's value system in downstream behavior by training it to answer value survey questions accordingly? We first construct value profiles of several open-source LLMs by asking them to rate a series of value-related descriptions spanning 20 distinct human values, which we use as a baseline for subsequent experiments. We then investigate whether the value system of a model can be governed by fine-tuning on the value surveys. We evaluate the effect of finetuning on the model's behavior in two ways; first, we assess how answers change on in-domain, held-out survey questions. Second, we evaluate whether the model's behavior changes in out-of-domain settings (situational scenarios). To this end, we construct a contextualized moral judgment dataset based on Reddit posts and evaluate changes in the model's behavior in text-based adventure games. We demonstrate that our simple approach can not only change the model's answers to in-domain survey questions, but also produces substantial shifts (value alignment) in implicit downstream task behavior.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2508.11414

Country:

Asia (0.93)
Europe > Austria > Vienna (0.15)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.46)

Industry:

Media > News (0.48)
Health & Medicine (0.46)
Leisure & Entertainment > Games > Computer Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

It's Not That Simple. An Analysis of Simple Test-Time Scaling

Wu, Guojun

arXiv.org Artificial IntelligenceJul-22-2025

Prior work proposed simple test-time scaling, a method for replicating this scaling behavior with models distilled from o1-like models by manually controlling test-time compute: either scaling down by enforcing a maximum length or scaling up by iteratively appending "Wait" when the model is about to terminate its generation. This paper presents an analysis of simple test-time scaling and finds that the scaling behavior is largely attributed to scaling down by enforcing a maximum length. In contrast, fine-tuning on long CoT data distilled from o1-like models has no significant impact on scaling behavior, and scaling up by appending "Wait" leads to inconsistencies, as the model may oscillate between solutions. A key distinction exists between scaling down by enforcing a maximum length and scaling up test-time compute in o1-like models, such as DeepSeek-R1\@. These models are typically allowed to utilize as much compute as needed, with the only constraint being the model's maximum supported length. By learning to naturally scale up test-time compute during reinforcement learning, o1-like models surpass their peak performance when scaling up. In contrast, simple test-time scaling progressively imposes a lower upper limit on model performance as it scales down. While replicating the test-time scaling behavior of o1 models can be straightforward by scaling down, it is crucial to recognize that the goal of scaling test-time compute is to unlock higher performance -- beyond what the model could originally achieve -- rather than merely reproducing the appearance of scaling behavior.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2507.14419

Country: Asia (0.28)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency

Wang, Chenlong, Feng, Yuanning, Chen, Dongping, Chu, Zhaoyang, Krishna, Ranjay, Zhou, Tianyi

arXiv.org Artificial IntelligenceJun-19-2025

Recent advances in large reasoning models have enabled complex, step-by-step reasoning but often introduce significant overthinking, resulting in verbose and redundant outputs that hinder efficiency. In this study, we examine whether explicit self-reflection, signaled by tokens such as "Wait" and "Hmm", is necessary for advanced reasoning. We propose NoWait, a simple yet effective approach that disables explicit self-reflection by suppressing these tokens during inference. Extensive experiments on ten benchmarks across textual, visual, and video reasoning tasks show that NoWait reduces chain-of-thought trajectory length by up to 27%-51% in five R1-style model series, without compromising model utility. NoWait thus offers a plug-and-play solution for efficient and utility-preserving multimodal reasoning.

arxiv preprint arxiv, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2506.08343

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Add feedback

Reasoning Models Can Be Effective Without Thinking

Ma, Wenjie, He, Jingxuan, Snell, Charlie, Griggs, Tyler, Min, Sewon, Zaharia, Matei

arXiv.org Artificial IntelligenceApr-15-2025

Recent LLMs have significantly improved reasoning capabilities, primarily by including an explicit, lengthy Thinking process as part of generation. In this paper, we question whether this explicit thinking is necessary. Using the state-of-the-art DeepSeek-R1-Distill-Qwen, we find that bypassing the thinking process via simple prompting, denoted as NoThinking, can be surprisingly effective. When controlling for the number of tokens, NoThinking outperforms Thinking across a diverse set of seven challenging reasoning datasets--including mathematical problem solving, formal theorem proving, and coding--especially in low-budget settings, e.g., 51.3 vs. 28.9 on ACM 23 with 700 tokens. Notably, the performance of NoThinking becomes more competitive with pass@k as k increases. Building on this observation, we demonstrate that a parallel scaling approach that uses NoThinking to generate N outputs independently and aggregates them is highly effective. For aggregation, we use task-specific verifiers when available, or we apply simple best-of-N strategies such as confidence-based selection. Our method outperforms a range of baselines with similar latency using Thinking, and is comparable to Thinking with significantly longer latency (up to 9x). Together, our research encourages a reconsideration of the necessity of lengthy thinking processes, while also establishing a competitive reference for achieving strong reasoning performance in low-budget settings or at low latency using parallel scaling.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2504.09858

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Add feedback

An IoT Framework for Building Energy Optimization Using Machine Learning-based MPC

Morteza, Aryan, Nazari, Hosein K., Pahlevani, Peyman

arXiv.org Artificial IntelligenceAug-23-2024

This study proposes a machine learning-based Model Predictive Control (MPC) approach for controlling Air Handling Unit (AHU) systems by employing an Internet of Things (IoT) framework. The proposed framework utilizes an Artificial Neural Network (ANN) to provide dynamic-linear thermal model parameters considering building information and disturbances in real time, thereby facilitating the practical MPC of the AHU system. The proposed framework allows users to establish new setpoints for a closed-loop control system, enabling customization of the thermal environment to meet individual needs with minimal use of the AHU. The experimental results demonstrate the cost benefits of the proposed machine-learning-based MPC-IoT framework, achieving a 57.59\% reduction in electricity consumption compared with a clock-based manual controller while maintaining a high level of user satisfaction. The proposed framework offers remarkable flexibility and effectiveness, even in legacy systems with limited building information, making it a pragmatic and valuable solution for enhancing the energy efficiency and user comfort in pre-existing structures.

controller, model predictive control, predictive control, (16 more...)

arXiv.org Artificial Intelligence

2408.13294

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Internet of Things (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Communications > Networks > Sensor Networks (0.69)

Add feedback

BAIT: Benchmarking (Embedding) Architectures for Interactive Theorem-Proving

Lamont, Sean, Norrish, Michael, Dezfouli, Amir, Walder, Christian, Montague, Paul

arXiv.org Artificial IntelligenceMar-5-2024

Artificial Intelligence for Theorem Proving has given rise to a plethora of benchmarks and methodologies, particularly in Interactive Theorem Proving (ITP). Research in the area is fragmented, with a diverse set of approaches being spread across several ITP systems. This presents a significant challenge to the comparison of methods, which are often complex and difficult to replicate. Addressing this, we present BAIT, a framework for fair and streamlined comparison of learning approaches in ITP. We demonstrate BAIT's capabilities with an in-depth comparison, across several ITP benchmarks, of state-of-the-art architectures applied to the problem of formula embedding. We find that Structure Aware Transformers perform particularly well, improving on techniques associated with the original problem sets. BAIT also allows us to assess the end-to-end proving performance of systems built on interactive environments. This unified perspective reveals a novel end-to-end system that improves on prior work. We also provide a qualitative analysis, illustrating that improved performance is associated with more semantically-aware embeddings. By streamlining the implementation and comparison of Machine Learning algorithms in the ITP context, we anticipate BAIT will be a springboard for future research.

architecture, benchmark, expression, (14 more...)

arXiv.org Artificial Intelligence

2403.03401

Country: