AITopics | curr

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

CODA: Coordination via On-Policy Diffusion for Multi-Agent Offline Reinforcement Learning

Hedman, Marcel, Tessera, Kale-ab Abebe, Formanek, Juan Claude, Sims, Anya, Zamboni, Riccardo, McInroe, Trevor, Torr, John, Fosong, Elliot

arXiv.org Machine Learning | Apr-28-2026

Offline multi-agent reinforcement learning (MARL) enables policy learning from fixed datasets, but is prone to coordination failure: agents trained on static, off-policy data converge to suboptimal joint behaviours because they cannot co-adapt as their policies change. We introduce CODA (Coordination via On-Policy Diffusion for Multi-Agent Reinforcement Learning), a diffusion-based multi-agent trajectory generator for data augmentation that samples conditioned on the current joint policy, producing synthetic experience which reflects the evolving behaviours of the agents, thereby providing a mechanism for co-adaptation. We find that previous diffusion-based augmentation approaches are insufficient for fostering multi-agent coordination because they produce static augmented datasets that do not evolve as the current joint policy changes during training; CODA resolves this by more closely simulating on-policy learning and is a meaningful step toward coordinated behaviours in the offline setting. CODA is algorithm-agnostic and can be layered onto both model-free and model-based offline reinforcement learning pipelines as an augmentation module. Empirically, CODA not only resolves canonical coordination pathologies in continuous polynomial games but also delivers strong results on the more complex MaMuJoCo continuous-control benchmarks.

machine learning, reinforcement learning, trajectory, (15 more...)

arXiv.org Machine Learning

2604.23308

Country:

Europe (0.67)
North America > United States (0.46)

Genre: Research Report (0.50)

Industry:

Education (0.46)
Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

A Performer architecture details

Neural Information Processing Systems | Apr-25-2026, 11:00:33 GMT

We define the Performer architecture formally as follows. [...] $V \in \mathbb{R}^{d_{model} \times d}$ are trainable parameters (separate for each instance of MultiHead-Att, FFN), "+" is broadcast rowwise when biases are added, and LN is layer normalization [2], which is applied rowwise and depends on additional trainable parameters. GeLU denotes the Gaussian Error Linear Unit [16], which is applied elementwise. [...] Similarly, $U^{(n)}$ does not affect $L^{(1)}, \dots, L^{(n)}$, so [...] This way, the 3D tensor $R \in \mathbb{R}^{L \times d \times M}$ is not stored in memory explicitly, resulting in $O(L)$ time and $O(L(d + M) + dM)$ memory complexity. In order to have the same memory consumption during back-propagation, [18] propose the following routine.
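
The memory claim in the excerpt above (the L x d x M tensor is never stored explicitly) can be illustrated with a small sketch. This is not the paper's routine and does not cover the back-propagation procedure of [18]. It assumes that non-negative feature maps for queries and keys are already given as plain Python lists, and the function names `causal_linear_attention` and `causal_attention_quadratic` are hypothetical. The first function keeps a single M x d running sum while it walks the sequence; the second forms every pairwise score and serves only as a reference.

```python
def causal_linear_attention(q_feat, k_feat, values):
    """Streaming causal attention over non-negative feature maps (illustrative sketch).

    q_feat, k_feat: L rows of M features each; values: L rows of d numbers each.
    Only one M x d running sum is held at a time, so the full L x d x M
    prefix-sum tensor is never materialised.
    """
    num_feats = len(k_feat[0])
    dim = len(values[0])
    running = [[0.0] * dim for _ in range(num_feats)]  # sum over j <= i of k_j v_j^T
    norm = [0.0] * num_feats                           # sum over j <= i of k_j
    outputs = []
    for q, k, v in zip(q_feat, k_feat, values):
        # Fold the current position into the running sums.
        for m in range(num_feats):
            norm[m] += k[m]
            for c in range(dim):
                running[m][c] += k[m] * v[c]
        # Read out the normalised attention result for this position.
        denom = sum(q[m] * norm[m] for m in range(num_feats))
        outputs.append([
            sum(q[m] * running[m][c] for m in range(num_feats)) / denom
            for c in range(dim)
        ])
    return outputs


def causal_attention_quadratic(q_feat, k_feat, values):
    """Reference version that computes every pairwise score explicitly."""
    dim = len(values[0])
    outputs = []
    for i, q in enumerate(q_feat):
        scores = [sum(a * b for a, b in zip(q, k_feat[j])) for j in range(i + 1)]
        total = sum(scores)
        outputs.append([
            sum(s * values[j][c] for j, s in enumerate(scores)) / total
            for c in range(dim)
        ])
    return outputs
```

Under these assumptions both functions return the same L x d output up to floating-point rounding. The streaming version holds O(dM) state beyond its inputs and outputs, which is consistent with the O(L(d + M) + dM) memory figure quoted above.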

artificial intelligence, curr, machine learning, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.36)

b922ede9c9eb9eabec1c1fecbdecb45d-Supplemental.pdf

Neural Information Processing Systems | Feb-10-2026, 21:32:45 GMT

curr, dataset, node, (16 more...)

Neural Information Processing Systems

Country:

Asia > China > Heilongjiang Province > Harbin (0.08)
Asia > China > Beijing > Beijing (0.06)
Asia > China > Sichuan Province > Chengdu (0.05)

Industry: Transportation (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

b922ede9c9eb9eabec1c1fecbdecb45d-Paper.pdf

Neural Information Processing Systems | Feb-10-2026, 21:32:41 GMT

Predicting the most likely route from a source location to a destination is a core functionality in mapping services. Although the problem has been studied in the literature, two key limitations remain to be addressed.

artificial intelligence, curr, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Asia > China > Heilongjiang Province > Harbin (0.05)
Asia > China > Beijing > Beijing (0.05)
Asia > China > Sichuan Province > Chengdu (0.05)

Industry: Transportation > Ground > Road (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.47)

90e73f3cf1a6c84c723a2e8b7fb2b2c1-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-10-2026, 19:11:16 GMT

orig, participant, performative power, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.69)

A Stochastic Path-Integrated Differential EstimatoR Expectation Maximization Algorithm

Neural Information Processing Systems | Feb-10-2026, 06:04:10 GMT

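It can be shown (see [18] and the supplementary material) that $K^{sEM-VR}_{Opt}(n, \epsilon) = K^{FIEM}_{Opt}(n, \epsilon) = n^{2/3} O(\epsilon^{-1})$ and $K^{sEM-VR}_{CE}(n, \epsilon) = K^{FIEM}_{CE}(n, \epsilon) = n + n^{2/3} O(\epsilon^{-1})$. SPIDER estimator employed, Proof Sketch. While we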

artificial intelligence, bst, machine learning, (15 more...)

Neural Information Processing Systems

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.05)
Asia > China > Hong Kong (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

35309226eb45ec366ca86a4329a2b7c3-Supplemental.pdf

Neural Information Processing Systems | Feb-8-2026, 05:06:33 GMT

algorithm, complexity, penn treebank, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.50)

Vectorized Online POMDP Planning

Hoerger, Marcus, Sudrajat, Muhammad, Kurniawati, Hanna

arXiv.org Artificial Intelligence | Dec-1-2025

Planning under partial observability is an essential capability of autonomous robots. The Partially Observable Markov Decision Process (POMDP) provides a powerful framework for planning under partial observability problems, capturing the stochastic effects of actions and the limited information available through noisy observations. POMDP solving could benefit tremendously from massive parallelization on today's hardware, but parallelizing POMDP solvers has been challenging. Most of these solvers rely on interleaving numerical optimization over actions with the estimation of their values, which creates dependencies and synchronization bottlenecks between parallel processes that can offset the benefits of parallelization. In this paper, we propose Vectorized Online POMDP Planner (VOPP), a novel parallel online solver that leverages a recent POMDP formulation which analytically solves part of the optimization component, leaving numerical computations to consist of only estimation of expectations. VOPP represents all data structures related to planning as a collection of tensors, and implements all planning steps as fully vectorized computations over this representation. The result is a massively parallel solver with no dependencies or synchronization bottlenecks between parallel processes. Experimental results indicate that VOPP is at least 20× more efficient in computing near-optimal solutions compared to an existing state-of-the-art parallel online solver.

artificial intelligence, machine learning, vopp, (18 more...)

arXiv.org Artificial Intelligence

2510.27191

Genre:

Workflow (1.00)
Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Revisiting Model Inversion Evaluation: From Misleading Standards to Reliable Privacy Assessment

Ho, Sy-Tuyen, Hao, Koh Jun, Nguyen, Ngoc-Bao, Binder, Alexander, Cheung, Ngai-Man

arXiv.org Artificial Intelligence | Nov-21-2025

Model Inversion (MI) attacks aim to reconstruct information from private training data by exploiting access to machine learning models T. To evaluate such attacks, the standard evaluation framework relies on an evaluation model E, trained under the same task design as T. This framework has become the de facto standard for assessing progress in MI research, used across nearly all recent MI studies without question. In this paper, we present the first in-depth study of this evaluation framework. In particular, we identify a critical issue of this standard framework: Type-I adversarial examples. These are reconstructions that do not capture the visual features of private training data, yet are still deemed successful by T and ultimately transferable to E. Such false positives undermine the reliability of the standard MI evaluation framework. To address this issue, we introduce a new MI evaluation framework that replaces the evaluation model E with advanced Multimodal Large Language Models (MLLMs). By leveraging their general-purpose visual understanding, our MLLM-based framework does not depend on training of shared task design as in T, thus reducing Type-I transferability and providing more faithful assessments of reconstruction success. Using our MLLM-based evaluation framework, we reevaluate 27 diverse MI attack setups and empirically reveal consistently high false positive rates under the standard evaluation framework. Importantly, we demonstrate that many state-of-the-art (SOTA) MI methods report inflated attack accuracy, indicating that actual privacy leakage is significantly lower than previously believed. By uncovering this critical issue and proposing a robust solution, our work enables a reassessment of progress in MI research and sets a new standard for reliable and robust evaluation. Code can be found in https://github.com/hosytuyen/MI-Eval-MLLM

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.03519

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)

Image-POSER: Reflective RL for Multi-Expert Image Generation and Editing

Mohebbi, Hossein, Abdulrahman, Mohammed, Miao, Yanting, Poupart, Pascal, Kothawade, Suraj

arXiv.org Artificial Intelligence | Nov-18-2025

Recent advances in text-to-image generation have produced strong single-shot models, yet no individual system reliably executes the long, compositional prompts typical of creative workflows. We introduce Image-POSER, a reflective reinforcement learning framework that (i) orchestrates a diverse registry of pretrained text-to-image and image-to-image experts, (ii) handles long-form prompts end-to-end through dynamic task decomposition, and (iii) supervises alignment at each step via structured feedback from a vision-language model critic. By casting image synthesis and editing as a Markov Decision Process, we learn non-trivial expert pipelines that adaptively combine strengths across models. Experiments show that Image-POSER outperforms baselines, including frontier models, across industry-standard and custom benchmarks in alignment, fidelity, and aesthetics, and is consistently preferred in human evaluations. These results highlight that reinforcement learning can endow AI systems with the capacity to autonomously decompose, reorder, and combine visual models, moving towards general-purpose visual assistants.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2511.1178

Country: North America > Canada (0.28)

Genre: