AITopics | mpe

Collaborating Authors

mpe

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Model Reconciliation via Cost-Optimal Explanations in Probabilistic Logic Programming

Neural Information Processing SystemsJun-17-2026, 06:38:10 GMT

In human-AI interaction, effective communication relies on aligning the AI agent's model with the human user's mental model, a process known as model reconciliation. However, existing model reconciliation approaches predominantly assume deterministic models, overlooking the fact that human knowledge is often uncertain or probabilistic. To bridge this gap, we present a probabilistic model reconciliation framework that resolves inconsistencies in MPE outcome probabilities between an agent's and a user's models. Our approach is built on probabilistic logic programming (PLP) using ProbLog, where explanations are generated as cost-optimal model updates that reconcile these probabilistic differences. We develop two search algorithms - a generic baseline and an optimized version. The latter is guided by theoretical insights and further extended with greedy and weighted variants to enhance scalability and efficiency. Our approach is validated through a user study on explanation types and computational experiments showing that the optimized version consistently outperforms the generic baseline.

artificial intelligence, explanation, logic & formal reasoning, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.67)

Genre: Research Report > Experimental Study (1.00)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

Add feedback

Decomposition of Spillover Effects Under Misspecification:Pseudo-true Estimands and a Local--Global Extension

Park, Yechan, Yang, Xiaodong

arXiv.org Machine LearningFeb-13-2026

Applied work with interference typically models outcomes as functions of own treatment and a low-dimensional exposure mapping of others' treatments, even when that mapping may be misspecified. This raises a basic question: what policy object are exposure-based estimands implicitly targeting, and how should we interpret their direct and spillover components relative to the underlying policy question? We take as primitive the marginal policy effect, defined as the effect of a small change in the treatment probability under the actual experimental design, and show that any researcher-chosen exposure mapping induces a unique pseudo-true outcome model. This model is the best approximation to the underlying potential outcomes that depends only on the user-chosen exposure. Utilizing that representation, the marginal policy effect admits a canonical decomposition into exposure-based direct and spillover effects, and each component provides its optimal approximation to the corresponding oracle objects that would be available if interference were fully known. We then focus on a setting that nests important empirical and theoretical applications in which both local network spillovers and global spillovers, such as market equilibrium, operate. There, the marginal policy effect further decomposes asymptotically into direct, local, and global channels. An important implication is that many existing methods are more robust than previously understood once we reinterpret their targets as channel-specific components of this pseudo-true policy estimand. Simulations and a semi-synthetic experiment calibrated to a large cash-transfer experiment show that these components can be recovered in realistic experimental designs.

artificial intelligence, exposure mapping, machine learning, (16 more...)

arXiv.org Machine Learning

2602.12023

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
South America > Chile (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)

Industry:

Banking & Finance (1.00)
Health & Medicine > Therapeutic Area (0.68)
Health & Medicine > Epidemiology (0.46)
Education > Educational Setting (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.67)

Add feedback

8d5f526a31d3731a30eb58d5874cf5b1-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-10-2026, 17:02:19 GMT

Note that given access to population of positives and unlabeled, α can be estimated as minxpupxq{pppxq. To make a prediction on test point from unlabeled data, we can then use Bayes rule to obtain the following transformation on probabilistic output ofthe domain discriminator:f " α In particular, for each classj PYs, we can first estimate its prevalencepαj in the unlabeled target. Forclassification,we can traink PU learning classifiersfi, wherefi is trained to classify a source classi versus others in target. Assuming that eachfj returns a score betweenr0,1s, during test time, an examplex is classifiedasfpxqgivenby fpxq" " Note that mathematically any OSLS problems can be thought of ask-PU problems as per(10). Put simply,for individual PU problems defined for source classesj PYs,we need existence of a sub-domainXj suchthatweonlyobserveexample forthatclassjinXj. This error incurred due to bias can be mild forasingle mixture proportion estimation taskbutaccumulates withincreasing number ofclasses (i.e.,k). Assume that there exists aunique solutionptpyq. Without loss of generality, we assume that|Xwp| " k.

artificial intelligence, machine learning, pqt, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

How Learnable Grids Recover Fine Detail in Low Dimensions: A Neural Tangent Kernel Analysis of Multigrid Parametric Encodings

Audia, Samuel, Feizi, Soheil, Zwicker, Matthias, Manocha, Dinesh

arXiv.org Artificial IntelligenceApr-21-2025

Neural networks that map between low dimensional spaces are ubiquitous in computer graphics and scientific computing; however, in their naive implementation, they are unable to learn high frequency information. We present a comprehensive analysis comparing the two most common techniques for mitigating this spectral bias: Fourier feature encodings (FFE) and multigrid parametric encodings (MPE). FFEs are seen as the standard for low dimensional mappings, but MPEs often outperform them and learn representations with higher resolution and finer detail. FFE's roots in the Fourier transform, make it susceptible to aliasing if pushed too far, while MPEs, which use a learned grid structure, have no such limitation. To understand the difference in performance, we use the neural tangent kernel (NTK) to evaluate these encodings through the lens of an analogous kernel regression. By finding a lower bound on the smallest eigenvalue of the NTK, we prove that MPEs improve a network's performance through the structure of their grid and not their learnable embedding. This mechanism is fundamentally different from FFEs, which rely solely on their embedding space to improve performance. Results are empirically validated on a 2D image regression task using images taken from 100 synonym sets of ImageNet and 3D implicit surface regression on objects from the Stanford graphics dataset. Using peak signal-to-noise ratio (PSNR) and multiscale structural similarity (MS-SSIM) to evaluate how well fine details are learned, we show that the MPE increases the minimum eigenvalue by 8 orders of magnitude over the baseline and 2 orders of magnitude over the FFE. The increase in spectrum corresponds to a 15 dB (PSNR) / 0.65 (MS-SSIM) increase over baseline and a 12 dB (PSNR) / 0.33 (MS-SSIM) increase over the FFE.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2504.13412

Country: North America > United States (0.67)

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Hardware (0.93)
(2 more...)

Add feedback

Pioneering Reliable Assessment in Text-to-Image Knowledge Editing: Leveraging a Fine-Grained Dataset and an Innovative Criterion

Gu, Hengrui, Zhou, Kaixiong, Wang, Yili, Wang, Ruobing, Wang, Xin

arXiv.org Artificial IntelligenceOct-26-2024

During pre-training, the Text-to-Image (T2I) diffusion models encode factual knowledge into their parameters. These parameterized facts enable realistic image generation, but they may become obsolete over time, thereby misrepresenting the current state of the world. Knowledge editing techniques aim to update model knowledge in a targeted way. However, facing the dual challenges posed by inadequate editing datasets and unreliable evaluation criterion, the development of T2I knowledge editing encounter difficulties in effectively generalizing injected knowledge. In this work, we design a T2I knowledge editing framework by comprehensively spanning on three phases: First, we curate a dataset \textbf{CAKE}, comprising paraphrase and multi-object test, to enable more fine-grained assessment on knowledge generalization. Second, we propose a novel criterion, \textbf{adaptive CLIP threshold}, to effectively filter out false successful images under the current criterion and achieve reliable editing evaluation. Finally, we introduce \textbf{MPE}, a simple but effective approach for T2I knowledge editing. Instead of tuning parameters, MPE precisely recognizes and edits the outdated part of the conditioning text-prompt to accommodate the up-to-date knowledge. A straightforward implementation of MPE (Based on in-context learning) exhibits better overall performance than previous model editors. We hope these efforts can further promote faithful evaluation of T2I knowledge editing methods.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2409.17928

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Reinforcement Learning with Quasi-Hyperbolic Discounting

Eshwar, S. R., Motwani, Mayank, Roy, Nibedita, Thoppe, Gugan

arXiv.org Artificial IntelligenceSep-16-2024

Reinforcement learning has traditionally been studied with exponential discounting or the average reward setup, mainly due to their mathematical tractability. However, such frameworks fall short of accurately capturing human behavior, which has a bias towards immediate gratification. Quasi-Hyperbolic (QH) discounting is a simple alternative for modeling this bias. Unlike in traditional discounting, though, the optimal QH-policy, starting from some time $t_1,$ can be different to the one starting from $t_2.$ Hence, the future self of an agent, if it is naive or impatient, can deviate from the policy that is optimal at the start, leading to sub-optimal overall returns. To prevent this behavior, an alternative is to work with a policy anchored in a Markov Perfect Equilibrium (MPE). In this work, we propose the first model-free algorithm for finding an MPE. Using a two-timescale analysis, we show that, if our algorithm converges, then the limit must be an MPE. We also validate this claim numerically for the standard inventory system with stochastic demands. Our work significantly advances the practical application of reinforcement learning.

algorithm, discounting, q-value function, (16 more...)

arXiv.org Artificial Intelligence

2409.10583

Country:

Asia > India > Karnataka > Bengaluru (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

MASP: Scalable GNN-based Planning for Multi-Agent Navigation

Yang, Xinyi, Yang, Xinting, Yu, Chao, Chen, Jiayu, Yang, Huazhong, Wang, Yu

arXiv.org Artificial IntelligenceDec-5-2023

We investigate the problem of decentralized multi-agent navigation tasks, where multiple agents need to reach initially unassigned targets in a limited time. Classical planning-based methods suffer from expensive computation overhead at each step and offer limited expressiveness for complex cooperation strategies. In contrast, reinforcement learning (RL) has recently become a popular paradigm for addressing this issue. However, RL struggles with low data efficiency and cooperation when directly exploring (nearly) optimal policies in the large search space, especially with an increased agent number (e.g., 10+ agents) or in complex environments (e.g., 3D simulators). In this paper, we propose Multi-Agent Scalable GNN-based P lanner (MASP), a goal-conditioned hierarchical planner for navigation tasks with a substantial number of agents. MASP adopts a hierarchical framework to divide a large search space into multiple smaller spaces, thereby reducing the space complexity and accelerating training convergence. We also leverage graph neural networks (GNN) to model the interaction between agents and goals, improving goal achievement. Besides, to enhance generalization capabilities in scenarios with unseen team sizes, we divide agents into multiple groups, each with a previously trained number of agents. The results demonstrate that MASP outperforms classical planning-based competitors and RL baselines, achieving a nearly 100% success rate with minimal training data in both multi-agent particle environments (MPE) with 50 agents and a quadrotor 3-dimensional environment (OmniDrones) with 20 agents. Furthermore, the learned policy showcases zero-shot generalization across unseen team sizes.

agent, masp, scenario, (16 more...)

arXiv.org Artificial Intelligence

2312.02522

Country: Asia > China > Beijing > Beijing (0.04)

Genre:

Research Report (0.70)
Workflow (0.50)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.88)

Add feedback

${\rm E}(3)$-Equivariant Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning

Chen, Dingyang, Zhang, Qi

arXiv.org Artificial IntelligenceAug-22-2023

Identification and analysis of symmetrical patterns in the natural world have led to significant discoveries across various scientific fields, such as the formulation of gravitational laws in physics and advancements in the study of chemical structures. In this paper, we focus on exploiting Euclidean symmetries inherent in certain cooperative multi-agent reinforcement learning (MARL) problems and prevalent in many applications. We begin by formally characterizing a subclass of Markov games with a general notion of symmetries that admits the existence of symmetric optimal values and policies. Motivated by these properties, we design neural network architectures with symmetric constraints embedded as an inductive bias for multi-agent actor-critic methods. This inductive bias results in superior performance in various cooperative MARL benchmarks and impressive generalization capabilities such as zero-shot learning and transfer learning in unseen scenarios with repeated symmetric patterns. The code is available at: https://github.com/dchen48/E3AC.

artificial intelligence, machine learning, training step, (18 more...)

arXiv.org Artificial Intelligence

2308.11842

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

On the ethics of constructing conscious AI

Edelman, Shimon

arXiv.org Artificial IntelligenceMar-13-2023

In its pragmatic turn, the new discipline of AI ethics came to be dominated by humanity's collective fear of its creatures, as reflected in an extensive and perennially popular literary tradition. Dr. Frankenstein's monster in the novel by Mary Shelley rising against its creator; the unorthodox golem in H. Leivick's 1920 play going on a rampage; the rebellious robots of Karel \v{C}apek -- these and hundreds of other examples of the genre are the background against which the preoccupation of AI ethics with preventing robots from behaving badly towards people is best understood. In each of these three fictional cases (as well as in many others), the miserable artificial creature -- mercilessly exploited, or cornered by a murderous mob, and driven to violence in self-defense -- has its author's sympathy. In real life, with very few exceptions, things are different: theorists working on the ethics of AI completely ignore the possibility of robots needing protection from their creators. The present book chapter takes up this, less commonly considered, ethical angle of AI.

artificial intelligence, consciousness, metzinger, (16 more...)

arXiv.org Artificial Intelligence

2303.07439

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(6 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Add feedback

Robustness and sample complexity of model-based MARL for general-sum Markov games

Subramanian, Jayakumar, Sinha, Amit, Mahajan, Aditya

arXiv.org Artificial IntelligenceDec-19-2022

Multi-agent reinforcement learning (MARL) is often modeled using the framework of Markov games (also called stochastic games or dynamic games). Most of the existing literature on MARL concentrates on zero-sum Markov games but is not applicable to general-sum Markov games. It is known that the best-response dynamics in general-sum Markov games are not a contraction. Therefore, different equilibria in general-sum Markov games can have different values. Moreover, the Q-function is not sufficient to completely characterize the equilibrium. Given these challenges, model based learning is an attractive approach for MARL in general-sum Markov games. In this paper, we investigate the fundamental question of \emph{sample complexity} for model-based MARL algorithms in general-sum Markov games. We show two results. We first use Hoeffding inequality based bounds to show that $\tilde{\mathcal{O}}( (1-\gamma)^{-4} \alpha^{-2})$ samples per state-action pair are sufficient to obtain a $\alpha$-approximate Markov perfect equilibrium with high probability, where $\gamma$ is the discount factor, and the $\tilde{\mathcal{O}}(\cdot)$ notation hides logarithmic terms. We then use Bernstein inequality based bounds to show that $\tilde{\mathcal{O}}( (1-\gamma)^{-1} \alpha^{-2} )$ samples are sufficient. To obtain these results, we study the robustness of Markov perfect equilibrium to model approximations. We show that the Markov perfect equilibrium of an approximate (or perturbed) game is always an approximate Markov perfect equilibrium of the original game and provide explicit bounds on the approximation error. We illustrate the results via a numerical example.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2110.02355

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(6 more...)

Genre: Overview (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback