
 Wang, Tonghan


Robust Optimization with Diffusion Models for Green Security

arXiv.org Artificial Intelligence

In green security, defenders must forecast adversarial behavior, such as poaching, illegal logging, and illegal fishing, to plan effective patrols. These behaviors are often highly uncertain and complex. Prior work has leveraged game theory to design robust patrol strategies to handle uncertainty, but existing adversarial behavior models primarily rely on Gaussian processes or linear models, which lack the expressiveness needed to capture intricate behavioral patterns. To address this limitation, we propose a conditional diffusion model for adversary behavior modeling, leveraging its strong distribution-fitting capabilities. To the best of our knowledge, this is the first application of diffusion models in the green security domain. Integrating diffusion models into game-theoretic optimization, however, presents new challenges, including a constrained mixed strategy space and the need to sample from an unnormalized distribution to estimate utilities. To tackle these challenges, we introduce a mixed strategy of mixed strategies and employ a twisted Sequential Monte Carlo (SMC) sampler for accurate sampling. Theoretically, our algorithm is guaranteed to converge to an epsilon-equilibrium with high probability using a finite number of iterations and samples. Empirically, we evaluate our approach on both synthetic and real-world poaching datasets, demonstrating its effectiveness.
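A minimal tempering-style Sequential Monte Carlo sketch makes the sampling step concrete: particles are reweighted between tempered targets, resampled, and moved with a Metropolis kernel to draw from an unnormalized density. The bimodal toy target, Gaussian initialization, and linear schedule are illustrative assumptions, not the paper's twisted SMC sampler or its diffusion-based behavior model.

import numpy as np

def log_target(x):
    # Hypothetical unnormalized log-density: a bimodal Gaussian mixture.
    return np.logaddexp(-0.5 * (x - 2.0) ** 2, -0.5 * (x + 2.0) ** 2)

def log_prior(x):
    # Broad N(0, 5^2) initial distribution (unnormalized).
    return -0.5 * (x / 5.0) ** 2

def smc_sampler(n=2000, n_steps=20, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 5.0, size=n)
    betas = np.linspace(0.0, 1.0, n_steps + 1)
    for b0, b1 in zip(betas[:-1], betas[1:]):
        # Incremental weight between tempered targets pi_b ~ prior^(1-b) * target^b.
        log_w = (b1 - b0) * (log_target(x) - log_prior(x))
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        x = x[rng.choice(n, size=n, p=w)]            # resample
        prop = x + rng.normal(0.0, 0.5, size=n)      # random-walk MH move
        log_acc = ((1 - b1) * (log_prior(prop) - log_prior(x))
                   + b1 * (log_target(prop) - log_target(x)))
        x = np.where(np.log(rng.uniform(size=n)) < log_acc, prop, x)
    return x

samples = smc_sampler()
print(samples.mean(), samples.std())   # roughly 0 and about 2: both modes covered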


Policy-to-Language: Train LLMs to Explain Decisions with Flow-Matching Generated Rewards

arXiv.org Artificial Intelligence

As humans increasingly share environments with diverse agents powered by RL, LLMs, and beyond, the ability to explain their policies in natural language will be vital for reliable coexistence. In this paper, we build a model-agnostic explanation generator based on an LLM. The technical novelty is that the rewards for training this LLM are generated by a generative flow matching model. This model has a specially designed structure with a hidden layer merged with an LLM, harnessing the linguistic cues of explanations to generate appropriate rewards. Experiments on both RL and LLM tasks demonstrate that our method can generate dense and effective rewards while saving on expensive human feedback; it thus enables effective explanations and even improves the accuracy of the decisions in original tasks.
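As a rough illustration of how a flow-matching model is trained, the sketch below fits a velocity field with the conditional flow-matching (rectified-flow) objective on toy 2-D data: the model is regressed onto the straight-line velocity x1 - x0 at interpolated points. The linear model and toy data are assumptions; the paper's reward generator is an LLM-coupled network, not reproduced here.

import numpy as np

rng = np.random.default_rng(0)
d = 2
W = np.zeros((d, d + 1))                          # linear velocity model v(x, t)

for step in range(2000):
    x1 = rng.normal([3.0, -1.0], 0.5, (64, d))    # "data" batch (toy)
    x0 = rng.normal(0.0, 1.0, (64, d))            # noise batch
    t = rng.uniform(0.0, 1.0, (64, 1))
    xt = (1 - t) * x0 + t * x1                    # point on the straight path
    target = x1 - x0                              # conditional velocity target
    feats = np.hstack([xt, t])                    # features [x_t, t]
    grad = 2 * (feats @ W.T - target).T @ feats / 64   # dMSE/dW
    W -= 0.05 * grad

print(W)   # learned parameters of the velocity field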


On Diffusion Models for Multi-Agent Partial Observability: Shared Attractors, Error Bounds, and Composite Flow

arXiv.org Artificial Intelligence

Multiagent systems grapple with partial observability (PO), and the decentralized POMDP (Dec-POMDP) model highlights the fundamental nature of this challenge. Whereas recent approaches to addressing PO have appealed to deep learning models, providing a rigorous understanding of how these models and their approximation errors affect agents' handling of PO and their interactions remains a challenge. In addressing this challenge, we investigate reconstructing global states from local action-observation histories in Dec-POMDPs using diffusion models. We first find that diffusion models conditioned on local history represent possible states as stable fixed points. In collectively observable (CO) Dec-POMDPs, individual diffusion models conditioned on agents' local histories share a unique fixed point corresponding to the global state, while in non-CO settings, the shared fixed points yield a distribution of possible states given joint history. We further find that, with deep learning approximation errors, fixed points can deviate from true states and the deviation is negatively correlated with the Jacobian rank. Inspired by this low-rank property, we bound the deviation by constructing a surrogate linear regression model that approximates the local behavior of diffusion models. With this bound, we propose a composite diffusion process iterating over agents with theoretical convergence guarantees to the true state.
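A toy sketch of the composite process: each agent's history-conditioned denoiser is stood in for by a random linear contraction sharing one fixed point, and iterating the maps agent by agent converges to that shared point. The contraction stand-ins and the known true state s_star are assumptions of this illustration, not trained diffusion models.

import numpy as np

rng = np.random.default_rng(1)
d, n_agents = 4, 3
s_star = rng.normal(size=d)                  # hypothetical global state

def make_denoiser():
    # A random contraction with fixed point s_star, standing in for one
    # agent's diffusion model conditioned on its local history.
    A = rng.normal(size=(d, d))
    A *= 0.5 / np.linalg.norm(A, 2)          # spectral norm < 1
    return lambda s: s_star + A @ (s - s_star)

denoisers = [make_denoiser() for _ in range(n_agents)]

s = rng.normal(size=d)                       # arbitrary initialization
for it in range(100):
    for f in denoisers:                      # one composite pass over agents
        s = f(s)
    if np.linalg.norm(s - s_star) < 1e-9:
        break
print(it, np.linalg.norm(s - s_star))        # converges to the shared fixed point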


GemNet: Menu-Based, Strategy-Proof Multi-Bidder Auctions Through Deep Learning

arXiv.org Artificial Intelligence

Differentiable economics uses deep learning for automated mechanism design. Despite strong progress, it has remained an open problem to learn multi-bidder, general, and fully strategy-proof (SP) auctions. We introduce GEneral Menu-based NETwork (GemNet), which significantly extends the menu-based approach of RochetNet [Dütting et al., 2023] to the multi-bidder setting. The challenge in achieving SP is to learn bidder-independent menus that are feasible, so that the optimal menu choices for each bidder do not over-allocate items when taken together (we call this menu compatibility). GemNet penalizes the failure of menu compatibility during training, and transforms learned menus after training through price changes, by considering a set of discretized bidder values and reasoning about Lipschitz smoothness to guarantee menu compatibility on the entire value space. This approach is general, leaving undisturbed trained menus that already satisfy menu compatibility and reducing to RochetNet for a single bidder. Mixed-integer linear programs are used for menu transforms and through a number of optimizations, including adaptive grids and methods to skip menu elements, we scale to large auction design problems. GemNet learns auctions with better revenue than affine maximization methods, achieves exact SP whereas previous general multi-bidder methods are approximately SP, and offers greatly enhanced interpretability.
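The menu-compatibility condition reduces to a simple check that can be penalized during training: each bidder independently picks a utility-maximizing entry from a bidder-independent menu, and the chosen allocations must not jointly over-allocate any item. The random menus and valuations below are toy assumptions, not learned GemNet outputs.

import numpy as np

rng = np.random.default_rng(2)
n_bidders, n_items, menu_size = 2, 2, 4
# Bidder-independent menus: (allocation in [0,1]^items, price) per entry.
alloc = rng.uniform(0, 1, (n_bidders, menu_size, n_items))
price = rng.uniform(0, 1, (n_bidders, menu_size))
values = rng.uniform(0, 1, (n_bidders, n_items))     # bidder valuations

# Strategy-proofness: bidder b's choice depends only on b's own values.
util = (alloc * values[:, None, :]).sum(-1) - price  # (bidders, menu_size)
choice = util.argmax(axis=1)
chosen = alloc[np.arange(n_bidders), choice]         # selected allocations

over = chosen.sum(axis=0) - 1.0                      # per-item over-allocation
print("menu compatible:", bool((over <= 1e-9).all()))
print("penalty:", np.clip(over, 0, None).sum())      # training-style penalty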


Social Environment Design

arXiv.org Machine Learning

Artificial Intelligence (AI) holds promise as a technology that can be used to improve government and economic policy-making. This paper proposes a new research agenda towards this end by introducing Social Environment Design, a general framework for the use of AI for automated policy-making that connects with the Reinforcement Learning, EconCS, and Computational Social Choice communities. The framework seeks to capture general economic environments, includes voting on policy objectives, and gives a direction for the systematic analysis of government and economic policy through AI simulation. We highlight key open problems for future research in AI-based policy-making. By solving these challenges, we hope to achieve various social welfare objectives, thereby promoting more ethical and responsible decision making.


Multi-Sender Persuasion -- A Computational Perspective

arXiv.org Artificial Intelligence

We consider multiple senders with informational advantage signaling to convince a single self-interested actor towards certain actions. Generalizing the seminal Bayesian Persuasion framework, such settings are ubiquitous in computational economics, multi-agent learning, and machine learning with multiple objectives. The core solution concept here is the Nash equilibrium of senders' signaling policies. Theoretically, we prove that finding an equilibrium in general is PPAD-hard; in fact, even computing a sender's best response is NP-hard. Given these intrinsic difficulties, we turn to finding local Nash equilibria. We propose a novel differentiable neural network to approximate this game's non-linear and discontinuous utilities. Complementing this with the extra-gradient algorithm, we discover local equilibria that Pareto dominate full-revelation equilibria and those found by existing neural networks. Broadly, our theoretical and empirical contributions are of interest to a large class of economic problems.
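The extra-gradient step is easy to state: take a look-ahead gradient step, then update from the original point using gradients evaluated at the look-ahead. The sketch below applies it to a toy bilinear min-max game, a stand-in for the senders' non-linear, discontinuous utilities, where plain simultaneous gradient steps would spiral.

def grad_x(x, y): return y            # d/dx of f(x, y) = x * y (x minimizes)
def grad_y(x, y): return x            # d/dy of f(x, y) = x * y (y maximizes)

x, y, lr = 1.0, 1.0, 0.1
for _ in range(1000):
    xh = x - lr * grad_x(x, y)        # extrapolation (look-ahead) step
    yh = y + lr * grad_y(x, y)
    x = x - lr * grad_x(xh, yh)       # update using look-ahead gradients
    y = y + lr * grad_y(xh, yh)
print(x, y)                           # approaches the equilibrium (0, 0)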


Deep Contract Design via Discontinuous Networks

arXiv.org Artificial Intelligence

Contract design involves a principal who establishes contractual agreements about payments for outcomes that arise from the actions of an agent. In this paper, we initiate the study of deep learning for the automated design of optimal contracts. We introduce a novel representation: the Discontinuous ReLU (DeLU) network, which models the principal's utility as a discontinuous piecewise affine function of the design of a contract where each piece corresponds to the agent taking a particular action. DeLU networks implicitly learn closed-form expressions for the incentive compatibility constraints of the agent and the utility maximization objective of the principal, and support parallel inference on each piece through linear programming or interior-point methods that solve for optimal contracts. We provide empirical results that demonstrate success in approximating the principal's utility function with a small number of training samples and scaling to find approximately optimal contracts on problems with a large number of actions and outcomes.
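One plausible reading of a discontinuous piecewise-affine unit is sketched below: a hard 0/1 gating code selects an affine piece per input region, so the output is affine within each region but can jump at region boundaries (unlike the continuous ReLU). The shapes and gating rule are assumptions for illustration, not the paper's exact DeLU architecture.

import numpy as np

rng = np.random.default_rng(3)
d, h = 2, 4
V, c = rng.normal(size=(h, d)), rng.normal(size=h)   # gating hyperplanes
W, b = rng.normal(size=(h, d)), rng.normal(size=h)   # per-unit affine parts

def delu(x):
    gate = (V @ x + c > 0).astype(float)   # hard 0/1 code -> region id
    return gate @ (W @ x + b)              # affine within each fixed region

print(delu(np.array([0.3, -0.7])))
# Because the gate is a hard step rather than ReLU's max(0, .), the output
# can be discontinuous across gating boundaries, and each region's affine
# piece can be handed to a linear program when solving for contracts.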


Never Explore Repeatedly in Multi-Agent Reinforcement Learning

arXiv.org Artificial Intelligence

In the realm of multi-agent reinforcement learning, intrinsic motivations have emerged as a pivotal tool for exploration. While the computation of many intrinsic rewards relies on estimating variational posteriors using neural network approximators, a notable challenge has surfaced due to the limited expressive capability of these neural statistics approximators. We pinpoint this challenge as the "revisitation" issue, where agents recurrently explore confined areas of the task space. To combat this, we propose a dynamic reward scaling approach. This method is crafted to stabilize the significant fluctuations in intrinsic rewards in previously explored areas and promote broader exploration, effectively curbing the revisitation phenomenon. Our experimental findings underscore the efficacy of our approach, showcasing enhanced performance in demanding environments like Google Research Football and StarCraft II micromanagement tasks, especially in sparse reward settings.
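A minimal version of such scaling might damp the intrinsic reward in regions that have already been visited often, so its fluctuations there stop dominating exploration. The count-based damping and the coarse state hashing below are illustrative assumptions, not the paper's estimator.

import numpy as np
from collections import defaultdict

counts = defaultdict(int)

def scaled_intrinsic(state, raw_reward, decay=1.0):
    key = tuple(np.round(state, 1))    # coarse hash of the state region
    counts[key] += 1
    # Rewards in well-explored regions are dynamically scaled down.
    return raw_reward / (1.0 + decay * counts[key])

s = np.array([0.12, 0.48])
for _ in range(3):
    print(scaled_intrinsic(s, raw_reward=1.0))   # 0.5, 0.33..., 0.25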


Symmetry-Aware Robot Design with Structured Subgroups

arXiv.org Artificial Intelligence

Robot design aims at learning to create robots that can be easily controlled and perform tasks efficiently. Previous works on robot design have proven their ability to generate robots for various tasks. However, these works searched for robots directly in the vast design space and ignored common structures, resulting in abnormal robots and poor performance. To tackle this problem, we propose a Symmetry-Aware Robot Design (SARD) framework that exploits the structure of the design space by incorporating symmetry searching into the robot design process. Specifically, we represent symmetries with the subgroups of the dihedral group and search for the optimal symmetry in structured subgroups. Then robots are designed under the searched symmetry. In this way, SARD can design efficient symmetric robots while covering the original design space, which is theoretically analyzed. We further empirically evaluate SARD on various tasks, and the results show its superior efficiency and generalizability.
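The "design under the searched symmetry" step can be illustrated by group averaging: pick a subgroup of the dihedral group D_4 and project a raw design onto its invariant subspace. Encoding a robot design as a 2-D parameter grid is an assumption of this sketch.

import numpy as np

# A few subgroups of D_4, each given by its action on a design grid.
subgroups = {
    "trivial": [lambda x: x],
    "C2":      [lambda x, k=k: np.rot90(x, k) for k in (0, 2)],
    "C4":      [lambda x, k=k: np.rot90(x, k) for k in range(4)],
    "D1":      [lambda x: x, np.fliplr],
}

design = np.random.default_rng(4).uniform(size=(4, 4))  # raw searched design
group = subgroups["C4"]
symmetric = sum(g(design) for g in group) / len(group)  # group averaging
print(np.allclose(symmetric, np.rot90(symmetric)))      # True: C4-invariant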


Low-Rank Modular Reinforcement Learning via Muscle Synergy

arXiv.org Artificial Intelligence

Modular Reinforcement Learning (RL) decentralizes the control of multi-joint robots by learning policies for each actuator. Previous work on modular RL has proven its ability to control morphologically different agents with a shared actuator policy. However, as the degrees of freedom (DoF) of robots increase, training a morphology-generalizable modular controller becomes exponentially difficult. Motivated by the way the human central nervous system controls numerous muscles, we propose a Synergy-Oriented LeARning (SOLAR) framework that exploits the redundant nature of DoF in robot control. Actuators are grouped into synergies by an unsupervised learning method, and a synergy action is learned to control multiple actuators in synchrony. In this way, we achieve a low-rank control at the synergy level. We extensively evaluate our method on a variety of robot morphologies, and the results show its superior efficiency and generalizability, especially on robots with a large DoF like Humanoids++ and UNIMALs.
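A minimal sketch of synergy-level control: cluster actuators into synergies with an unsupervised method (plain k-means here) and broadcast one synergy action to each group, so a 12-DoF robot is driven by a rank-3 action. The random actuator descriptors stand in for learned features.

import numpy as np

rng = np.random.default_rng(5)
n_actuators, n_synergies, d = 12, 3, 8
feats = rng.normal(size=(n_actuators, d))    # per-actuator descriptors (toy)

# Minimal k-means to assign actuators to synergies.
centers = feats[rng.choice(n_actuators, n_synergies, replace=False)]
for _ in range(20):
    assign = np.argmin(((feats[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    centers = np.array([feats[assign == k].mean(0) if (assign == k).any()
                        else centers[k] for k in range(n_synergies)])

synergy_action = rng.uniform(-1, 1, n_synergies)   # low-rank action
joint_action = synergy_action[assign]              # broadcast to all actuators
print(assign, joint_action.shape)                  # synergy ids, (12,)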