AITopics | Dou, Zehao

Collaborating Authors

Dou, Zehao

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Monitoring Reasoning Models for Misbehavior and the Risks of Promoting Obfuscation

Baker, Bowen, Huizinga, Joost, Gao, Leo, Dou, Zehao, Guan, Melody Y., Madry, Aleksander, Zaremba, Wojciech, Pachocki, Jakub, Farhi, David

arXiv.org Artificial IntelligenceMar-14-2025

Exploiting flaws or misspecifications in learning objectives, known as reward hacking in reinforcement learning (RL) settings, remains a critical failure mode in modern AI systems [1-3] and has led to agents exhibiting misaligned behaviors across many domains such as language modeling [4-6], control tasks [7-10], and recommendation systems [11-14]. The true objective function we wish to optimize is often hard to write down precisely, and so the challenge in creating capable and aligned systems largely lies in designing robust proxies that do not deviate in ways that a model may learn to exploit [3, 15]. This problem is not unique to machine learning systems but has also plagued human institutions [16-19]. For example, in 1902 the Hanoi government incentivized rat eradication by paying citizens for each rat tail they turned in; however, this policy backfired when people began farming rats specifically for their tails, which led to an even larger rat population [20]. Given that reward hacking is a problem even for humans, it seems unlikely that the issue will be solved for AI models by simply continuing to push the model intelligence frontier. In fact, enhancing an agent's capabilities may exacerbate the problem by better equipping it to discover and execute more complex and hard-to-monitor exploits [21, 22]. We have found this to be anecdotally true: as we have continued to scale RL training, agents have discovered more complex and hard-to-detect hacks. Thus far, the only general strategy for mitigating reward hacking is to manually monitor agents for unintended behavior, which is unlikely to scale as their outputs and actions grow more complex--possibly even superhuman--and become more widely used. The emerging generation of large language models (LLMs) [23] that are trained with reinforcement learning to reason in chain-of-thought (CoT) [24-26] offers a promising new avenue for monitoring.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2503.11926

Country: Asia > Vietnam > Hanoi > Hanoi (0.24)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.91)

Add feedback

Think Twice Before You Act: Improving Inverse Problem Solving With MCMC

Zhu, Yaxuan, Dou, Zehao, Zheng, Haoxin, Zhang, Yasi, Wu, Ying Nian, Gao, Ruiqi

arXiv.org Machine LearningSep-13-2024

Recent studies demonstrate that diffusion models can serve as a strong prior for solving inverse problems. A prominent example is Diffusion Posterior Sampling (DPS), which approximates the posterior distribution of data given the measure using Tweedie's formula. Despite the merits of being versatile in solving various inverse problems without re-training, the performance of DPS is hindered by the fact that this posterior approximation can be inaccurate especially for high noise levels. Therefore, we propose \textbf{D}iffusion \textbf{P}osterior \textbf{MC}MC (\textbf{DPMC}), a novel inference algorithm based on Annealed MCMC to solve inverse problems with pretrained diffusion models. We define a series of intermediate distributions inspired by the approximated conditional distributions used by DPS. Through annealed MCMC sampling, we encourage the samples to follow each intermediate distribution more closely before moving to the next distribution at a lower noise level, and therefore reduce the accumulated error along the path. We test our algorithm in various inverse problems, including super resolution, Gaussian deblurring, motion deblurring, inpainting, and phase retrieval. Our algorithm outperforms DPS with less number of evaluations across nearly all tasks, and is competitive among existing approaches.

artificial intelligence, inverse problem, machine learning, (16 more...)

arXiv.org Machine Learning

2409.08551

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)

Add feedback

Diffusion Mechanism in Residual Neural Network: Theory and Applications

Wang, Tangjun, Dou, Zehao, Bao, Chenglong, Shi, Zuoqiang

arXiv.org Artificial IntelligenceApr-28-2023

Diffusion, a fundamental internal mechanism emerging in many physical processes, describes the interaction among different objects. In many learning tasks with limited training samples, the diffusion connects the labeled and unlabeled data points and is a critical component for achieving high classification accuracy. Many existing deep learning approaches directly impose the fusion loss when training neural networks. In this work, inspired by the convection-diffusion ordinary differential equations (ODEs), we propose a novel diffusion residual network (Diff-ResNet), internally introduces diffusion into the architectures of neural networks. Under the structured data assumption, it is proved that the proposed diffusion block can increase the distance-diameter ratio that improves the separability of inter-class points and reduces the distance among local intra-class points. Moreover, this property can be easily adopted by the residual networks for constructing the separable hyperplanes. Extensive experiments of synthetic binary classification, semi-supervised graph node classification and few-shot image classification in various datasets validate the effectiveness of the proposed method.

artificial intelligence, diffusion, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2105.03155

Country:

Asia (0.28)
North America > United States (0.28)

Genre:

Research Report (0.64)
Workflow (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Learning Narrow One-Hidden-Layer ReLU Networks

Chen, Sitan, Dou, Zehao, Goel, Surbhi, Klivans, Adam R, Meka, Raghu

arXiv.org Artificial IntelligenceApr-20-2023

We consider the well-studied problem of learning a linear combination of $k$ ReLU activations with respect to a Gaussian distribution on inputs in $d$ dimensions. We give the first polynomial-time algorithm that succeeds whenever $k$ is a constant. All prior polynomial-time learners require additional assumptions on the network, such as positive combining coefficients or the matrix of hidden weight vectors being well-conditioned. Our approach is based on analyzing random contractions of higher-order moment tensors. We use a multi-scale analysis to argue that sufficiently close neurons can be collapsed together, sidestepping the conditioning issues present in prior work. This allows us to design an iterative procedure to discover individual neurons.

artificial intelligence, machine learning, null, (19 more...)

arXiv.org Artificial Intelligence

2304.10524

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Understanding Value Decomposition Algorithms in Deep Cooperative Multi-Agent Reinforcement Learning

Dou, Zehao, Kuba, Jakub Grudzien, Yang, Yaodong

arXiv.org Machine LearningFeb-15-2022

Value function decomposition is becoming a popular rule of thumb for scaling up multi-agent reinforcement learning (MARL) in cooperative games. For such a decomposition rule to hold, the assumption of the individual-global max (IGM) principle must be made; that is, the local maxima on the decomposed value function per every agent must amount to the global maximum on the joint value function. This principle, however, does not have to hold in general. As a result, the applicability of value decomposition algorithms is concealed and their corresponding convergence properties remain unknown. In this paper, we make the first effort to answer these questions. Specifically, we introduce the set of cooperative games in which the value decomposition methods find their validity, which is referred as decomposable games. In decomposable games, we theoretically prove that applying the multi-agent fitted Q-Iteration algorithm (MA-FQI) will lead to an optimal Q-function. In non-decomposable games, the estimated Q-function by MA-FQI can still converge to the optimum under the circumstance that the Q-function needs projecting into the decomposable function space at each iteration. In both settings, we consider value function representations by practical deep neural networks and derive their corresponding convergence rates. To summarize, our results, for the first time, offer theoretical insights for MARL practitioners in terms of when value decomposition algorithms converge and why they perform well.

artificial intelligence, deep cooperative multi-agent reinforcement learning, machine learning, (2 more...)

arXiv.org Machine Learning

2202.04868

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.40)

Add feedback

On the One-sided Convergence of Adam-type Algorithms in Non-convex Non-concave Min-max Optimization

Dou, Zehao, Li, Yuanzhi

arXiv.org Machine LearningSep-29-2021

Adam-type methods, the extension of adaptive gradient methods, have shown great performance in the training of both supervised and unsupervised machine learning models. In particular, Adam-type optimizers have been widely used empirically as the default tool for training generative adversarial networks (GANs). On the theory side, however, despite the existence of theoretical results showing the efficiency of Adam-type methods in minimization problems, the reason of their wonderful performance still remains absent in GAN's training. In existing works, the fast convergence has long been considered as one of the most important reasons and multiple works have been proposed to give a theoretical guarantee of the convergence to a critical point of min-max optimization algorithms under certain assumptions. In this paper, we firstly argue empirically that in GAN's training, Adam does not converge to a critical point even upon successful training: Only the generator is converging while the discriminator's gradient norm remains high throughout the training. We name this one-sided convergence. Then we bridge the gap between experiments and theory by showing that Adam-type algorithms provably converge to a one-sided first order stationary points in min-max optimization problems under the one-sided MVI condition. We also empirically verify that such one-sided MVI condition is satisfied for standard GANs after trained over standard data sets. To the best of our knowledge, this is the very first result which provides an empirical observation and a strict theoretical guarantee on the one-sided convergence of Adam-type algorithms in min-max optimization.

artificial intelligence, machine learning, optimization problem, (17 more...)

arXiv.org Machine Learning

2109.14213

Country: Europe (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)

Add feedback

Gap-Dependent Bounds for Two-Player Markov Games

Dou, Zehao, Yang, Zhuoran, Wang, Zhaoran, Du, Simon S.

arXiv.org Machine LearningJul-1-2021

As one of the most popular methods in the field of reinforcement learning, Q-learning has received increasing attention. Recently, there have been more theoretical works on the regret bound of algorithms that belong to the Q-learning class in different settings. In this paper, we analyze the cumulative regret when conducting Nash Q-learning algorithm on 2-player turn-based stochastic Markov games (2-TBSG), and propose the very first gap dependent logarithmic upper bounds in the episodic tabular setting. This bound matches the theoretical lower bound only up to a logarithmic term. Furthermore, we extend the conclusion to the discounted game setting with infinite horizon and propose a similar gap dependent logarithmic regret bound. Also, under the linear MDP assumption, we obtain another logarithmic regret for 2-TBSG, in both centralized and independent settings.

artificial intelligence, null, reinforcement learning, (15 more...)

arXiv.org Machine Learning

2107.00685

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Mathematical Analysis of Adversarial Attacks

Dou, Zehao, Osher, Stanley J., Wang, Bao

arXiv.org Machine LearningNov-15-2018

In this paper, we analyze efficacy of the fast gradient sign method (FGSM) and the Carlini-Wagner's L2 (CW-L2) attack. We prove that, within a certain regime, the untargeted FGSM can fool any convolutional neural nets (CNNs) with ReLU activation; the targeted FGSM can mislead any CNNs with ReLU activation to classify any given image into any prescribed class. For a special two-layer neural network: a linear layer followed by the softmax output activation, we show that the CW-L2 attack increases the ratio of the classification probability between the target and ground truth classes. Moreover, we provide numerical results to verify all our theoretical results.

deep learning, exp, neural network, (17 more...)

arXiv.org Machine Learning

1811.06492

Country: North America > United States (0.93)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (0.87)
Government > Military (0.86)
Government > Regional Government > North America Government > United States Government (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback