AITopics | Ding, Zihan

Collaborating Authors

Ding, Zihan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Generative Diffusion Modeling: A Practical Handbook

Ding, Zihan, Jin, Chi

arXiv.org Artificial IntelligenceDec-22-2024

This handbook offers a unified perspective on diffusion models, encompassing diffusion probabilistic models, score-based generative models, consistency models, rectified flow, and related methods. By standardizing notations and aligning them with code implementations, it aims to bridge the "paper-to-code" gap and facilitate robust implementations and fair comparisons. The content encompasses the fundamentals of diffusion models, the pre-training process, and various post-training methods. Post-training techniques include model distillation and reward-based fine-tuning. Designed as a practical guide, it emphasizes clarity and usability over theoretical depth, focusing on widely adopted approaches in generative modeling with diffusion models.

artificial intelligence, machine learning, training manual, (12 more...)

arXiv.org Artificial Intelligence

2412.17162

Genre:

Research Report (0.50)
Instructional Material (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

TopV-Nav: Unlocking the Top-View Spatial Reasoning Potential of MLLM for Zero-shot Object Navigation

Zhong, Linqing, Gao, Chen, Ding, Zihan, Liao, Yue, Liu, Si

arXiv.org Artificial IntelligenceNov-25-2024

The Zero-Shot Object Navigation (ZSON) task requires embodied agents to find a previously unseen object by navigating in unfamiliar environments. Such a goal-oriented exploration heavily relies on the ability to perceive, understand, and reason based on the spatial information of the environment. However, current LLM-based approaches convert visual observations to language descriptions and reason in the linguistic space, leading to the loss of spatial information. In this paper, we introduce TopV-Nav, a MLLM-based method that directly reasons on the top-view map with complete spatial information. To fully unlock the MLLM's spatial reasoning potential in top-view perspective, we propose the Adaptive Visual Prompt Generation (AVPG) method to adaptively construct semantically-rich top-view map. It enables the agent to directly utilize spatial information contained in the top-view map to conduct thorough reasoning. Besides, we design a Dynamic Map Scaling (DMS) mechanism to dynamically zoom top-view map at preferred scales, enhancing local fine-grained reasoning. Additionally, we devise a Target-Guided Navigation (TGN) mechanism to predict and to utilize target locations, facilitating global and human-like exploration. Experiments on MP3D and HM3D benchmarks demonstrate the superiority of our TopV-Nav, e.g., $+3.9\%$ SR and $+2.0\%$ SPL absolute improvements on HM3D.

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2411.16425

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

How to beat a Bayesian adversary

Ding, Zihan, Jin, Kexin, Latz, Jonas, Liu, Chenguang

arXiv.org Machine LearningJul-11-2024

Deep neural networks and other modern machine learning models are often susceptible to adversarial attacks. Indeed, an adversary may often be able to change a model's prediction through a small, directed perturbation of the model's input - an issue in safety-critical applications. Adversarially robust machine learning is usually based on a minmax optimisation problem that minimises the machine learning loss under maximisation-based adversarial attacks. In this work, we study adversaries that determine their attack using a Bayesian statistical approach rather than maximisation. The resulting Bayesian adversarial robustness problem is a relaxation of the usual minmax problem. To solve this problem, we propose Abram - a continuous-time particle system that shall approximate the gradient flow corresponding to the underlying learning problem. We show that Abram approximates a McKean-Vlasov process and justify the use of Abram by giving assumptions under which the McKean-Vlasov process finds the minimiser of the Bayesian adversarial robustness problem. We discuss two ways to discretise Abram and show its suitability in benchmark adversarial deep learning experiments.

abram, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

2407.08678

Country: North America > United States > California (0.14)

Genre: Research Report (0.41)

Industry: Information Technology > Security & Privacy (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

FightLadder: A Benchmark for Competitive Multi-Agent Reinforcement Learning

Li, Wenzhe, Ding, Zihan, Karten, Seth, Jin, Chi

arXiv.org Artificial IntelligenceJun-23-2024

Recent advances in reinforcement learning (RL) heavily rely on a variety of well-designed benchmarks, which provide environmental platforms and consistent criteria to evaluate existing and novel algorithms. Specifically, in multi-agent RL (MARL), a plethora of benchmarks based on cooperative games have spurred the development of algorithms that improve the scalability of cooperative multi-agent systems. However, for the competitive setting, a lightweight and open-sourced benchmark with challenging gaming dynamics and visual inputs has not yet been established. In this work, we present FightLadder, a real-time fighting game platform, to empower competitive MARL research. Along with the platform, we provide implementations of state-of-the-art MARL algorithms for competitive games, as well as a set of evaluation metrics to characterize the performance and exploitability of agents. We demonstrate the feasibility of this platform by training a general agent that consistently defeats 12 built-in characters in single-player mode, and expose the difficulty of training a non-exploitable agent without human knowledge and demonstrations in two-player mode. FightLadder provides meticulously designed environments to address critical challenges in competitive MARL research, aiming to catalyze a new era of discovery and advancement in the field. Videos and code at https://sites.google.com/view/fightladder/home.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2406.02081

Country:

North America > United States (0.14)
Europe > France (0.14)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)

Add feedback

Constraint-Aware Diffusion Models for Trajectory Optimization

Li, Anjian, Ding, Zihan, Dieng, Adji Bousso, Beeson, Ryne

arXiv.org Artificial IntelligenceJun-3-2024

The diffusion model has shown success in generating high-quality and diverse solutions to trajectory optimization problems. However, diffusion models with neural networks inevitably make prediction errors, which leads to constraint violations such as unmet goals or collisions. This paper presents a novel constraint-aware diffusion model for trajectory optimization. We introduce a novel hybrid loss function for training that minimizes the constraint violation of diffusion samples compared to the groundtruth while recovering the original data distribution. Our model is demonstrated on tabletop manipulation and two-car reach-avoid problems, outperforming traditional diffusion models in minimizing constraint violations while generating samples close to locally optimal solutions.

artificial intelligence, diffusion model, machine learning, (10 more...)

arXiv.org Artificial Intelligence

2406.0099

Country: Europe > Germany (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.93)

Add feedback

Combining Constrained Diffusion Models and Numerical Solvers for Efficient and Robust Non-Convex Trajectory Optimization

Li, Anjian, Ding, Zihan, Dieng, Adji Bousso, Beeson, Ryne

arXiv.org Artificial IntelligenceMay-26-2024

Motivated by the need to solve open-loop optimal control problems with computational efficiency and reliable constraint satisfaction, we introduce a general framework that combines diffusion models and numerical optimization solvers. Optimal control problems are rarely solvable in closed form, hence they are often transcribed into numerical trajectory optimization problems, which then require initial guesses. These initial guesses are supplied in our framework by diffusion models. To mitigate the effect of samples that violate the problem constraints, we develop a novel constrained diffusion model to approximate the true distribution of locally optimal solutions with an additional constraint violation loss in training. To further enhance the robustness, the diffusion samples as initial guesses are fed to the numerical solver to refine and derive final optimal (and hence feasible) solutions. Experimental evaluations on three tasks verify the improved constraint satisfaction and computational efficiency with 4$\times$ to 30$\times$ acceleration using our proposed framework, which generalizes across trajectory optimization problems and scales well with problem complexity.

artificial intelligence, diffusion model, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2403.05571

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Add feedback

Diffusion World Model

Ding, Zihan, Zhang, Amy, Tian, Yuandong, Zheng, Qinqing

arXiv.org Artificial IntelligenceFeb-11-2024

We introduce Diffusion World Model (DWM), a conditional diffusion model capable of predicting multistep future states and rewards concurrently. As opposed to traditional one-step dynamics models, DWM offers long-horizon predictions in a single forward pass, eliminating the need for recursive queries. We integrate DWM into model-based value estimation, where the short-term return is simulated by future trajectories sampled from DWM. In the context of offline reinforcement learning, DWM can be viewed as a conservative value regularization through generative modeling. Alternatively, it can be seen as a data source that enables offline Q-learning with synthetic data. Our experiments on the D4RL dataset confirm the robustness of DWM to long-horizon simulation. In terms of absolute performance, DWM significantly surpasses one-step dynamics models with a $44\%$ performance gain, and achieves state-of-the-art performance.

artificial intelligence, machine learning, reinforcement learning, (11 more...)

arXiv.org Artificial Intelligence

2402.0357

Country:

North America > United States > Texas (0.14)
Europe > Germany (0.14)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Consistency Models as a Rich and Efficient Policy Class for Reinforcement Learning

Ding, Zihan, Jin, Chi

arXiv.org Artificial IntelligenceSep-29-2023

Score-based generative models like the diffusion model have been testified to be effective in modeling multi-modal data from image generation to reinforcement learning (RL). However, the inference process of diffusion model can be slow, which hinders its usage in RL with iterative sampling. We propose to apply the consistency model as an efficient yet expressive policy representation, namely consistency policy, with an actor-critic style algorithm for three typical RL settings: offline, offline-to-online and online. For offline RL, we demonstrate the expressiveness of generative models as policies from multi-modal data. For offline-to-online RL, the consistency policy is shown to be more computational efficient than diffusion policy, with a comparable performance. For online RL, the consistency policy demonstrates significant speedup and even higher average performances than the diffusion policy.

consistency-ac, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2309.16984

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback

Survey of Consciousness Theory from Computational Perspective

Ding, Zihan, Wei, Xiaoxi, Xu, Yidan

arXiv.org Artificial IntelligenceSep-18-2023

Human consciousness has been a long-lasting mystery for centuries, while machine intelligence and consciousness is an arduous pursuit. Researchers have developed diverse theories for interpreting the consciousness phenomenon in human brains from different perspectives and levels. This paper surveys several main branches of consciousness theories originating from different subjects including information theory, quantum physics, cognitive psychology, physiology and computer science, with the aim of bridging these theories from a computational perspective. It also discusses the existing evaluation metrics of consciousness and possibility for current computational models to be conscious. Breaking the mystery of consciousness can be an essential step in building general artificial intelligence with computing machines.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2309.10063

Country: North America > United States (1.00)

Genre:

Overview (1.00)
Research Report > New Finding (0.92)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Sleep (0.93)
Health & Medicine > Diagnostic Medicine (0.92)
Health & Medicine > Health Care Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Philosophy (1.00)
(2 more...)

Add feedback

Learning a Universal Human Prior for Dexterous Manipulation from Human Preference

Ding, Zihan, Chen, Yuanpei, Ren, Allen Z., Gu, Shixiang Shane, Wang, Qianxu, Dong, Hao, Jin, Chi

arXiv.org Artificial IntelligenceSep-13-2023

Generating human-like behavior on robots is a great challenge especially in dexterous manipulation tasks with robotic hands. Scripting policies from scratch is intractable due to the high-dimensional control space, and training policies with reinforcement learning (RL) and manual reward engineering can also be hard and lead to unnatural motions. Leveraging the recent progress on RL from Human Feedback, we propose a framework that learns a universal human prior using direct human preference feedback over videos, for efficiently tuning the RL policies on 20 dual-hand robot manipulation tasks in simulation, without a single human demonstration. A task-agnostic reward model is trained through iteratively generating diverse polices and collecting human preference over the trajectories; it is then applied for regularizing the behavior of polices in the fine-tuning stage. Our method empirically demonstrates more human-like behaviors on robot hands in diverse tasks including even unseen tasks, indicating its generalization capability.

artificial intelligence, machine learning, trajectory, (17 more...)

arXiv.org Artificial Intelligence

2304.04602

Genre: Research Report (0.82)

Industry:

Energy (0.46)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Manipulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback