AITopics | Wang, Zihao

Collaborating Authors

Wang, Zihao

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Transforming and Combining Rewards for Aligning Large Language Models

Wang, Zihao, Nagpal, Chirag, Berant, Jonathan, Eisenstein, Jacob, D'Amour, Alex, Koyejo, Sanmi, Veitch, Victor

arXiv.org Artificial IntelligenceFeb-1-2024

A common approach for aligning language models to human preferences is to first learn a reward model from preference data, and then use this reward model to update the language model. We study two closely related problems that arise in this approach. First, any monotone transformation of the reward model preserves preference ranking; is there a choice that is ``better'' than others? Second, we often wish to align language models to multiple properties: how should we combine multiple reward models? Using a probabilistic interpretation of the alignment procedure, we identify a natural choice for transformation for (the common case of) rewards learned from Bradley-Terry preference models. This derived transformation has two important properties. First, it emphasizes improving poorly-performing outputs, rather than outputs that already score well. This mitigates both underfitting (where some prompts are not improved) and reward hacking (where the model learns to exploit misspecification of the reward model). Second, it enables principled aggregation of rewards by linking summation to logical conjunction: the sum of transformed rewards corresponds to the probability that the output is ``good'' in all measured properties, in a sense we make precise. Experiments aligning language models to be both helpful and harmless using RLHF show substantial improvements over the baseline (non-transformed) approach.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2402.00742

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Input Convex Lipschitz RNN: A Fast and Robust Approach for Engineering Tasks

Wang, Zihao, Pravin, P S, Wu, Zhe

arXiv.org Artificial IntelligenceJan-19-2024

Computational efficiency and adversarial robustness are critical factors in real-world engineering applications. Yet, conventional neural networks often fall short in addressing both simultaneously, or even separately. Drawing insights from natural physical systems and existing literature, it is known that an input convex architecture enhances computational efficiency, while a Lipschitz-constrained architecture bolsters adversarial robustness. By leveraging the strengths of convexity and Lipschitz continuity, we develop a novel network architecture, termed Input Convex Lipschitz Recurrent Neural Networks. This model outperforms existing recurrent units across a spectrum of engineering tasks in terms of computational efficiency and adversarial robustness. These tasks encompass a benchmark MNIST image classification, real-world solar irradiance prediction for Solar PV system planning at LHT Holdings in Singapore, and real-time Model Predictive Control optimization for a chemical reactor.

artificial intelligence, deep learning, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2401.07494

Country: Asia > Singapore (0.34)

Genre: Research Report (1.00)

Industry: Energy > Renewable > Solar (1.00)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Input Convex LSTM: A Convex Approach for Fast Lyapunov-Based Model Predictive Control

Wang, Zihao, Wu, Zhe

arXiv.org Artificial IntelligenceJan-19-2024

Leveraging Input Convex Neural Networks (ICNNs), ICNN-based Model Predictive Control (MPC) successfully attains globally optimal solutions by upholding convexity within the MPC framework. However, current ICNN architectures encounter the issue of vanishing/exploding gradients, which limits their ability to serve as deep neural networks for complex tasks. Additionally, the current neural network-based MPC, including conventional neural network-based MPC and ICNN-based MPC, faces slower convergence speed when compared to MPC based on first-principles models. In this study, we leverage the principles of ICNNs to propose a novel Input Convex LSTM for Lyapunov-based MPC, with the specific goal of reducing convergence time and mitigating the vanishing/exploding gradient problem while ensuring closed-loop stability. From a simulation study of a nonlinear chemical reactor, we observed a mitigation of vanishing/exploding gradient problem and a reduction in convergence time, with a percentage decrease of 46.7%, 31.3%, and 20.2% compared to baseline plain RNN, plain LSTM, and Input Convex Recurrent Neural Networks, respectively.

artificial intelligence, convex, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2311.07202

Country: Asia > Singapore (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Energy > Oil & Gas > Upstream (0.72)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

ProAgent: Building Proactive Cooperative Agents with Large Language Models

Zhang, Ceyao, Yang, Kaijie, Hu, Siyi, Wang, Zihao, Li, Guanghe, Sun, Yihang, Zhang, Cheng, Zhang, Zhaowei, Liu, Anji, Zhu, Song-Chun, Chang, Xiaojun, Zhang, Junge, Yin, Feng, Liang, Yitao, Yang, Yaodong

arXiv.org Artificial IntelligenceJan-11-2024

Building agents with adaptive behavior in cooperative tasks stands as a paramount goal in the realm of multi-agent systems. Current approaches to developing cooperative agents rely primarily on learning-based methods, whose policy generalization depends heavily on the diversity of teammates they interact with during the training phase. Such reliance, however, constrains the agents' capacity for strategic adaptation when cooperating with unfamiliar teammates, which becomes a significant challenge in zero-shot coordination scenarios. To address this challenge, we propose ProAgent, a novel framework that harnesses large language models (LLMs) to create proactive agents capable of dynamically adapting their behavior to enhance cooperation with teammates. ProAgent can analyze the present state, and infer the intentions of teammates from observations. It then updates its beliefs in alignment with the teammates' subsequent actual behaviors. Moreover, ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various of coordination scenarios. Experimental evaluations conducted within the Overcooked-AI environment unveil the remarkable performance superiority of ProAgent, outperforming five methods based on self-play and population-based training when cooperating with AI agents. Furthermore, in partnered with human proxy models, its performance exhibits an average improvement exceeding 10% compared to the current state-of-the-art method. For more information about our project, please visit~\url{https://pku-proagent.github.io}.

large language model, natural language, proagent, (18 more...)

arXiv.org Artificial Intelligence

2308.11339

Country: Asia > China (0.46)

Genre:

Research Report (0.84)
Press Release (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback

Exploring Large Language Model based Intelligent Agents: Definitions, Methods, and Prospects

Cheng, Yuheng, Zhang, Ceyao, Zhang, Zhengwen, Meng, Xiangrui, Hong, Sirui, Li, Wenhao, Wang, Zihao, Wang, Zekai, Yin, Feng, Zhao, Junhua, He, Xiuqiang

arXiv.org Artificial IntelligenceJan-7-2024

Intelligent agents stand out as a potential path toward artificial general intelligence (AGI). Thus, researchers have dedicated significant effort to diverse implementations for them. Benefiting from recent progress in large language models (LLMs), LLM-based agents that use universal natural language as an interface exhibit robust generalization capabilities across various applications -- from serving as autonomous general-purpose task assistants to applications in coding, social, and economic domains, LLM-based agents offer extensive exploration opportunities. This paper surveys current research to provide an in-depth overview of LLM-based intelligent agents within single-agent and multi-agent systems. It covers their definitions, research frameworks, and foundational components such as their composition, cognitive and planning methods, tool utilization, and responses to environmental feedback. We also delve into the mechanisms of deploying LLM-based agents in multi-agent systems, including multi-role collaboration, message passing, and strategies to alleviate communication issues between agents. The discussions also shed light on popular datasets and application scenarios. We conclude by envisioning prospects for LLM-based agents, considering the evolving landscape of AI and natural language processing.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2401.03428

Country:

Asia (0.92)
North America > United States (0.67)
Europe > United Kingdom > England (0.14)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)
Research Report > Experimental Study (0.67)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Law (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(8 more...)

Add feedback

NestE: Modeling Nested Relational Structures for Knowledge Graph Reasoning

Xiong, Bo, Nayyeri, Mojtaba, Luo, Linhao, Wang, Zihao, Pan, Shirui, Staab, Steffen

arXiv.org Artificial IntelligenceDec-14-2023

Reasoning with knowledge graphs (KGs) has primarily focused on triple-shaped facts. Recent advancements have been explored to enhance the semantics of these facts by incorporating more potent representations, such as hyper-relational facts. However, these approaches are limited to \emph{atomic facts}, which describe a single piece of information. This paper extends beyond \emph{atomic facts} and delves into \emph{nested facts}, represented by quoted triples where subjects and objects are triples themselves (e.g., ((\emph{BarackObama}, \emph{holds\_position}, \emph{President}), \emph{succeed\_by}, (\emph{DonaldTrump}, \emph{holds\_position}, \emph{President}))). These nested facts enable the expression of complex semantics like \emph{situations} over time and \emph{logical patterns} over entities and relations. In response, we introduce NestE, a novel KG embedding approach that captures the semantics of both atomic and nested factual knowledge. NestE represents each atomic fact as a $1\times3$ matrix, and each nested relation is modeled as a $3\times3$ matrix that rotates the $1\times3$ atomic fact matrix through matrix multiplication. Each element of the matrix is represented as a complex number in the generalized 4D hypercomplex space, including (spherical) quaternions, hyperbolic quaternions, and split-quaternions. Through thorough analysis, we demonstrate the embedding's efficacy in capturing diverse logical patterns over nested facts, surpassing the confines of first-order logic-like expressions. Our experimental results showcase NestE's significant performance gains over current baselines in triple prediction and conditional link prediction. The code and pre-trained models are open available at https://github.com/xiongbo010/NestE.

artificial intelligence, nested fact, relation, (16 more...)

arXiv.org Artificial Intelligence

2312.09219

Country:

North America > United States (1.00)
Europe > United Kingdom > England (0.28)

Genre: Research Report (0.82)

Industry:

Media > Music (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Leisure & Entertainment > Sports > Soccer (0.68)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.62)

Add feedback

Towards out-of-distribution generalizable predictions of chemical kinetics properties

Wang, Zihao, Chen, Yongqiang, Duan, Yang, Li, Weijiang, Han, Bo, Cheng, James, Tong, Hanghang

arXiv.org Artificial IntelligenceDec-4-2023

Machine Learning (ML) techniques have found applications in estimating chemical kinetic properties. With the accumulated drug molecules identified through "AI4drug discovery", the next imperative lies in AI-driven design for high-throughput chemical synthesis processes, with the estimation of properties of unseen reactions with unexplored molecules. To this end, the existing ML approaches for kinetics property prediction are required to be Out-Of-Distribution (OOD) generalizable. In this paper, we categorize the OOD kinetic property prediction into three levels (structure, condition, and mechanism), revealing unique aspects of such problems. Under this framework, we create comprehensive datasets to benchmark (1) the state-of-the-art ML approaches for reaction prediction in the OOD setting and (2) the state-of-the-art graph OOD methods in kinetics property prediction problems. Our results demonstrated the challenges and opportunities in OOD kinetics property prediction. Our datasets and benchmarks can further support research in this direction.

artificial intelligence, machine learning, reaction, (15 more...)

arXiv.org Artificial Intelligence

2310.03152

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models

Wang, Zihao, Cai, Shaofei, Liu, Anji, Jin, Yonggang, Hou, Jinbing, Zhang, Bowei, Lin, Haowei, He, Zhaofeng, Zheng, Zilong, Yang, Yaodong, Ma, Xiaojian, Liang, Yitao

arXiv.org Artificial IntelligenceNov-30-2023

Achieving human-like planning and control with multimodal observations in an open world is a key milestone for more functional generalist agents. Existing approaches can handle certain long-horizon tasks in an open world. However, they still struggle when the number of open-world tasks could potentially be infinite and lack the capability to progressively enhance task completion as game time progresses. We introduce JARVIS-1, an open-world agent that can perceive multimodal input (visual observations and human instructions), generate sophisticated plans, and perform embodied control, all within the popular yet challenging open-world Minecraft universe. Specifically, we develop JARVIS-1 on top of pre-trained multimodal language models, which map visual observations and textual instructions to plans. The plans will be ultimately dispatched to the goal-conditioned controllers. We outfit JARVIS-1 with a multimodal memory, which facilitates planning using both pre-trained knowledge and its actual game survival experiences. JARVIS-1 is the existing most general agent in Minecraft, capable of completing over 200 different tasks using control and observation space similar to humans. These tasks range from short-horizon tasks, e.g., "chopping trees" to long-horizon tasks, e.g., "obtaining a diamond pickaxe". JARVIS-1 performs exceptionally well in short-horizon tasks, achieving nearly perfect performance. In the classic long-term task of $\texttt{ObtainDiamondPickaxe}$, JARVIS-1 surpasses the reliability of current state-of-the-art agents by 5 times and can successfully complete longer-horizon and more challenging tasks. The project page is available at https://craftjarvis.org/JARVIS-1

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2311.05997

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Industry:

Materials > Metals & Mining (0.93)
Leisure & Entertainment > Games (0.73)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

GROOT: Learning to Follow Instructions by Watching Gameplay Videos

Cai, Shaofei, Zhang, Bowei, Wang, Zihao, Ma, Xiaojian, Liu, Anji, Liang, Yitao

arXiv.org Artificial IntelligenceNov-28-2023

We study the problem of building a controller that can follow open-ended instructions in open-world environments. We propose to follow reference videos as instructions, which offer expressive goal specifications while eliminating the need for expensive text-gameplay annotations. A new learning framework is derived to allow learning such instruction-following controllers from gameplay videos while producing a video instruction encoder that induces a structured goal space. We implement our agent GROOT in a simple yet effective encoder-decoder architecture based on causal transformers. We evaluate GROOT against open-world counterparts and human players on a proposed Minecraft SkillForge benchmark. The Elo ratings clearly show that GROOT is closing the human-machine gap as well as exhibiting a 70% winning rate over the best generalist agent baseline. Qualitative analysis of the induced goal space further demonstrates some interesting emergent properties, including the goal composition and complex gameplay behavior synthesis. The project page is available at https://craftjarvis-groot.github.io.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2310.08235

Genre: Research Report (1.00)

Industry:

Materials > Metals & Mining (1.00)
Leisure & Entertainment > Games > Computer Games (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Robots (0.93)

Add feedback

The Local Landscape of Phase Retrieval Under Limited Samples

Liu, Kaizhao, Wang, Zihao, Wu, Lei

arXiv.org Machine LearningNov-26-2023

In this paper, we provide a fine-grained analysis of the local landscape of phase retrieval under the regime with limited samples. Our aim is to ascertain the minimal sample size necessary to guarantee a benign local landscape surrounding global minima in high dimensions. Let $n$ and $d$ denote the sample size and input dimension, respectively. We first explore the local convexity and establish that when $n=o(d\log d)$, for almost every fixed point in the local ball, the Hessian matrix must have negative eigenvalues as long as $d$ is sufficiently large. Consequently, the local landscape is highly non-convex. We next consider the one-point strong convexity and show that as long as $n=\omega(d)$, with high probability, the landscape is one-point strongly convex in the local annulus: $\{w\in\mathbb{R}^d: o_d(1)\leqslant \|w-w^*\|\leqslant c\}$, where $w^*$ is the ground truth and $c$ is an absolute constant. This implies that gradient descent initialized from any point in this domain can converge to an $o_d(1)$-loss solution exponentially fast. Furthermore, we show that when $n=o(d\log d)$, there is a radius of $\widetilde\Theta\left(\sqrt{1/d}\right)$ such that one-point convexity breaks in the corresponding smaller local ball. This indicates an impossibility to establish a convergence to exact $w^*$ for gradient descent under limited samples by relying solely on one-point convexity.

artificial intelligence, landscape, machine learning, (17 more...)

arXiv.org Machine Learning

2311.15221

Country: Asia (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)

Add feedback