Zhou, Zhiyuan
AutoEval: Autonomous Evaluation of Generalist Robot Manipulation Policies in the Real World
Zhou, Zhiyuan, Atreya, Pranav, Tan, You Liang, Pertsch, Karl, Levine, Sergey
Scalable and reproducible policy evaluation has been a long-standing challenge in robot learning. Evaluations are critical for assessing progress and building better policies, but real-world evaluation, especially at a scale that would provide statistically reliable results, is costly in human time and hard to obtain. Evaluating increasingly generalist robot policies requires an increasingly diverse repertoire of evaluation environments, making the evaluation bottleneck even more pronounced. To make real-world evaluation of robotic policies more practical, we propose AutoEval, a system to autonomously evaluate generalist robot policies around the clock with minimal human intervention. Users interact with AutoEval by submitting evaluation jobs to the AutoEval queue, much as software jobs are submitted to a cluster scheduler, and AutoEval schedules the policies for evaluation within a framework that supplies automatic success detection and automatic scene resets. We show that AutoEval can nearly fully eliminate human involvement in the evaluation process, permitting around-the-clock evaluation, and that its results correspond closely to ground-truth evaluations conducted by hand. To facilitate the evaluation of generalist policies in the robotics community, we provide public access to multiple AutoEval scenes in the popular BridgeData robot setup with WidowX robot arms. In the future, we hope that AutoEval scenes can be set up across institutions to form a diverse and distributed evaluation network.
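To make the "evaluation as a job queue" workflow concrete, here is a minimal sketch of what a client-side interface of this kind could look like. All names (EvalJob, EvalQueue, the helper functions, and the example policy URL) are hypothetical placeholders, not AutoEval's actual API.

```python
# Hypothetical sketch of a job-queue style evaluation workflow:
# users submit jobs; the scheduler runs rollouts, detects success, resets the scene.
from collections import deque
from dataclasses import dataclass, field


@dataclass
class EvalJob:
    policy_endpoint: str    # where the evaluator queries the policy for actions
    scene: str              # which evaluation scene/task to run
    num_episodes: int = 50  # enough rollouts for a statistically reliable estimate


def run_episode_and_detect_success(job: EvalJob) -> bool:
    """Placeholder for a real rollout plus an automatic success detector."""
    return False


def reset_scene(scene: str) -> None:
    """Placeholder for the automated scene-reset routine."""


@dataclass
class EvalQueue:
    jobs: deque = field(default_factory=deque)

    def submit(self, job: EvalJob) -> None:
        self.jobs.append(job)  # jobs wait here until the robot cell is free

    def run_next(self) -> None:
        job = self.jobs.popleft()
        successes = 0
        for _ in range(job.num_episodes):
            successes += run_episode_and_detect_success(job)
            reset_scene(job.scene)  # automatic reset between episodes
        print(f"{job.scene}: success rate {successes / job.num_episodes:.2f}")


queue = EvalQueue()
queue.submit(EvalJob(policy_endpoint="http://my-policy-server/act", scene="drawer_open"))
queue.run_next()
```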
SafeDrive: Knowledge- and Data-Driven Risk-Sensitive Decision-Making for Autonomous Vehicles with Large Language Models
Zhou, Zhiyuan, Huang, Heye, Li, Boqi, Zhao, Shiyue, Mu, Yao, Wang, Jianqiang
Recent advances in autonomous vehicles (AVs) leverage Large Language Models (LLMs) to perform well in normal driving scenarios. However, ensuring safety in dynamic, high-risk environments and managing safety-critical long-tail events remain significant challenges. To address these issues, we propose SafeDrive, a knowledge- and data-driven risk-sensitive decision-making framework to enhance AV safety and adaptability. The proposed framework introduces a modular system comprising: (1) a Risk Module for quantifying multi-factor coupled risks involving driver, vehicle, and road interactions; (2) a Memory Module for storing and retrieving typical scenarios to improve adaptability; (3) an LLM-powered Reasoning Module for context-aware safety decision-making; and (4) a Reflection Module for refining decisions through iterative learning. By integrating knowledge-driven insights with adaptive learning mechanisms, the framework ensures robust decision-making under uncertain conditions. Extensive evaluations on real-world traffic datasets, including highways (HighD), intersections (InD), and roundabouts (RounD), validate the framework's ability to enhance decision-making safety (achieving a 100% safety rate), replicate human-like driving behaviors (with decision alignment exceeding 85%), and adapt effectively to unpredictable scenarios. SafeDrive establishes a novel paradigm for integrating knowledge- and data-driven methods, highlighting significant potential to improve the safety and adaptability of autonomous driving in high-risk traffic scenarios. Project Page: https://mezzi33.github.io/SafeDrive/
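The four modules above compose into a single decision step roughly as sketched below. Every name, the toy risk formula, and the decision rule are illustrative stand-ins for the paper's actual Risk/Memory/Reasoning/Reflection modules, not its implementation.

```python
# Illustrative composition of a risk -> memory -> reasoning -> reflection pipeline.
from dataclasses import dataclass


@dataclass
class Scenario:
    ttc: float        # time-to-collision with the lead vehicle (s)
    headway: float    # distance headway (m)
    ego_speed: float  # ego speed (m/s)


def risk_module(s: Scenario) -> float:
    """Toy coupled risk score in [0, 1]; higher means riskier."""
    return min(1.0, 1.0 / max(s.ttc, 1e-3) + 0.1 * s.ego_speed / max(s.headway, 1e-3))


def memory_module(s: Scenario, memory: list) -> list:
    """Retrieve decisions made in the most similar stored scenarios."""
    ranked = sorted(memory, key=lambda item: abs(item[0].ttc - s.ttc))
    return [decision for _, decision in ranked[:3]]


def reasoning_module(s: Scenario, risk: float, precedents: list) -> str:
    """Stand-in for the LLM call; a real prompt would include s, risk, and precedents."""
    return "decelerate" if risk > 0.5 else "keep_lane"


def reflection_module(s: Scenario, decision: str, outcome_safe: bool, memory: list) -> None:
    """Keep only decisions that turned out to be safe, for future retrieval."""
    if outcome_safe:
        memory.append((s, decision))


memory: list = []
scenario = Scenario(ttc=2.0, headway=15.0, ego_speed=20.0)
risk = risk_module(scenario)
decision = reasoning_module(scenario, risk, memory_module(scenario, memory))
reflection_module(scenario, decision, outcome_safe=True, memory=memory)
print(risk, decision)
```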
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data
Zhou, Zhiyuan, Peng, Andy, Li, Qiyang, Levine, Sergey, Kumar, Aviral
The predominant paradigm for learning at scale today involves pre-training models on diverse prior data, and then fine-tuning them on narrower domain-specific data to specialize them to particular downstream tasks [7, 4, 9, 37, 55, 50, 59]. In the context of learning decision-making policies, this paradigm translates to pre-training on a large amount of previously collected static experience via offline reinforcement learning (RL), followed by fine-tuning these initializations efficiently via online RL. Generally, this fine-tuning is done by continuing training with the very same offline RL algorithm, e.g., pessimistic [28, 6] algorithms or algorithms that apply behavioral constraints [14, 27], on a mixture of offline data and autonomous online data, with minor modifications to the offline RL algorithm itself [33]. While this paradigm has led to promising results [27, 33], RL fine-tuning requires continued training on offline data for stability and performance ([56, 57]; Section 3), as opposed to the standard practice in machine learning. Retaining offline data is problematic for several reasons. First, as offline datasets grow in size and diversity, continued online training on offline data becomes inefficient and expensive, and such computation requirements may even deter practitioners from using online RL for fine-tuning. Second, the need for retaining offline data perhaps defeats the point of offline RL pre-training altogether: recent results [47], corroborated by our experiments in Section 3, indicate that current fine-tuning approaches are not able to make good use of several strong offline RL value and/or policy initializations, as shown by the superior performance of running online RL from scratch with offline data placed in the replay buffer [3]. These problems put the efficacy of current RL fine-tuning approaches into question. In this paper, we aim to understand and address the aforementioned shortcomings of current online fine-tuning methods and build an online RL approach that does not retain offline data.
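The two fine-tuning recipes contrasted above differ mainly in what the replay buffer samples from at each update. The sketch below illustrates that difference under hypothetical agent/env interfaces; it is not the paper's code, and the batch sizes and mixing ratio are placeholders.

```python
# Sketch: fine-tuning with vs. without retaining offline data in the replay mixture.
import random
from collections import deque


def finetune(agent, env, offline_data, num_steps, retain_offline_data=True):
    online_buffer = deque(maxlen=1_000_000)
    obs = env.reset()
    for _ in range(num_steps):
        action = agent.act(obs)
        next_obs, reward, done, _ = env.step(action)
        online_buffer.append((obs, action, reward, next_obs, done))
        obs = env.reset() if done else next_obs

        online = list(online_buffer)
        if retain_offline_data:
            # Standard recipe: each update samples a mixture (e.g., 50/50) of
            # offline and online transitions, so per-step cost grows with dataset size.
            batch = (random.sample(offline_data, min(128, len(offline_data)))
                     + random.sample(online, min(128, len(online))))
        else:
            # No-retention recipe: the offline dataset is used only to produce the
            # initialization; gradient updates use freshly collected online data alone.
            batch = random.sample(online, min(256, len(online)))
        agent.update(batch)
```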
Proprioceptive State Estimation for Amphibious Tactile Sensing
Guo, Ning, Han, Xudong, Zhong, Shuqiao, Zhou, Zhiyuan, Lin, Jian, Dai, Jian S., Wan, Fang, Song, Chaoyang
This paper presents a novel vision-based proprioception approach for a soft robotic finger capable of estimating and reconstructing tactile interactions in terrestrial and aquatic environments. The key to this system lies in the finger's unique metamaterial structure, which facilitates omni-directional passive adaptation during grasping, protecting delicate objects across diverse scenarios. A compact in-finger camera captures high-framerate images of the finger's deformation during contact, extracting crucial tactile data in real time. We present a volumetric discretized model of the soft finger and use the geometric constraints captured by the camera to find the optimal estimate of the deformed shape. The approach is benchmarked against a motion-tracking system with sparse markers and a haptic device with dense measurements. Both results show state-of-the-art accuracy, with a median error of 1.96 mm for overall body deformation, corresponding to 2.1% of the finger's length. More importantly, the state estimation is robust in both on-land and underwater environments, as we demonstrate through its use for underwater object shape sensing. This combination of passive adaptation and real-time tactile sensing paves the way for amphibious robotic grasping applications.
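One common way to pose "fit a discretized deformation model to sparse geometric observations" is a regularized least-squares problem: a smoothness prior over node displacements plus penalties on matching the observed points. The toy 1-D example below shows that general technique; it is only an illustration under assumed data, not the paper's 3-D volumetric formulation or camera pipeline.

```python
# Toy deformation estimation: chain of nodes, Laplacian smoothness prior,
# a few observed node displacements; solve the normal equations.
import numpy as np

n = 20                                   # number of nodes along the finger
L = np.zeros((n - 2, n))                 # discrete Laplacian (smoothness prior)
for i in range(n - 2):
    L[i, i:i + 3] = [1.0, -2.0, 1.0]

observed = {0: 0.0, 7: 1.2, 19: 3.5}     # node index -> measured displacement (mm), assumed data
C = np.zeros((len(observed), n))         # observation (selection) matrix
d = np.zeros(len(observed))
for row, (idx, val) in enumerate(observed.items()):
    C[row, idx] = 1.0
    d[row] = val

lam = 10.0                               # weight on matching the observations
# Minimize ||L x||^2 + lam * ||C x - d||^2  ->  (L^T L + lam C^T C) x = lam C^T d
A = L.T @ L + lam * (C.T @ C)
b = lam * (C.T @ d)
x = np.linalg.solve(A, b)                # estimated displacement of every node
print(np.round(x, 2))
```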
Autoencoding a Soft Touch to Learn Grasping from On-land to Underwater
Guo, Ning, Han, Xudong, Liu, Xiaobo, Zhong, Shuqiao, Zhou, Zhiyuan, Lin, Jian, Dai, Jiansheng, Wan, Fang, Song, Chaoyang
Robots play a critical role as the physical agents of human operators in exploring the ocean. However, it remains challenging to grasp objects reliably while fully submerged in a highly pressurized aquatic environment with little visible light, mainly due to the fluidic interference on the tactile mechanics between the finger and object surfaces. This study investigates the transferability of grasping knowledge from on-land to underwater settings via a vision-based soft robotic finger that learns 6D forces and torques (FT) using a Supervised Variational Autoencoder (SVAE). A high-framerate camera captures the whole-body deformations while the soft robotic finger interacts with physical objects on land and underwater. Results show that the trained SVAE model learned a series of latent representations of the soft mechanics transferable from land to water, showing superior adaptation to changing environments compared with commercial FT sensors. Soft, delicate, and reactive grasping enabled by tactile intelligence enhances the gripper's underwater interaction with improved reliability and robustness at a much-reduced cost, paving the path for learning-based intelligent grasping to support fundamental scientific discoveries in environmental and ocean research.
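A supervised variational autoencoder of the kind described combines an image-reconstruction term, a KL regularizer on the latent code, and a supervised head that regresses the 6D force/torque label. The PyTorch sketch below shows that objective in its simplest form; the layer sizes, loss weights, and fully connected architecture are placeholders, not the paper's model.

```python
# Minimal supervised-VAE objective: reconstruction + KL + 6D force/torque regression.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SVAE(nn.Module):
    def __init__(self, img_dim=64 * 64, latent_dim=32):
        super().__init__()
        self.enc = nn.Linear(img_dim, 256)
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, img_dim))
        self.ft_head = nn.Linear(latent_dim, 6)   # predicts [Fx, Fy, Fz, Tx, Ty, Tz]

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization trick
        return self.dec(z), self.ft_head(z), mu, logvar

def svae_loss(model, img, ft_label, beta=1e-3, alpha=1.0):
    recon, ft_pred, mu, logvar = model(img)
    recon_loss = F.mse_loss(recon, img)                                   # reconstruct the deformation image
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())         # latent regularizer
    supervised = F.mse_loss(ft_pred, ft_label)                            # 6D FT supervision
    return recon_loss + beta * kl + alpha * supervised

model = SVAE()
img = torch.rand(8, 64 * 64)   # batch of flattened deformation images (dummy data)
ft = torch.randn(8, 6)         # force/torque labels from a reference sensor (dummy data)
loss = svae_loss(model, img, ft)
loss.backward()
```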
Specifying Behavior Preference with Tiered Reward Functions
Zhou, Zhiyuan, Sowerby, Henry, Littman, Michael L.
Reinforcement-learning agents seek to maximize a reward signal through environmental interactions. As humans, we contribute to the learning process by designing the reward function. Like programmers, we have a behavior in mind and have to translate it into a formal specification, namely rewards. In this work, we consider the reward-design problem in tasks formulated as reaching desirable states and avoiding undesirable states. To start, we propose a strict partial ordering of the policy space: we prefer policies that reach the good states faster and with higher probability while avoiding the bad states longer. Next, we propose an environment-independent tiered reward structure and show that it is guaranteed to induce policies that are Pareto-optimal according to our preference relation. Finally, we empirically evaluate tiered reward functions on several environments and show they induce desired behavior and lead to fast learning.
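As a toy illustration of the tiered idea, the sketch below assigns every state to a tier and gives each tier a fixed reward, with bad states worst and goal states best. The gridworld, tier assignment, and the specific numeric values are illustrative assumptions, not the paper's prescribed construction.

```python
# Toy tiered reward: the reward depends only on which tier a state belongs to.
TIER_REWARD = {
    "bad": -100.0,    # undesirable states (e.g., a pit)
    "neutral": -1.0,  # ordinary states; small penalty encourages reaching the goal quickly
    "goal": 0.0,      # desirable absorbing states
}

def tier_of(state: tuple) -> str:
    """Hypothetical gridworld: (0, 0) is a pit, (4, 4) is the goal."""
    if state == (0, 0):
        return "bad"
    if state == (4, 4):
        return "goal"
    return "neutral"

def reward(state: tuple) -> float:
    return TIER_REWARD[tier_of(state)]

# Tiers are strictly ordered: goal states beat neutral states, which beat bad states.
assert reward((4, 4)) > reward((2, 3)) > reward((0, 0))
```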
Generalized-TODIM Method for Multi-criteria Decision Making with Basic Uncertain Information and its Application
Zhou, Zhiyuan, Xuan, Kai, Tao, Zhifu, Zhou, Ligang
Because basic uncertain information provides a simple form for decision information with a certainty degree, it has been developed to reflect the quality of observed or subjective assessments. In order to study the algebraic structure and preference relation of basic uncertain information, we develop several algebraic operations for basic uncertain information. The order relation of this type of information is also considered. Finally, to apply the developed algebraic operations and order relations, a generalized TODIM method for multi-attribute decision making with basic uncertain information is given. A numerical example shows that the developed decision procedure is valid.
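For background, the sketch below computes one standard formulation of the classical TODIM dominance and ranking on a crisp, normalized decision matrix with assumed example data; the paper's generalization to basic uncertain information and its algebraic operations is not reproduced here.

```python
# Classical TODIM on a crisp decision matrix (background for the generalized method).
import numpy as np

X = np.array([[0.7, 0.4, 0.9],     # alternative scores (rows) per criterion (columns), assumed data
              [0.5, 0.8, 0.6],
              [0.9, 0.3, 0.5]])
w = np.array([0.5, 0.3, 0.2])      # criterion weights (assumed)
theta = 1.0                        # attenuation factor for losses

w_r = w / w.max()                  # weights relative to the reference (largest-weight) criterion
W = w_r.sum()
m, n = X.shape

delta = np.zeros((m, m))           # overall dominance of alternative i over j
for i in range(m):
    for j in range(m):
        for c in range(n):
            diff = X[i, c] - X[j, c]
            if diff > 0:
                delta[i, j] += np.sqrt(w_r[c] * diff / W)               # gain
            elif diff < 0:
                delta[i, j] -= np.sqrt(W * (-diff) / w_r[c]) / theta    # attenuated loss

scores = delta.sum(axis=1)
xi = (scores - scores.min()) / (scores.max() - scores.min())   # global values in [0, 1]
print(np.argsort(-xi))             # ranking of alternatives, best first
```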