Wang, Zhecheng
AeroHaptix: A Wearable Vibrotactile Feedback System for Enhancing Collision Avoidance in UAV Teleoperation
Huang, Bingjian, Wang, Zhecheng, Cheng, Qilong, Ren, Siyi, Cai, Hanfeng, Valdivia, Antonio Alvarez, Mahadevan, Karthik, Wigdor, Daniel
Haptic feedback enhances collision avoidance by providing directional obstacle information to operators in unmanned aerial vehicle (UAV) teleoperation. However, such feedback is often rendered via haptic joysticks, which are unfamiliar to UAV operators and limited to single-directional force feedback. Additionally, the direct coupling of the input device and the feedback method diminishes the operators' control authority and causes oscillatory movements. To overcome these limitations, we propose AeroHaptix, a wearable haptic feedback system that uses high-resolution vibrations to communicate multiple obstacle directions simultaneously. The vibrotactile actuators' layout was optimized based on a perceptual study to eliminate perceptual biases and achieve uniform spatial coverage. A novel rendering algorithm, MultiCBF, was adapted from control barrier functions to support multi-directional feedback. System evaluation showed that AeroHaptix effectively reduced collisions in complex environments, and operators reported significantly lower physical workload, improved situational awareness, and increased control authority.
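The core idea of communicating several obstacle directions at once through a body-worn actuator array can be illustrated with a toy sketch. This is not the paper's MultiCBF algorithm; the function name, the distance-based urgency term, and the cosine projection onto actuator directions are all illustrative assumptions.

```python
import numpy as np

def vibration_intensities(obstacle_dirs, distances, actuator_dirs, d_safe=1.0):
    """Map multiple obstacle directions to per-actuator vibration intensities.

    Hypothetical sketch: each obstacle contributes an 'urgency' that grows as
    its distance falls below d_safe, and that urgency is projected onto the
    actuators whose placement direction aligns with the obstacle direction.
    """
    intensities = np.zeros(len(actuator_dirs))
    for u, d in zip(obstacle_dirs, distances):
        urgency = max(0.0, (d_safe - d) / d_safe)   # 0 when far, 1 at contact
        for i, a in enumerate(actuator_dirs):
            align = max(0.0, float(np.dot(u, a)))   # clipped cosine alignment
            # Each actuator renders the most urgent aligned obstacle.
            intensities[i] = max(intensities[i], urgency * align)
    return intensities
```

With four actuators at the cardinal directions, an obstacle ahead at half the safety distance would drive only the front actuator, at half intensity.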
SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing
Wang, Zhecheng, Prabha, Rajanie, Huang, Tianyuan, Wu, Jiajun, Rajagopal, Ram
Remote sensing imagery, despite its broad applications in helping achieve Sustainable Development Goals and tackle climate change, has not yet benefited from the recent advancements of versatile, task-agnostic vision-language models (VLMs). A key reason is that the large-scale, semantically diverse image-text dataset required for developing VLMs is still absent for remote sensing images. Unlike natural images, remote sensing images and their associated text descriptions cannot be efficiently collected from the public Internet at scale. In this work, we bridge this gap by using geo-coordinates to automatically connect open, unlabeled remote sensing images with rich semantics covered in OpenStreetMap, and thus construct SkyScript, a comprehensive vision-language dataset for remote sensing images, comprising 2.6 million image-text pairs covering 29K distinct semantic tags. With continual pre-training on this dataset, we obtain a VLM that surpasses baseline models with a 6.2% average accuracy gain in zero-shot scene classification across seven benchmark datasets. It also demonstrates the ability of zero-shot transfer for fine-grained object attribute classification and cross-modal retrieval. We hope this dataset can support the advancement of VLMs for various multi-modal tasks in remote sensing, such as open-vocabulary classification, retrieval, captioning, and text-to-image synthesis.
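The zero-shot scene classification mentioned above follows the usual CLIP-style recipe: embed the image and one text prompt per class, then pick the class with the highest cosine similarity. A minimal sketch, with plain arrays standing in for the outputs of the paper's image and text encoders:

```python
import numpy as np

def zero_shot_classify(image_emb, class_text_embs):
    """Return the index of the class whose text embedding is most similar
    to the image embedding (cosine similarity after L2 normalization).

    The embeddings are placeholders for encoder outputs; no real VLM is used.
    """
    img = image_emb / np.linalg.norm(image_emb)
    txt = class_text_embs / np.linalg.norm(class_text_embs, axis=1, keepdims=True)
    sims = txt @ img                  # one cosine similarity per class
    return int(np.argmax(sims))
```

Because no class-specific training is needed, adding a new scene category only requires adding a new text prompt, which is what makes the approach "zero-shot" and open-vocabulary.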
Image Generation With Neural Cellular Automatas
Chen, Mingxiang, Wang, Zhecheng
In this paper, we propose a novel approach to generate images (or other artworks) using neural cellular automata (NCAs). Rather than training NCAs on single images one by one, we combine the idea with variational autoencoders (VAEs) and explore applications such as image restoration and style fusion. The code for the model implementation is available online.
Predicting Geographic Information with Neural Cellular Automata
Chen, Mingxiang, Chen, Qichang, Gao, Lei, Chen, Yilin, Wang, Zhecheng
Cellular automata (CA) is a widely used modeling theory. From the perspective of physics, a CA is a dynamic system defined in a cell space composed of cells with discrete and finite states, which evolves in discrete time steps according to certain local rules. Cells are the most basic components of a CA and are distributed at discrete positions in Euclidean space. Each cell in the lattice grid takes a state from a finite set of discrete states, follows the same local rules of action, and updates simultaneously according to those rules. The other cells within the local space that may interact under the rules are defined as the cell's "neighborhood". While the evolution of each cell takes place based only on local information, the interactions among a large number of cells drive the evolution of the entire dynamic system and hence produce a global dynamic effect. CAs are not determined by strictly defined equations or functions, but are constituted by a series of rules for constructing models.
(Figure 1: Von Neumann neighborhood (red) and Moore neighborhood (blue).)
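The local-rule, synchronous-update mechanism described above is easy to make concrete with the best-known CA, Conway's Game of Life, which uses the Moore neighborhood (the 8 surrounding cells). This is a generic illustration, not the paper's neural CA:

```python
import numpy as np

def life_step(grid):
    """One synchronous update of Conway's Game of Life on a toroidal grid.

    Every cell counts the live cells in its Moore neighborhood and applies
    the same local rule simultaneously, exactly as described for CAs above.
    """
    # Sum the 8 shifted copies of the grid to get neighbor counts per cell.
    n = sum(np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            if (dy, dx) != (0, 0))
    # Birth: dead cell with exactly 3 neighbors; survival: live cell with 2-3.
    return ((n == 3) | ((grid == 1) & (n == 2))).astype(np.uint8)
```

A horizontal "blinker" of three live cells, for example, flips to vertical after one step and back to horizontal after two, a purely local rule producing a global oscillation.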
Machine Learning for AC Optimal Power Flow
Guha, Neel, Wang, Zhecheng, Wytock, Matt, Majumdar, Arun
We explore machine learning methods for AC Optimal Power Flow (ACOPF) - the task of optimizing power generation in a transmission network while respecting physical and engineering constraints. We present two formulations of ACOPF as a machine learning problem: 1) an end-to-end prediction task where we directly predict the optimal generator settings, and 2) a constraint prediction task where we predict the set of active constraints in the optimal solution.
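The end-to-end formulation (task 1) amounts to learning a map from network conditions to optimal generator set-points. A toy sketch with synthetic data, where a linear least-squares fit stands in for the paper's learned model and the "optimal dispatch" labels are generated from a hidden linear relation rather than a real ACOPF solver:

```python
import numpy as np

# Synthetic data: 200 load profiles over 4 buses, 2 generators.
rng = np.random.default_rng(0)
loads = rng.uniform(0.5, 1.5, size=(200, 4))      # input: bus load demands
true_map = rng.uniform(0.0, 1.0, size=(4, 2))     # hidden load -> dispatch relation
gen_settings = loads @ true_map                   # labels: "optimal" generator settings

# End-to-end prediction: fit a map from loads directly to generator settings.
W, *_ = np.linalg.lstsq(loads, gen_settings, rcond=None)
pred = loads @ W                                  # predicted generator set-points
```

In the constraint prediction formulation (task 2), the output would instead be a binary label per constraint indicating whether it is active at the optimum, after which a much smaller reduced problem can be solved exactly.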