AITopics | Wang, Haoyang

Collaborating Authors

Wang, Haoyang

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Towards Mobile Sensing with Event Cameras on High-agility Resource-constrained Devices: A Survey

Wang, Haoyang, Guo, Ruishan, Ma, Pengtao, Ruan, Ciyu, Luo, Xinyu, Ding, Wenhua, Zhong, Tianyang, Xu, Jingao, Liu, Yunhao, Chen, Xinlei

arXiv.org Artificial IntelligenceApr-3-2025

With the increasing complexity of mobile device applications, these devices are evolving toward high agility. This shift imposes new demands on mobile sensing, particularly in terms of achieving high accuracy and low latency. Event-based vision has emerged as a disruptive paradigm, offering high temporal resolution, low latency, and energy efficiency, making it well-suited for high-accuracy and low-latency sensing tasks on high-agility platforms. However, the presence of substantial noisy events, the lack of inherent semantic information, and the large data volume pose significant challenges for event-based data processing on resource-constrained mobile devices. This paper surveys the literature over the period 2014-2024, provides a comprehensive overview of event-based mobile sensing systems, covering fundamental principles, event abstraction methods, algorithmic advancements, hardware and software acceleration strategies. We also discuss key applications of event cameras in mobile sensing, including visual odometry, object tracking, optical flow estimation, and 3D reconstruction, while highlighting the challenges associated with event data processing, sensor fusion, and real-time deployment. Furthermore, we outline future research directions, such as improving event camera hardware with advanced optics, leveraging neuromorphic computing for efficient processing, and integrating bio-inspired algorithms to enhance perception. To support ongoing research, we provide an open-source \textit{Online Sheet} with curated resources and recent developments. We hope this survey serves as a valuable reference, facilitating the adoption of event-based vision across diverse applications.

event camera, machine learning, real time system, (24 more...)

arXiv.org Artificial Intelligence

2503.22943

Country:

Europe (0.92)
Asia > China (0.28)

Genre: Overview (1.00)

Industry:

Semiconductors & Electronics (1.00)
Information Technology > Software (0.54)
Transportation > Ground (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Hardware (1.00)
Information Technology > Communications > Networks (1.00)
(6 more...)

Add feedback

PromptMobile: Efficient Promptus for Low Bandwidth Mobile Video Streaming

Liu, Liming, Wu, Jiangkai, Wang, Haoyang, Wang, Peiheng, Zhang, Xinggong, Guo, Zongming

arXiv.org Artificial IntelligenceMar-20-2025

Traditional video compression algorithms exhibit significant quality degradation at extremely low bitrates. Promptus emerges as a new paradigm for video streaming, substantially cutting down the bandwidth essential for video streaming. However, Promptus is computationally intensive and can not run in real-time on mobile devices. This paper presents PromptMobile, an efficient acceleration framework tailored for on-device Promptus. Specifically, we propose (1) a two-stage efficient generation framework to reduce computational cost by 8.1x, (2) a fine-grained inter-frame caching to reduce redundant computations by 16.6\%, (3) system-level optimizations to further enhance efficiency. The evaluations demonstrate that compared with the original Promptus, PromptMobile achieves a 13.6x increase in image generation speed. Compared with other streaming methods, PromptMobile achives an average LPIPS improvement of 0.016 (compared with H.265), reducing 60\% of severely distorted frames (compared to VQGAN).

artificial intelligence, machine learning, promptus, (18 more...)

arXiv.org Artificial Intelligence

2503.16112

Country:

Asia > China (0.16)
Europe > Netherlands (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Adaptive Anomaly Recovery for Telemanipulation: A Diffusion Model Approach to Vision-Based Tracking

Wang, Haoyang, Guo, Haoran, Tao, Lingfeng, Li, Zhengxiong

arXiv.org Artificial IntelligenceMar-11-2025

Dexterous telemanipulation critically relies on the continuous and stable tracking of the human operator's commands to ensure robust operation. Vison-based tracking methods are widely used but have low stability due to anomalies such as occlusions, inadequate lighting, and loss of sight. Traditional filtering, regression, and interpolation methods are commonly used to compensate for explicit information such as angles and positions. These approaches are restricted to low-dimensional data and often result in information loss compared to the original high-dimensional image and video data. Recent advances in diffusion-based approaches, which can operate on high-dimensional data, have achieved remarkable success in video reconstruction and generation. However, these methods have not been fully explored in continuous control tasks in robotics. This work introduces the Diffusion-Enhanced Telemanipulation (DET) framework, which incorporates the Frame-Difference Detection (FDD) technique to identify and segment anomalies in video streams. These anomalous clips are replaced after reconstruction using diffusion models, ensuring robust telemanipulation performance under challenging visual conditions. We validated this approach in various anomaly scenarios and compared it with the baseline methods. Experiments show that DET achieves an average RMSE reduction of 17.2% compared to the cubic spline and 51.1% compared to FFT-based interpolation for different occlusion durations.

artificial intelligence, machine learning, reconstruction, (18 more...)

arXiv.org Artificial Intelligence

2503.09632

Country: North America > United States > Colorado > Denver County > Denver (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Bio-Skin: A Cost-Effective Thermostatic Tactile Sensor with Multi-Modal Force and Temperature Detection

Guo, Haoran, Wang, Haoyang, Li, Zhengxiong, Tao, Lingfeng

arXiv.org Artificial IntelligenceMar-10-2025

Tactile sensors can significantly enhance the perception of humanoid robotics systems by providing contact information that facilitates human-like interactions. However, existing commercial tactile sensors focus on improving the resolution and sensitivity of single-modal detection with high-cost components and densely integrated design, incurring complex manufacturing processes and unaffordable prices. In this work, we present Bio-Skin, a cost-effective multi-modal tactile sensor that utilizes single-axis Hall-effect sensors for planar normal force measurement and bar-shape piezo resistors for 2D shear force measurement. A thermistor coupling with a heating wire is integrated into a silicone body to achieve temperature sensation and thermostatic function analogous to human skin. We also present a cross-reference framework to validate the two modalities of the force sensing signal, improving the sensing fidelity in a complex electromagnetic environment. Bio-Skin has a multi-layer design, and each layer is manufactured sequentially and subsequently integrated, thereby offering a fast production pathway. After calibration, Bio-Skin demonstrates performance metrics-including signal-to-range ratio, sampling rate, and measurement range-comparable to current commercial products, with one-tenth of the cost. The sensor's real-world performance is evaluated using an Allegro hand in object grasping tasks, while its temperature regulation functionality was assessed in a material detection task.

artificial intelligence, bio-skin, sensor, (16 more...)

arXiv.org Artificial Intelligence

2503.07989

Country:

North America > United States > Oklahoma > Payne County > Stillwater (0.14)
North America > United States > Colorado > Denver County > Denver (0.14)

Genre: Research Report (0.64)

Industry:

Information Technology (0.46)
Energy > Oil & Gas (0.46)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

Ultra-High-Frequency Harmony: mmWave Radar and Event Camera Orchestrate Accurate Drone Landing

Wang, Haoyang, Xu, Jingao, Luo, Xinyu, Chen, Xuecheng, Zhang, Ting, Duan, Ruiyang, Liu, Yunhao, Chen, Xinlei

arXiv.org Artificial IntelligenceFeb-20-2025

For precise, efficient, and safe drone landings, ground platforms should real-time, accurately locate descending drones and guide them to designated spots. While mmWave sensing combined with cameras improves localization accuracy, the lower sampling frequency of traditional frame cameras compared to mmWave radar creates bottlenecks in system throughput. In this work, we replace the traditional frame camera with event camera, a novel sensor that harmonizes in sampling frequency with mmWave radar within the ground platform setup, and introduce mmE-Loc, a high-precision, low-latency ground localization system designed for drone landings. To fully leverage the \textit{temporal consistency} and \textit{spatial complementarity} between these modalities, we propose two innovative modules, \textit{consistency-instructed collaborative tracking} and \textit{graph-informed adaptive joint optimization}, for accurate drone measurement extraction and efficient sensor fusion. Extensive real-world experiments in landing scenarios from a leading drone delivery company demonstrate that mmE-Loc outperforms state-of-the-art methods in both localization accuracy and latency.

artificial intelligence, event camera, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2502.14992

Country:

Asia > China (0.28)
North America > United States > New York (0.14)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry:

Transportation > Air (0.46)
Information Technology > Robotics & Automation (0.46)
Transportation > Freight & Logistics Services (0.34)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(3 more...)

Add feedback

SniffySquad: Patchiness-Aware Gas Source Localization with Multi-Robot Collaboration

Cheng, Yuhan, Chen, Xuecheng, Yang, Yixuan, Wang, Haoyang, Xu, Jingao, Hong, Chaopeng, Xu, Susu, Zhang, Xiao-Ping, Liu, Yunhao, Chen, Xinlei

arXiv.org Artificial IntelligenceNov-9-2024

Abstract--Gas source localization is pivotal for the rapid mitigation of gas leakage disasters, where mobile robots emerge as a promising solution. However, existing methods predominantly schedule robots' movements based on reactive stimuli or simplified gas plume models. These approaches typically excel in idealized, simulated environments but fall short in real-world gas environments characterized by their patchy distribution. In this work, we introduce SniffySquad, a multi-robot olfactionbased system designed to address the inherent patchiness in gas source localization. SniffySquad incorporates a patchinessaware active sensing approach that enhances the quality of data collection and estimation. Moreover, it features an innovative collaborative role adaptation strategy to boost the efficiency of source-seeking endeavors. Extensive evaluations demonstrate that our system achieves an increase in the success rate by 20%+ and an improvement in path efficiency by 30%+, outperforming state-of-the-art gas source localization solutions. With the knowledge of source locations, subsequent mitigation operations, such as Rapid and accurate responses to gas leak incidents are shutting off valves or sealing the leaks, can be conducted more essential for safeguarding human and environmental health, logically, efficiently, and safely [5].

artificial intelligence, machine learning, robot, (17 more...)

arXiv.org Artificial Intelligence

2411.06121

Country:

North America > United States (1.00)
Asia (0.69)

Genre: Research Report > Promising Solution (0.34)

Industry: Energy > Oil & Gas (0.94)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Bi-directional Momentum-based Haptic Feedback and Control System for Dexterous Telemanipulation

Wang, Haoyang, Guo, Haoran, Ba, He, Li, Zhengxiong, Tao, Lingfeng

arXiv.org Artificial IntelligenceSep-30-2024

Haptic feedback is essential for dexterous telemanipulation that enables operators to control robotic hands remotely with high skill and precision, mimicking a human hand's natural movement and sensation. However, current haptic methods for dexterous telemanipulation cannot support torque feedback, resulting in object rotation and rolling mismatches. The operator must make tedious adjustments in these tasks, leading to delays, reduced situational awareness, and suboptimal task performance. This work presents a Bi-directional Momentum-based Haptic Feedback and Control (Bi-Hap) system for real-time dexterous telemanipulation. Bi-Hap integrates multi-modal sensors to extract human interactive information with the object and share it with the robot's learning-based controller. A Field-Oriented Control (FOC) algorithm is developed to enable the integrated brushless active momentum wheel to generate precise torque and vibrative feedback, bridging the gap between human intent and robotic actions. Different feedback strategies are designed for varying error states to align with the operator's intuition. Extensive experiments with human subjects using a virtual Shadow Dexterous Hand demonstrate the effectiveness of Bi-Hap in enhancing task performance and user confidence. Bi-Hap achieved real-time feedback capability with low command following latency (delay<0.025s) and highly accurate torque feedback (RMSE<0.010 Nm).

artificial intelligence, bi-hap system, operator, (15 more...)

arXiv.org Artificial Intelligence

2409.20527

Country: North America > United States > Colorado (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

Impact of Tactile Sensor Quantities and Placements on Learning-based Dexterous Manipulation

Guo, Haoran, Wang, Haoyang, Li, Zhengxiong, Bai, He, Tao, Lingfeng

arXiv.org Artificial IntelligenceSep-30-2024

Tactile information effectively enables faster training and better task performance for learning-based in-hand manipulation. Existing approaches are validated in simulated environments with a large number of tactile sensors. However, attaching such sensors to a real robot hand is not applicable due to high cost and physical limitations. To enable real-world adoption of tactile sensors, this study investigates the impact of tactile sensors, including their varying quantities and placements on robot hands, on the dexterous manipulation task performance and analyzes the importance of each. Through empirically decreasing the sensor quantities, we successfully find an optimized set of tactile sensors (21 sensors) configuration, which keeps over 93% task performance with only 20% sensor quantities compared to the original set (92 sensors) for the block manipulation task, leading to a potential reduction of over 80% in sensor manufacturing and design costs. To transform the empirical results into a generalizable understanding, we build a task performance prediction model with a weighted linear regression algorithm and use it to forecast the task performance with different sensor configurations. To show its generalizability, we verified this model in egg and pen manipulation tasks and achieved an average prediction error of 3.12%.

artificial intelligence, machine learning, sensor, (18 more...)

arXiv.org Artificial Intelligence

2409.20473

Country:

North America > United States > Oklahoma (0.28)
North America > United States > Colorado (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Manipulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)

Add feedback

ITCMA: A Generative Agent Based on a Computational Consciousness Structure

Zhang, Hanzhong, Yin, Jibin, Wang, Haoyang, Xiang, Ziwei

arXiv.org Artificial IntelligenceJun-8-2024

Large Language Models (LLMs) still face challenges in tasks requiring understanding implicit instructions and applying common-sense knowledge. In such scenarios, LLMs may require multiple attempts to achieve human-level performance, potentially leading to inaccurate responses or inferences in practical environments, affecting their long-term consistency and behavior. This paper introduces the Internal Time-Consciousness Machine (ITCM), a computational consciousness structure to simulate the process of human consciousness. We further propose the ITCM-based Agent (ITCMA), which supports action generation and reasoning in open-world settings, and can independently complete tasks. ITCMA enhances LLMs' ability to understand implicit instructions and apply common-sense knowledge by considering agents' interaction and reasoning with the environment. Evaluations in the Alfworld environment show that trained ITCMA outperforms the state-of-the-art (SOTA) by 9% on the seen set. Even untrained ITCMA achieves a 96% task completion rate on the seen set, 5% higher than SOTA, indicating its superiority over traditional intelligent agents in utility and generalization. In real-world tasks with quadruped robots, the untrained ITCMA achieves an 85% task completion rate, which is close to its performance in the unseen set, demonstrating its comparable utility and universality in real-world settings.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2403.20097

Country:

Asia > China > Yunnan Province (0.14)
Asia > China > Fujian Province (0.14)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Leisure & Entertainment (0.92)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

TransformLoc: Transforming MAVs into Mobile Localization Infrastructures in Heterogeneous Swarms

Wang, Haoyang, Xu, Jingao, Zhao, Chenyu, Lu, Zihong, Cheng, Yuhan, Chen, Xuecheng, Zhang, Xiao-Ping, Liu, Yunhao, Chen, Xinlei

arXiv.org Artificial IntelligenceFeb-14-2024

A heterogeneous micro aerial vehicles (MAV) swarm consists of resource-intensive but expensive advanced MAVs (AMAVs) and resource-limited but cost-effective basic MAVs (BMAVs), offering opportunities in diverse fields. Accurate and real-time localization is crucial for MAV swarms, but current practices lack a low-cost, high-precision, and real-time solution, especially for lightweight BMAVs. We find an opportunity to accomplish the task by transforming AMAVs into mobile localization infrastructures for BMAVs. However, turning this insight into a practical system is non-trivial due to challenges in location estimation with BMAVs' unknown and diverse localization errors and resource allocation of AMAVs given coupled influential factors. This study proposes TransformLoc, a new framework that transforms AMAVs into mobile localization infrastructures, specifically designed for low-cost and resource-constrained BMAVs. We first design an error-aware joint location estimation model to perform intermittent joint location estimation for BMAVs and then design a proximity-driven adaptive grouping-scheduling strategy to allocate resources of AMAVs dynamically. TransformLoc achieves a collaborative, adaptive, and cost-effective localization system suitable for large-scale heterogeneous MAV swarms. We implement TransformLoc on industrial drones and validate its performance. Results show that TransformLoc outperforms baselines including SOTA up to 68\% in localization performance, motivating up to 60\% navigation success rate improvement.

artificial intelligence, bmav, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2403.08815

Country: Asia > China > Guangdong Province (0.15)

Genre: Research Report (0.84)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback