AITopics | Yu, Heng

Collaborating Authors

Yu, Heng

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

End-to-end Generative Spatial-Temporal Ultrasonic Odometry and Mapping Framework

Jia, Fuhua, Yang, Xiaoying, Yang, Mengshen, Li, Yang, Xu, Hang, Rushworth, Adam, Ijaz, Salman, Yu, Heng, Cui, Tianxiang

arXiv.org Artificial IntelligenceDec-23-2024

Performing simultaneous localization and mapping (SLAM) in low-visibility conditions, such as environments filled with smoke, dust and transparent objets, has long been a challenging task. Sensors like cameras and Light Detection and Ranging (LiDAR) are significantly limited under these conditions, whereas ultrasonic sensors offer a more robust alternative. However, the low angular resolution, slow update frequency, and limited detection accuracy of ultrasonic sensors present barriers for SLAM. In this work, we propose a novel end-to-end generative ultrasonic SLAM framework. This framework employs a sensor array with overlapping fields of view, leveraging the inherently low angular resolution of ultrasonic sensors to implicitly encode spatial features in conjunction with the robot's motion. Consecutive time frame data is processed through a sliding window mechanism to capture temporal features. The spatiotemporally encoded sensor data is passed through multiple modules to generate dense scan point clouds and robot pose transformations for map construction and odometry. The main contributions of this work include a novel ultrasonic sensor array that spatiotemporally encodes the surrounding environment, and an end-to-end generative SLAM framework that overcomes the inherent defects of ultrasonic sensors. Several real-world experiments demonstrate the feasibility and robustness of the proposed framework.

artificial intelligence, sensor, spatial reasoning, (13 more...)

arXiv.org Artificial Intelligence

2412.17343

Country: Asia > China (0.31)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.71)

Add feedback

WeightedPose: Generalizable Cross-Pose Estimation via Weighted SVD

Cheng, Xuxin, Yu, Heng, Zhang, Harry, Deng, Wenxing

arXiv.org Artificial IntelligenceMay-21-2024

We introduce a new approach for robotic manipulation tasks in human settings that necessitates understanding the 3D geometric connections between a pair of objects. Conventional end-to-end training approaches, which convert pixel observations directly into robot actions, often fail to effectively understand complex pose relationships and do not easily adapt to new object configurations. To overcome these issues, our method focuses on learning the 3D geometric relationships, particularly how critical parts of one object relate to those of another. We employ Weighted SVD in our standalone model to analyze pose relationships both in articulated parts and in free-floating objects.

artificial intelligence, international conference, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2405.02241

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.42)

Add feedback

Instruct2Attack: Language-Guided Semantic Adversarial Attacks

Liu, Jiang, Wei, Chen, Guo, Yuxiang, Yu, Heng, Yuille, Alan, Feizi, Soheil, Lau, Chun Pong, Chellappa, Rama

arXiv.org Artificial IntelligenceNov-27-2023

We propose Instruct2Attack (I2A), a language-guided semantic attack that generates semantically meaningful perturbations according to free-form language instructions. We make use of state-of-the-art latent diffusion models, where we adversarially guide the reverse diffusion process to search for an adversarial latent code conditioned on the input image and text instruction. Compared to existing noise-based and semantic attacks, I2A generates more natural and diverse adversarial examples while providing better controllability and interpretability. We further automate the attack process with GPT-4 to generate diverse image-specific text instructions. We show that I2A can successfully break state-of-the-art deep neural networks even under strong adversarial defenses, and demonstrate great transferability among a variety of network architectures.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2311.15551

Country: North America > United States > Maryland (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Government > Military (0.83)
Information Technology > Security & Privacy (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Sequence Generation: From Both Sides to the Middle

Zhou, Long, Zhang, Jiajun, Zong, Chengqing, Yu, Heng

arXiv.org Artificial IntelligenceJun-23-2019

The encoder-decoder framework has achieved promising process for many sequence generation tasks, such as neural machine translation and text summarization. Such a framework usually generates a sequence token by token from left to right, hence (1) this autoregressive decoding procedure is time-consuming when the output sentence becomes longer, and (2) it lacks the guidance of future context which is crucial to avoid under translation. To alleviate these issues, we propose a synchronous bidirectional sequence generation (SBSG) model which predicts its outputs from both sides to the middle simultaneously. In the SBSG model, we enable the left-to-right (L2R) and right-to-left (R2L) generation to help and interact with each other by leveraging interactive bidirectional attention network. Experiments on neural machine translation (En-De, Ch-En, and En-Ro) and text summarization tasks show that the proposed model significantly speeds up decoding while improving the generation quality compared to the autoregressive Transformer.

artificial intelligence, machine translation, transformer, (18 more...)

arXiv.org Artificial Intelligence

1906.09601

Country: Asia > China (0.29)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback