AITopics | Jiang, Huaizu

Collaborating Authors

Jiang, Huaizu

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

ODTFormer: Efficient Obstacle Detection and Tracking with Stereo Cameras Based on Transformer

Ding, Tianye, Li, Hongyu, Jiang, Huaizu

arXiv.org Artificial IntelligenceMar-21-2024

Obstacle detection and tracking represent a critical component in robot autonomous navigation. In this paper, we propose ODTFormer, a Transformer-based model to address both obstacle detection and tracking problems. For the detection task, our approach leverages deformable attention to construct a 3D cost volume, which is decoded progressively in the form of voxel occupancy grids. We further track the obstacles by matching the voxels between consecutive frames. The entire model can be optimized in an end-to-end manner. Through extensive experiments on DrivingStereo and KITTI benchmarks, our model achieves state-of-the-art performance in the obstacle detection task. We also report comparable accuracy to state-of-the-art obstacle tracking models while requiring only a fraction of their computation cost, typically ten-fold to twenty-fold less. The code and model weights will be publicly released.

artificial intelligence, image understanding, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2403.14626

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.95)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

StereoNavNet: Learning to Navigate using Stereo Cameras with Auxiliary Occupancy Voxels

Li, Hongyu, Padir, Taskin, Jiang, Huaizu

arXiv.org Artificial IntelligenceMar-18-2024

Visual navigation has received significant attention recently. Most of the prior works focus on predicting navigation actions based on semantic features extracted from visual encoders. However, these approaches often rely on large datasets and exhibit limited generalizability. In contrast, our approach draws inspiration from traditional navigation planners that operate on geometric representations, such as occupancy maps. We propose StereoNavNet (SNN), a novel visual navigation approach employing a modular learning framework comprising perception and policy modules. Within the perception module, we estimate an auxiliary 3D voxel occupancy grid from stereo RGB images and extract geometric features from it. These features, along with user-defined goals, are utilized by the policy module to predict navigation actions. Through extensive empirical evaluation, we demonstrate that SNN outperforms baseline approaches in terms of success rates, success weighted by path length, and navigation error. Furthermore, SNN exhibits better generalizability, characterized by maintaining leading performance when navigating across previously unseen environments.

artificial intelligence, machine learning, module, (14 more...)

arXiv.org Artificial Intelligence

2403.12039

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback

NeuFlow: Real-time, High-accuracy Optical Flow Estimation on Robots Using Edge Devices

Zhang, Zhiyong, Jiang, Huaizu, Singh, Hanumant

arXiv.org Artificial IntelligenceMar-15-2024

Real-time high-accuracy optical flow estimation is a crucial component in various applications, including localization and mapping in robotics, object tracking, and activity recognition in computer vision. While recent learning-based optical flow methods have achieved high accuracy, they often come with heavy computation costs. In this paper, we propose a highly efficient optical flow architecture, called NeuFlow, that addresses both high accuracy and computational cost concerns. The architecture follows a global-to-local scheme. Given the features of the input images extracted at different spatial resolutions, global matching is employed to estimate an initial optical flow on the 1/16 resolution, capturing large displacement, which is then refined on the 1/8 resolution with lightweight CNN layers for better accuracy. We evaluate our approach on Jetson Orin Nano and RTX 2080 to demonstrate efficiency improvements across different computing platforms. We achieve a notable 10x-80x speedup compared to several state-of-the-art methods, while maintaining comparable accuracy. Our approach achieves around 30 FPS on edge computing platforms, which represents a significant breakthrough in deploying complex computer vision tasks such as SLAM on small robots like drones. The full training and evaluation code is available at https://github.com/neufieldrobotics/NeuFlow.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2403.10425

Country: Europe > France (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

E(2)-Equivariant Graph Planning for Navigation

Zhao, Linfeng, Li, Hongyu, Padir, Taskin, Jiang, Huaizu, Wong, Lawson L. S.

arXiv.org Artificial IntelligenceJan-27-2024

Learning for robot navigation presents a critical and challenging task. The scarcity and costliness of real-world datasets necessitate efficient learning approaches. In this letter, we exploit Euclidean symmetry in planning for 2D navigation, which originates from Euclidean transformations between reference frames and enables parameter sharing. To address the challenges of unstructured environments, we formulate the navigation problem as planning on a geometric graph and develop an equivariant message passing network to perform value iteration. Furthermore, to handle multi-camera input, we propose a learnable equivariant layer to lift features to a desired space. We conduct comprehensive evaluations across five diverse tasks encompassing structured and unstructured environments, along with maps of known and unknown, given point goals or semantic goals. Our experiments confirm the substantial benefits on training efficiency, stability, and generalization. More details can be found at the project website: https://lhy.xyz/e2-planning/.

artificial intelligence, machine learning, symmetry, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/LRA.2024.3360011

2309.13043

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions

Jiang, Huaizu, Ma, Xiaojian, Nie, Weili, Yu, Zhiding, Zhu, Yuke, Zhu, Song-Chun, Anandkumar, Anima

arXiv.org Artificial IntelligenceApr-13-2023

A significant gap remains between today's visual pattern recognition models and human-level visual cognition especially when it comes to few-shot learning and compositional reasoning of novel concepts. We introduce Bongard-HOI, a new visual reasoning benchmark that focuses on compositional learning of human-object interactions (HOIs) from natural images. It is inspired by two desirable characteristics from the classical Bongard problems (BPs): 1) few-shot concept learning, and 2) context-dependent reasoning. We carefully curate the few-shot instances with hard negatives, where positive and negative images only disagree on action labels, making mere recognition of object categories insufficient to complete our benchmarks. We also design multiple test sets to systematically study the generalization of visual learning models, where we vary the overlap of the HOI concepts between the training and test sets of few-shot instances, from partial to no overlaps. Bongard-HOI presents a substantial challenge to today's visual recognition models. The state-of-the-art HOI detection model achieves only 62% accuracy on few-shot binary prediction while even amateur human testers on MTurk have 91% accuracy. With the Bongard-HOI benchmark, we hope to further advance research efforts in visual reasoning, especially in holistic perception-reasoning systems and better representation learning.

benchmark, machine learning, pattern recognition, (18 more...)

arXiv.org Artificial Intelligence

2205.13803

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.68)

Add feedback

StereoVoxelNet: Real-Time Obstacle Detection Based on Occupancy Voxels from a Stereo Camera Using Deep Neural Networks

Li, Hongyu, Li, Zhengang, Akmandor, Neset Unver, Jiang, Huaizu, Wang, Yanzhi, Padir, Taskin

arXiv.org Artificial IntelligenceMar-4-2023

Obstacle detection is a safety-critical problem in robot navigation, where stereo matching is a popular vision-based approach. While deep neural networks have shown impressive results in computer vision, most of the previous obstacle detection works only leverage traditional stereo matching techniques to meet the computational constraints for real-time feedback. This paper proposes a computationally efficient method that employs a deep neural network to detect occupancy from stereo images directly. Instead of learning the point cloud correspondence from the stereo data, our approach extracts the compact obstacle distribution based on volumetric representations. In addition, we prune the computation of safety irrelevant spaces in a coarse-to-fine manner based on octrees generated by the decoder. As a result, we achieve real-time performance on the onboard computer (NVIDIA Jetson TX2). Our approach detects obstacles accurately in the range of 32 meters and achieves better IoU (Intersection over Union) and CD (Chamfer Distance) scores with only 2% of the computation cost of the state-of-the-art stereo model. Furthermore, we validate our method's robustness and real-world feasibility through autonomous navigation experiments with a real robot. Hence, our work contributes toward closing the gap between the stereo-based system in robot perception and state-of-the-art stereo models in computer vision. To counter the scarcity of high-quality real-world indoor stereo datasets, we collect a 1.36 hours stereo dataset with a mobile robot which is used to fine-tune our model. The dataset, the code, and further details including additional visualizations are available at https://lhy.xyz/stereovoxelnet

artificial intelligence, international conference, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2209.08459

Country: North America > United States (0.46)

Genre: Research Report (0.82)

Industry: Information Technology > Robotics & Automation (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Salient Object Detection: A Survey

Borji, Ali, Cheng, Ming-Ming, Hou, Qibin, Jiang, Huaizu, Li, Jia

arXiv.org Artificial IntelligenceSep-3-2017

Detecting and segmenting salient objects in natural scenes, often referred to as salient object detection, has attracted a lot of interest in computer vision. While many models have been proposed and several applications have emerged, yet a deep understanding of achievements and issues is lacking. We aim to provide a comprehensive review of the recent progress in salient object detection and situate this field among other closely related areas such as generic scene segmentation, object proposal generation, and saliency for fixation prediction. Covering 228 publications, we survey i) roots, key concepts, and tasks, ii) core techniques and main modeling trends, and iii) datasets and evaluation metrics in salient object detection. We also discuss open problems such as evaluation metrics and dataset bias in model performance and suggest future research directions.

deep learning, neural network, salient, (23 more...)

arXiv.org Artificial Intelligence

1411.5878

Country:

Asia > China (0.67)
Asia > Middle East > Iran (0.14)
North America > United States > Florida > Orange County > Orlando (0.14)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Add feedback