AITopics | Wang, Jiahao

Wang, Jiahao

Animal3D: A Comprehensive Dataset of 3D Animal Pose and Shape

Xu, Jiacong, Zhang, Yi, Peng, Jiawei, Ma, Wufei, Jesslen, Artur, Ji, Pengliang, Hu, Qixin, Zhang, Jiehua, Liu, Qihao, Wang, Jiahao, Ji, Wei, Wang, Chen, Yuan, Xiaoding, Kaushik, Prakhar, Zhang, Guofeng, Liu, Jie, Xie, Yushan, Cui, Yawen, Yuille, Alan, Kortylewski, Adam

arXiv.org Artificial Intelligence, Aug-22-2023

Accurately estimating 3D pose and shape is an essential step towards understanding animal behavior, and can potentially benefit many downstream applications, such as wildlife conservation. However, research in this area is held back by the lack of a comprehensive and diverse dataset with high-quality 3D pose and shape annotations. In this paper, we propose Animal3D, the first comprehensive dataset for 3D pose and shape estimation of mammals. Animal3D consists of 3379 images collected from 40 mammal species, high-quality annotations of 26 keypoints, and, importantly, the pose and shape parameters of the SMAL model. All annotations were labeled and checked manually in a multi-stage process to ensure the highest-quality results. Based on the Animal3D dataset, we benchmark representative shape and pose estimation models in three settings: (1) supervised learning from only the Animal3D data, (2) synthetic-to-real transfer from synthetically generated images, and (3) fine-tuning human pose and shape estimation models. Our experimental results demonstrate that predicting the 3D shape and pose of animals across species remains a very challenging task, despite significant advances in human pose estimation. Our results further demonstrate that synthetic pre-training is a viable strategy to boost model performance. Overall, Animal3D opens new directions for future research in animal 3D pose and shape estimation, and is publicly available.

artificial intelligence, estimation, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2308.11737

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > Canada > Alberta (0.14)
Asia > Middle East > Israel (0.14)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.60)

4D Millimeter-Wave Radar in Autonomous Driving: A Survey

Han, Zeyu, Wang, Jiahao, Xu, Zikun, Yang, Shuocheng, He, Lei, Xu, Shaobing, Wang, Jianqiang

arXiv.org Artificial Intelligence, Jun-14-2023

The 4D millimeter-wave (mmWave) radar, capable of measuring the range, azimuth, elevation, and velocity of targets, has attracted considerable interest in the autonomous driving community. This is attributed to its robustness in extreme environments and its outstanding velocity and elevation measurement capabilities. However, despite the rapid development of research on its sensing theory and applications, there is a notable lack of surveys on 4D mmWave radar. To address this gap and foster future research in this area, this paper presents a comprehensive survey on the use of 4D mmWave radar in autonomous driving. It first reviews the theoretical background and progress of 4D mmWave radar, including the signal processing flow, resolution improvement methods, the extrinsic calibration process, and point cloud generation methods. It then introduces related datasets and application algorithms for autonomous driving perception, localization, and mapping tasks. Finally, the paper concludes by predicting future trends in the field of 4D mmWave radar. To the best of our knowledge, this is the first survey dedicated to 4D mmWave radar.

artificial intelligence, machine learning, radar, (18 more...)

arXiv.org Artificial Intelligence

2306.04242

Country:

Asia (0.93)
North America > United States (0.28)

Genre:

Research Report (1.00)
Overview (0.86)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Robotics & Automation (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

RIFormer: Keep Your Vision Backbone Effective While Removing Token Mixer

Wang, Jiahao, Zhang, Songyang, Liu, Yong, Wu, Taiqiang, Yang, Yujiu, Liu, Xihui, Chen, Kai, Luo, Ping, Lin, Dahua

arXiv.org Artificial Intelligence, Apr-12-2023

This paper studies how to keep a vision backbone effective while removing token mixers from its basic building blocks. Token mixers, such as self-attention in vision transformers (ViTs), are intended to exchange information between different spatial tokens but suffer from considerable computational cost and latency. However, directly removing them leads to an incomplete model structure prior and thus causes a significant accuracy drop. To this end, we first develop a RepIdentityFormer based on the re-parameterizing idea to study the token-mixer-free model architecture. We then explore an improved learning paradigm to break the limitations of a simple token-mixer-free backbone, and summarize the empirical practice into 5 guidelines. Equipped with the proposed optimization strategy, we are able to build an extremely simple vision backbone with encouraging performance and high efficiency during inference. Extensive experiments and ablative analysis also demonstrate that the inductive bias of a network architecture can be incorporated into a simple network structure with an appropriate optimization strategy. We hope this work can serve as a starting point for the exploration of optimization-driven efficient network design. Project page: https://techmonsterwang.github.io/RIFormer/.

artificial intelligence, machine learning, token mixer, (15 more...)

arXiv.org Artificial Intelligence

2304.05659

Country: Asia > China (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Edge-free but Structure-aware: Prototype-Guided Knowledge Distillation from GNNs to MLPs

Wu, Taiqiang, Zhao, Zhe, Wang, Jiahao, Bai, Xingyu, Wang, Lei, Wong, Ngai, Yang, Yujiu

arXiv.org Artificial Intelligence, Mar-27-2023

Distilling high-accuracy Graph Neural Networks (GNNs) to low-latency multilayer perceptrons (MLPs) on graph tasks has become a hot research topic. However, MLPs rely exclusively on the node features and fail to capture the graph structural information. Previous methods address this issue by processing graph edges into extra inputs for MLPs, but such graph structures may be unavailable for various scenarios. To this end, we propose a Prototype-Guided Knowledge Distillation (PGKD) method, which does not require graph edges (edge-free) yet learns structure-aware MLPs. Specifically, we analyze the graph structural information in GNN teachers, and distill such information from GNNs to MLPs via prototypes in an edge-free setting. Experimental results on popular graph benchmarks demonstrate the effectiveness and robustness of the proposed PGKD.

artificial intelligence, gnn teacher, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2303.13763

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

The Multi-Modal Video Reasoning and Analyzing Competition

Peng, Haoran, Huang, He, Xu, Li, Li, Tianjiao, Liu, Jun, Rahmani, Hossein, Ke, Qiuhong, Guo, Zhicheng, Wu, Cong, Li, Rongchang, Ye, Mang, Wang, Jiahao, Zhang, Jiaxu, Liu, Yuanzhong, He, Tao, Zhang, Fuwei, Liu, Xianbin, Lin, Tao

arXiv.org Artificial Intelligence, Aug-18-2021

In this paper, we introduce the Multi-Modal Video Reasoning and Analyzing Competition (MMVRAC) workshop, held in conjunction with ICCV 2021. The competition is composed of four tracks, namely video question answering, skeleton-based action recognition, fisheye video-based action recognition, and person re-identification, which are based on two datasets: SUTD-TrafficQA and UAV-Human. We summarize the top-performing methods submitted by the participants and report the results they achieved in the competition.

artificial intelligence, natural language, recognition, (16 more...)

arXiv.org Artificial Intelligence

2108.08344

Country: Asia (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.36)