AITopics | Wang, Jiaxu

Collaborating Authors

Wang, Jiaxu

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Mamba as Decision Maker: Exploring Multi-scale Sequence Modeling in Offline Reinforcement Learning

Cao, Jiahang, Zhang, Qiang, Wang, Ziqing, Wang, Jiaxu, Cheng, Hao, Shao, Yecheng, Zhao, Wen, Han, Gang, Guo, Yijie, Xu, Renjing

arXiv.org Artificial IntelligenceJun-4-2024

Sequential modeling has demonstrated remarkable capabilities in offline reinforcement learning (RL), with Decision Transformer (DT) being one of the most notable representatives, achieving significant success. However, RL trajectories possess unique properties to be distinguished from the conventional sequence (e.g., text or audio): (1) local correlation, where the next states in RL are theoretically determined solely by current states and actions based on the Markov Decision Process (MDP), and (2) global correlation, where each step's features are related to long-term historical information due to the time-continuous nature of trajectories. In this paper, we propose a novel action sequence predictor, named Mamba Decision Maker (MambaDM), where Mamba is expected to be a promising alternative for sequence modeling paradigms, owing to its efficient modeling of multi-scale dependencies. In particular, we introduce a novel mixer module that proficiently extracts and integrates both global and local features of the input sequence, effectively capturing interrelationships in RL datasets. Extensive experiments demonstrate that MambaDM achieves state-of-the-art performance in Atari and OpenAI Gym datasets. Furthermore, we empirically investigate the scaling laws of MambaDM, finding that increasing model size does not bring performance improvement, but scaling the dataset amount by 2x for MambaDM can obtain up to 33.7% score improvement on Atari dataset. This paper delves into the sequence modeling capabilities of MambaDM in the RL domain, paving the way for future advancements in robust and efficient decision-making systems. Our code will be available at https://github.com/AndyCao1125/MambaDM.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2406.02013

Country: Asia > China (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback

EvGGS: A Collaborative Learning Framework for Event-based Generalizable Gaussian Splatting

Wang, Jiaxu, He, Junhao, Zhang, Ziyi, Sun, Mingyuan, Sun, Jingkai, Xu, Renjing

arXiv.org Artificial IntelligenceJun-3-2024

Event cameras offer promising advantages such as high dynamic range and low latency, making them well-suited for challenging lighting conditions and fast-moving scenarios. However, reconstructing 3D scenes from raw event streams is difficult because event data is sparse and does not carry absolute color information. To release its potential in 3D reconstruction, we propose the first event-based generalizable 3D reconstruction framework, called EvGGS, which reconstructs scenes as 3D Gaussians from only event input in a feedforward manner and can generalize to unseen cases without any retraining. This framework includes a depth estimation module, an intensity reconstruction module, and a Gaussian regression module. These submodules connect in a cascading manner, and we collaboratively train them with a designed joint loss to make them mutually promote. To facilitate related studies, we build a novel event-based 3D dataset with various material objects and calibrated labels of grayscale images, depth maps, camera poses, and silhouettes. Experiments show models that have jointly trained significantly outperform those trained individually. Our approach performs better than all baselines in reconstruction quality, and depth/intensity predictions with satisfactory rendering speed.

artificial intelligence, machine learning, reconstruction, (15 more...)

arXiv.org Artificial Intelligence

2405.14959

Country:

Asia > China (0.46)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Health Care Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Physical Priors Augmented Event-Based 3D Reconstruction

Wang, Jiaxu, He, Junhao, Zhang, Ziyi, Xu, Renjing

arXiv.org Artificial IntelligenceJan-30-2024

3D neural implicit representations play a significant component in many robotic applications. However, reconstructing neural radiance fields (NeRF) from realistic event data remains a challenge due to the sparsities and the lack of information when only event streams are available. In this paper, we utilize motion, geometry, and density priors behind event data to impose strong physical constraints to augment NeRF training. The proposed novel pipeline can directly benefit from those priors to reconstruct 3D scenes without additional inputs. Moreover, we present a novel density-guided patch-based sampling strategy for robust and efficient learning, which not only accelerates training procedures but also conduces to expressions of local geometries. More importantly, we establish the first large dataset for event-based 3D reconstruction, which contains 101 objects with various materials and geometries, along with the groundtruth of images and depth maps for all camera viewpoints, which significantly facilitates other research in the related fields. The code and dataset will be publicly available at https://github.com/Mercerai/PAEv3d.

artificial intelligence, conference, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2401.17121

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Chasing Day and Night: Towards Robust and Efficient All-Day Object Detection Guided by an Event Camera

Cao, Jiahang, Zheng, Xu, Lyu, Yuanhuiyi, Wang, Jiaxu, Xu, Renjing, Wang, Lin

arXiv.org Artificial IntelligenceSep-17-2023

The ability to detect objects in all lighting (i.e., normal-, over-, and under-exposed) conditions is crucial for real-world applications, such as self-driving.Traditional RGB-based detectors often fail under such varying lighting conditions.Therefore, recent works utilize novel event cameras to supplement or guide the RGB modality; however, these methods typically adopt asymmetric network structures that rely predominantly on the RGB modality, resulting in limited robustness for all-day detection. In this paper, we propose EOLO, a novel object detection framework that achieves robust and efficient all-day detection by fusing both RGB and event modalities. Our EOLO framework is built based on a lightweight spiking neural network (SNN) to efficiently leverage the asynchronous property of events. Buttressed by it, we first introduce an Event Temporal Attention (ETA) module to learn the high temporal information from events while preserving crucial edge information. Secondly, as different modalities exhibit varying levels of importance under diverse lighting conditions, we propose a novel Symmetric RGB-Event Fusion (SREF) module to effectively fuse RGB-Event features without relying on a specific modality, thus ensuring a balanced and adaptive fusion for all-day detection. In addition, to compensate for the lack of paired RGB-Event datasets for all-day training and evaluation, we propose an event synthesis approach based on the randomized optical flow that allows for directly generating the event frame from a single exposure image. We further build two new datasets, E-MSCOCO and E-VOC based on the popular benchmarks MSCOCO and PASCAL VOC. Extensive experiments demonstrate that our EOLO outperforms the state-of-the-art detectors,e.g.,RENet,by a substantial margin (+3.74% mAP50) in all lighting conditions.Our code and datasets will be available at https://vlislab22.github.io/EOLO/

artificial intelligence, detection, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2309.09297

Country:

Asia > China (0.28)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback