Jiang, Qingyuan
Rebalanced Multimodal Learning with Data-aware Unimodal Sampling
Jiang, Qingyuan, Chi, Zhouyang, Ma, Xiao, Mao, Qirong, Yang, Yang, Tang, Jinhui
To address the modality learning degeneration caused by modality imbalance, existing multimodal learning (MML) approaches primarily attempt to balance the optimization of each modality from the perspective of model learning. However, almost all existing methods ignore the modality imbalance caused by unimodal data sampling: sampling equal amounts of data from each modality often yields discrepancies in informational content, which in turn lead to modality imbalance. In this paper, we therefore propose a novel MML approach called Data-aware Unimodal Sampling (DUS), which aims to dynamically alleviate the modality imbalance caused by sampling. Specifically, we first propose a cumulative modality discrepancy measure to monitor the multimodal learning process. Based on this learning status, we propose a heuristic and a reinforcement learning (RL)-based data-aware unimodal sampling strategy to adaptively determine the quantity of data sampled from each modality at every iteration, thereby alleviating modality imbalance from the sampling perspective. Moreover, our method can be seamlessly incorporated into almost all existing multimodal learning approaches as a plug-in. Experiments demonstrate that DUS achieves the best performance compared with diverse state-of-the-art (SOTA) baselines.
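A minimal sketch of the heuristic variant described above, not the authors' implementation: it assumes per-modality losses are available each iteration, accumulates them as a simple stand-in for the cumulative modality discrepancy, and shrinks the per-iteration sample count of whichever modality is currently ahead. All names and the scaling rule are illustrative assumptions.

    class HeuristicUnimodalSampler:
        """Illustrative sketch: scale each modality's per-iteration sample
        count by its share of the accumulated unimodal loss (assumed proxy
        for the cumulative modality discrepancy)."""

        def __init__(self, modalities, base_batch_size, min_ratio=0.5):
            self.modalities = modalities
            self.base = base_batch_size
            self.min_ratio = min_ratio
            self.cum_loss = {m: 0.0 for m in modalities}  # cumulative unimodal losses

        def update(self, unimodal_losses):
            # Accumulate each modality's loss to track its learning status.
            for m, loss in unimodal_losses.items():
                self.cum_loss[m] += loss

        def sample_sizes(self):
            # A modality with a lower cumulative loss is "ahead"; sample less of it.
            total = sum(self.cum_loss.values()) + 1e-8
            sizes = {}
            for m in self.modalities:
                ratio = self.cum_loss[m] / total * len(self.modalities)
                ratio = max(self.min_ratio, min(1.0, ratio))
                sizes[m] = int(self.base * ratio)
            return sizes

Usage would look like sampler.update({"audio": 1.2, "visual": 0.4}) followed by sampler.sample_sizes(), which returns fewer visual samples than audio samples for the next iteration.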
The Solution for Temporal Action Localisation Task of Perception Test Challenge 2024
Han, Yinan, Jiang, Qingyuan, Mei, Hongming, Yang, Yang, Tang, Jinhui
This report presents our method for Temporal Action Localisation (TAL), which focuses on identifying and classifying actions within specific time intervals throughout a video sequence. Each action is represented by start and end timestamps along with its corresponding class label, as illustrated in Figure 1. This task is critical for various applications, including video surveillance, content analysis, and human-computer interaction. The dataset provided for this challenge is derived from the Perception Test, comprising high-resolution videos (up to 35 seconds long, 30 fps, and a maximum resolution of 1080p). Each video contains multiple action segment annotations. To facilitate experimentation, both video and audio features are provided, along with detailed annotations for the training and validation phases. We employ a data augmentation technique by expanding the training dataset using overlapping labels from the Something-SomethingV2 dataset, enhancing the model's ability to generalize across various action classes. For feature extraction, we utilize state-of-the-art models, including UMT and VideoMAEv2 for video features, and BEATs and CAV-MAE for audio features. Our approach involves ...
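The report names separate video (UMT, VideoMAEv2) and audio (BEATs, CAV-MAE) feature extractors; one plausible, purely illustrative way to combine such per-clip feature streams (not taken from the report) is to resample the audio features onto the video feature timeline and concatenate them per time step:

    import numpy as np

    def fuse_features(video_feats, audio_feats):
        """Hypothetical fusion step: linearly interpolate audio features onto
        the video feature grid, then concatenate channel-wise."""
        t_video = video_feats.shape[0]
        t_audio = audio_feats.shape[0]
        idx = np.linspace(0, t_audio - 1, t_video)
        lo = np.floor(idx).astype(int)
        hi = np.ceil(idx).astype(int)
        frac = (idx - lo)[:, None]
        audio_aligned = (1 - frac) * audio_feats[lo] + frac * audio_feats[hi]
        return np.concatenate([video_feats, audio_aligned], axis=1)

Here video_feats has shape (T_video, D_video) and audio_feats has shape (T_audio, D_audio); the fused output has shape (T_video, D_video + D_audio).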
Map-Aware Human Pose Prediction for Robot Follow-Ahead
Jiang, Qingyuan, Susam, Burak, Chao, Jun-Jee, Isler, Volkan
In the robot follow-ahead task, a mobile robot is tasked to maintain its relative position in front of a moving human actor while keeping the actor in sight. To accomplish this task, it is important that the robot understand the full 3D pose of the human (since the head orientation can differ from that of the torso) and predict future human poses so as to plan accordingly. This prediction task is especially tricky in a complex environment with junctions and multiple corridors. In this work, we address the problem of forecasting the full 3D trajectory of a human in such environments. Our main insight is that one can first predict the 2D trajectory and then estimate the full 3D trajectory by conditioning the estimator on the predicted 2D trajectory. With this approach, we achieve results comparable to or better than those of state-of-the-art methods while running three times faster. As part of our contribution, we present a new dataset where, in contrast to existing datasets, the human motion covers a much larger area than a single room. We also present a complete robot system that integrates our human pose forecasting network on the mobile robot to enable real-time robot follow-ahead, and we present results from real-world experiments in multiple buildings on campus. Our project page, including supplementary material and videos, can be found at: https://qingyuan-jiang.github.io/iros2024_poseForecasting/
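A minimal PyTorch-style sketch of the two-stage idea described above; the module names, recurrent backbones, and dimensions are assumptions for illustration, not the paper's exact architecture. The key point it shows is that the 2D trajectory predictor runs first and the 3D pose estimator is conditioned on its output.

    import torch
    import torch.nn as nn

    class TwoStageForecaster(nn.Module):
        """Illustrative two-stage forecaster: predict the future 2D trajectory,
        then estimate full 3D joint trajectories conditioned on it."""

        def __init__(self, n_joints=17, pred_len=15, hidden=256):
            super().__init__()
            self.pred_len = pred_len
            self.n_joints = n_joints
            # Stage 1: past 2D positions -> future 2D trajectory.
            self.traj2d = nn.GRU(input_size=2, hidden_size=hidden, batch_first=True)
            self.traj2d_head = nn.Linear(hidden, pred_len * 2)
            # Stage 2: past 3D poses + predicted 2D trajectory -> future 3D poses.
            self.pose3d = nn.GRU(input_size=n_joints * 3, hidden_size=hidden, batch_first=True)
            self.pose3d_head = nn.Linear(hidden + pred_len * 2, pred_len * n_joints * 3)

        def forward(self, past_xy, past_pose3d):
            # past_xy: (B, T_hist, 2), past_pose3d: (B, T_hist, n_joints*3)
            _, h_xy = self.traj2d(past_xy)
            future_xy = self.traj2d_head(h_xy[-1])             # (B, pred_len*2)
            _, h_pose = self.pose3d(past_pose3d)
            cond = torch.cat([h_pose[-1], future_xy], dim=-1)  # condition on 2D prediction
            future_pose = self.pose3d_head(cond)
            return (future_xy.view(-1, self.pred_len, 2),
                    future_pose.view(-1, self.pred_len, self.n_joints, 3))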
Onboard View Planning of a Flying Camera for High Fidelity 3D Reconstruction of a Moving Actor
Jiang, Qingyuan, Isler, Volkan
Capturing and reconstructing a human actor's motion is important for filmmaking and gaming. Currently, motion capture systems with static cameras are used for pixel-level, high-fidelity reconstructions. Such setups are costly, require installation and calibration, and, more importantly, confine the user to a predetermined area. In this work, we present a drone-based motion capture system that can alleviate these limitations. We present a complete system implementation and study view planning, which is critical for achieving high-quality reconstructions. The main challenge of view planning for a drone-based capture system is that it needs to be performed during motion capture. To address this challenge, we introduce simple geometric primitives and show that they can be used for view planning. Specifically, we introduce Pixel-Per-Area (PPA) as a reconstruction quality proxy and plan views by maximizing the PPA of the faces of a simple geometric shape representing the actor. Through experiments in simulation, we show that PPA is highly correlated with reconstruction quality. We also conduct real-world experiments showing that our system can produce dynamic 3D reconstructions of good quality. We share our code for the simulation experiments at: https://github.com/Qingyuan-Jiang/view_planning_3dhuman
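A rough sketch of how a Pixel-Per-Area style proxy could be computed for the faces of a simple primitive, assuming a pinhole camera and a triangle mesh expressed in the camera frame; the paper's exact formulation may differ, and visibility/backface handling is omitted.

    import numpy as np

    def pixels_per_area(verts_cam, faces, fx, fy, cx, cy):
        """Illustrative PPA proxy: projected pixel area of each triangular face
        divided by its 3D surface area (pinhole camera assumption)."""
        v = verts_cam[faces]                      # (F, 3, 3) triangle vertices
        # 3D area of each triangle.
        area3d = 0.5 * np.linalg.norm(
            np.cross(v[:, 1] - v[:, 0], v[:, 2] - v[:, 0]), axis=1)
        # Project vertices with the pinhole model.
        u = fx * v[..., 0] / v[..., 2] + cx
        w = fy * v[..., 1] / v[..., 2] + cy
        # 2D area of each projected triangle (shoelace formula).
        area2d = 0.5 * np.abs(
            (u[:, 1] - u[:, 0]) * (w[:, 2] - w[:, 0])
            - (u[:, 2] - u[:, 0]) * (w[:, 1] - w[:, 0]))
        return area2d / np.maximum(area3d, 1e-9)  # pixels per unit surface area

A candidate viewpoint could then be scored by summing this quantity over the visible faces of the primitive representing the actor, and the planner would choose the viewpoint with the highest score.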