AITopics | Yang, Cheng-Yen

Yang, Cheng-Yen

PackDiT: Joint Human Motion and Text Generation via Mutual Prompting

Jiang, Zhongyu, Chai, Wenhao, Zhou, Zhuoran, Yang, Cheng-Yen, Huang, Hsiang-Wei, Hwang, Jenq-Neng

arXiv.org Artificial Intelligence, Jan-27-2025

Human motion generation has advanced markedly with the advent of diffusion models. Most recent studies have concentrated on generating motion sequences based on text prompts, commonly referred to as text-to-motion generation. However, the bidirectional generation of motion and text, enabling tasks such as motion-to-text alongside text-to-motion, has been largely unexplored. This capability is essential for aligning diverse modalities and supports unconditional generation. In this paper, we introduce PackDiT, the first diffusion-based generative model capable of performing various tasks simultaneously, including motion generation, motion prediction, text generation, text-to-motion, motion-to-text, and joint motion-text generation. Our core innovation leverages mutual blocks to integrate multiple diffusion transformers (DiTs) across different modalities seamlessly. We train PackDiT on the HumanML3D dataset, achieving state-of-the-art text-to-motion performance with an FID score of 0.106, along with superior results in motion prediction and in-between tasks. Our experiments further demonstrate that diffusion models are effective for motion-to-text generation, achieving performance comparable to that of autoregressive models.

large language model, machine learning, natural language

2501.16551

Country: Europe > Germany (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

The 2nd Workshop on Maritime Computer Vision (MaCVi) 2024

Kiefer, Benjamin, Žust, Lojze, Kristan, Matej, Perš, Janez, Teršek, Matija, Wiliem, Arnold, Messmer, Martin, Yang, Cheng-Yen, Huang, Hsiang-Wei, Jiang, Zhongyu, Kuo, Heng-Cheng, Mei, Jie, Hwang, Jenq-Neng, Stadler, Daniel, Sommer, Lars, Huang, Kaer, Zheng, Aiguo, Chong, Weitu, Lertniphonphan, Kanokphan, Xie, Jun, Chen, Feng, Li, Jian, Wang, Zhepeng, Zedda, Luca, Loddo, Andrea, Di Ruberto, Cecilia, Vu, Tuan-Anh, Nguyen-Truong, Hai, Ha, Tan-Sang, Pham, Quan-Dung, Yeung, Sai-Kit, Feng, Yuan, Thien, Nguyen Thanh, Tian, Lixin, Kuan, Sheng-Yao, Ho, Yuan-Hao, Rodriguez, Angel Bueno, Carrillo-Perez, Borja, Klein, Alexander, Alex, Antje, Steiniger, Yannik, Sattler, Felix, Solano-Carrillo, Edgardo, Fabijanić, Matej, Šumunec, Magdalena, Kapetanović, Nadir, Michel, Andreas, Gross, Wolfgang, Weinmann, Martin

arXiv.org Artificial Intelligence, Nov-23-2023

The 2nd Workshop on Maritime Computer Vision (MaCVi) 2024 addresses maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV). Three challenge categories are considered: (i) UAV-based Maritime Object Tracking with Re-identification, (ii) USV-based Maritime Obstacle Segmentation and Detection, (iii) USV-based Maritime Boat Tracking. The USV-based Maritime Obstacle Segmentation and Detection category features three sub-challenges, including a new embedded challenge addressing efficient inference on real-world embedded devices. This report offers a comprehensive overview of the findings from the challenges. We provide both statistical and qualitative analyses, evaluating trends from over 195 submissions. All datasets, evaluation code, and the leaderboard are available to the public at https://macvi.org/workshop/macvi24.

artificial intelligence, machine learning, survey article

2311.14762

Country:

Europe > Croatia (0.28)
Europe > Germany > Baden-Württemberg (0.14)

Genre:

Overview (0.66)
Research Report (0.50)

Industry:

Information Technology (0.49)
Transportation (0.46)
Aerospace & Defense (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Efficient Domain Adaptation via Generative Prior for 3D Infant Pose Estimation

Zhou, Zhuoran, Jiang, Zhongyu, Chai, Wenhao, Yang, Cheng-Yen, Li, Lei, Hwang, Jenq-Neng

arXiv.org Artificial Intelligence, Nov-17-2023

Although 3D human pose estimation has developed impressively in recent years, only a few works focus on infants, who have different bone lengths and for whom data are limited. Directly applying adult pose estimation models typically achieves low performance in the infant domain and suffers from out-of-distribution issues. Moreover, the limited availability of infant pose data heavily constrains the efficiency of learning-based models that lift 2D poses to 3D. Domain adaptation and data augmentation are commonly used techniques for dealing with small datasets. Following this paradigm, we take advantage of an optimization-based method that utilizes generative priors to predict 3D infant keypoints from 2D keypoints without the need for large training data. We further apply a guided diffusion model to domain-adapt 3D adult poses to infant poses to supplement small datasets. We also show that our method, ZeDO-i, can attain efficient domain adaptation even when only a small amount of data is available. Quantitatively, we claim that our model attains state-of-the-art MPJPE performance of 43.6 mm on the SyRIP dataset and 21.2 mm on the MINI-RGBD dataset.

artificial intelligence, machine learning, pose estimation

2311.12043

Country:

Europe > Netherlands (0.14)
Europe > Germany (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.87)

Back to Optimization: Diffusion-based Zero-Shot 3D Human Pose Estimation

Jiang, Zhongyu, Zhou, Zhuoran, Li, Lei, Chai, Wenhao, Yang, Cheng-Yen, Hwang, Jenq-Neng

arXiv.org Artificial Intelligence, Oct-24-2023

Learning-based methods have dominated the 3D human pose estimation (HPE) tasks with significantly better performance in most benchmarks than traditional optimization-based methods. Nonetheless, 3D HPE in the wild is still the biggest challenge for learning-based models, whether with 2D-3D lifting, image-to-3D, or diffusion-based methods, since the trained networks implicitly learn camera intrinsic parameters and domain-based 3D human pose distributions and estimate poses by statistical average. On the other hand, the optimization-based methods estimate results case-by-case, which can predict more diverse and sophisticated human poses in the wild. By combining the advantages of optimization-based and learning-based methods, we propose the Zero-shot Diffusion-based Optimization (ZeDO) pipeline for 3D HPE to solve the problem of cross-domain and in-the-wild 3D HPE. Our multi-hypothesis ZeDO achieves state-of-the-art (SOTA) performance on Human3.6M, with minMPJPE 51.4 mm, without training with any 2D-3D or image-3D pairs. Moreover, our single-hypothesis ZeDO achieves SOTA performance on the 3DPW dataset with PA-MPJPE 40.3 mm on cross-dataset evaluation, which even outperforms learning-based methods trained on 3DPW.
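For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how MPJPE (mean per-joint position error) and its multi-hypothesis variant minMPJPE are commonly computed. It is not taken from the paper: the function names `mpjpe` and `min_mpjpe` and the toy joint coordinates are illustrative assumptions, and the sketch omits steps that benchmark protocols typically add, such as root-joint alignment or the Procrustes alignment used for PA-MPJPE.

```python
import math

def mpjpe(pred, gt):
    """Mean Euclidean distance between corresponding joints (same units as the input)."""
    if len(pred) != len(gt) or not gt:
        raise ValueError("pred and gt must be non-empty and have the same number of joints")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)

def min_mpjpe(hypotheses, gt):
    """Lowest MPJPE over a set of candidate poses (multi-hypothesis evaluation)."""
    return min(mpjpe(h, gt) for h in hypotheses)

# Hypothetical two-joint skeleton, coordinates in millimetres.
gt = [(0.0, 0.0, 0.0), (0.0, 100.0, 0.0)]
hyp_a = [(3.0, 4.0, 0.0), (0.0, 100.0, 0.0)]    # per-joint errors: 5 mm and 0 mm
hyp_b = [(0.0, 0.0, 10.0), (0.0, 100.0, 10.0)]  # per-joint errors: 10 mm and 10 mm

print(mpjpe(hyp_a, gt))               # 2.5
print(mpjpe(hyp_b, gt))               # 10.0
print(min_mpjpe([hyp_a, hyp_b], gt))  # 2.5
```

Because minMPJPE keeps only the best candidate, it can never exceed the MPJPE of any single hypothesis in the set, which is why multi-hypothesis and single-hypothesis numbers are not directly comparable.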

artificial intelligence, human pose estimation, optimization

2307.03833

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.80)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.60)

1st Workshop on Maritime Computer Vision (MaCVi) 2023: Challenge Results

Kiefer, Benjamin, Kristan, Matej, Perš, Janez, Žust, Lojze, Poiesi, Fabio, Andrade, Fabio Augusto de Alcantara, Bernardino, Alexandre, Dawkins, Matthew, Raitoharju, Jenni, Quan, Yitong, Atmaca, Adem, Höfer, Timon, Zhang, Qiming, Xu, Yufei, Zhang, Jing, Tao, Dacheng, Sommer, Lars, Spraul, Raphael, Zhao, Hangyue, Zhang, Hongpu, Zhao, Yanyun, Augustin, Jan Lukas, Jeon, Eui-ik, Lee, Impyeong, Zedda, Luca, Loddo, Andrea, Di Ruberto, Cecilia, Verma, Sagar, Gupta, Siddharth, Muralidhara, Shishir, Hegde, Niharika, Xing, Daitao, Evangeliou, Nikolaos, Tzes, Anthony, Bartl, Vojtěch, Špaňhel, Jakub, Herout, Adam, Bhowmik, Neelanjan, Breckon, Toby P., Kundargi, Shivanand, Anvekar, Tejas, Desai, Chaitra, Tabib, Ramesh Ashok, Mudengudi, Uma, Vats, Arpita, Song, Yang, Liu, Delong, Li, Yonglin, Li, Shuman, Tan, Chenhao, Lan, Long, Somers, Vladimir, De Vleeschouwer, Christophe, Alahi, Alexandre, Huang, Hsiang-Wei, Yang, Cheng-Yen, Hwang, Jenq-Neng, Kim, Pyong-Kun, Kim, Kwangju, Lee, Kyoungoh, Jiang, Shuai, Li, Haiwen, Ziqiang, Zheng, Vu, Tuan-Anh, Nguyen-Truong, Hai, Yeung, Sai-Kit, Jia, Zhuang, Yang, Sophia, Hsu, Chih-Chung, Hou, Xiu-Yu, Jhang, Yu-An, Yang, Simon, Yang, Mau-Tsuen

arXiv.org Artificial Intelligence, Nov-28-2022

The 1st Workshop on Maritime Computer Vision (MaCVi) 2023 focused on maritime computer vision for Unmanned Aerial Vehicles (UAV) and Unmanned Surface Vehicles (USV), and organized several subchallenges in this domain: (i) UAV-based Maritime Object Detection, (ii) UAV-based Maritime Object Tracking, (iii) USV-based Maritime Obstacle Segmentation and (iv) USV-based Maritime Obstacle Detection. The subchallenges were based on the SeaDronesSee and MODS benchmarks. This report summarizes the main findings of the individual subchallenges and introduces a new benchmark, called SeaDronesSee Object Detection v2, which extends the previous benchmark by including more classes and footage. We provide statistical and qualitative analyses, and assess trends in the best-performing methodologies of over 130 submissions. The methods are summarized in the appendix. The datasets, evaluation code and the leaderboard are publicly available at https://seadronessee.cs.uni-tuebingen.de/macvi.

artificial intelligence, detection, machine learning

2211.13508

Country:

North America > United States (0.67)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.24)

Genre: Research Report (0.81)

Industry:

Government (1.00)
Transportation (0.67)
Energy > Renewable (0.46)
Information Technology > Robotics & Automation (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
