AITopics | Yuan, Jianlong

Yuan, Jianlong

Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model

Huang, Haoyang, Ma, Guoqing, Duan, Nan, Chen, Xing, Wan, Changyi, Ming, Ranchen, Wang, Tianyu, Wang, Bo, Lu, Zhiying, Li, Aojie, Zeng, Xianfang, Zhang, Xinhao, Yu, Gang, Yin, Yuhe, Wu, Qiling, Sun, Wen, An, Kang, Han, Xin, Sun, Deshan, Ji, Wei, Huang, Bizhu, Li, Brian, Wu, Chenfei, Huang, Guanzhe, Xiong, Huixin, He, Jiaxin, Wu, Jianchang, Yuan, Jianlong, Wu, Jie, Liu, Jiashuai, Guo, Junjing, Tan, Kaijun, Chen, Liangyu, Chen, Qiaohui, Sun, Ran, Yuan, Shanshan, Yin, Shengming, Liu, Sitong, Chen, Wei, Dai, Yaqi, Luo, Yuchu, Ge, Zheng, Guan, Zhisheng, Song, Xiaoniu, Zhou, Yu, Jiao, Binxing, Chen, Jiansheng, Li, Jing, Zhou, Shuchang, Zhang, Xiangyu, Xiu, Yi, Zhu, Yibo, Shum, Heung-Yeung, Jiang, Daxin

arXiv.org Artificial Intelligence, Mar-14-2025

We present Step-Video-TI2V, a state-of-the-art text-driven image-to-video generation model with 30B parameters, capable of generating videos up to 102 frames based on both text and image inputs. We build Step-Video-TI2V-Eval as a new benchmark for the text-driven image-to-video task and compare Step-Video-TI2V with open-source and commercial TI2V engines using this dataset. Experimental results demonstrate the state-of-the-art performance of Step-Video-TI2V in the image-to-video generation task.

artificial intelligence, machine learning, step-video-ti2v, (18 more...)

arXiv:2503.11251

Genre: Research Report (0.71)

Industry: Media > Photography (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.97)

Semantic Data Augmentation based Distance Metric Learning for Domain Generalization

Wang, Mengzhu, Yuan, Jianlong, Qian, Qi, Wang, Zhibin, Li, Hao

arXiv.org Artificial Intelligence, Sep-13-2022

Domain generalization (DG) aims to learn a model on one or more different but related source domains that can generalize to an unseen target domain. Existing DG methods try to promote the diversity of source domains to improve the model's generalization ability, but they may have to introduce auxiliary networks or incur substantial computational costs. In contrast, this work applies implicit semantic augmentation in feature space to capture the diversity of source domains. Concretely, an additional distance metric learning (DML) loss is included to optimize the local geometry of the data distribution. In addition, the logits from the cross-entropy loss with infinite augmentations are adopted as input features for the DML loss in lieu of the deep features. We also provide a theoretical analysis showing that the logits can approximate the distances defined on the original features well. Further, we provide an in-depth analysis of the mechanism and rationale behind our approach, which gives a better understanding of why leveraging logits in lieu of features can help domain generalization. The proposed DML loss with implicit augmentation is incorporated into a recent DG method, the Fourier Augmented Co-Teacher framework (FACT). Our method can also be easily plugged into various other DG methods. Extensive experiments on three benchmarks (Digits-DG, PACS and Office-Home) demonstrate that the proposed method achieves state-of-the-art performance.
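As a rough illustration of the idea in this abstract, the sketch below adds a distance metric learning term, computed on logits rather than deep features, to a standard cross-entropy loss. It is not the authors' implementation: the abstract does not say which DML loss is used, so the hinge-based triplet form, the `margin` and `dml_weight` parameters, and all function names are assumptions made for illustration. Plain logits also stand in for the logits obtained under implicit (infinite) augmentation, which this sketch does not model.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def sq_dist(u, v):
    """Squared Euclidean distance between two logit vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss_on_logits(batch_logits, labels, margin=1.0):
    """Mean hinge triplet loss over all (anchor, positive, negative) triplets.

    Assumption: a triplet loss is used here only as one common DML loss.
    Distances are measured between logit vectors, not deep features.
    """
    losses = []
    n = len(labels)
    for a in range(n):
        for p in range(n):
            if p == a or labels[p] != labels[a]:
                continue  # positive must be a different sample of the same class
            for q in range(n):
                if labels[q] == labels[a]:
                    continue  # negative must come from another class
                d_ap = sq_dist(batch_logits[a], batch_logits[p])
                d_an = sq_dist(batch_logits[a], batch_logits[q])
                losses.append(max(0.0, d_ap - d_an + margin))
    return sum(losses) / len(losses) if losses else 0.0


def total_loss(batch_logits, labels, dml_weight=0.5, margin=1.0):
    """Mean cross-entropy plus a weighted DML term (weight is hypothetical)."""
    ce = sum(cross_entropy(z, y) for z, y in zip(batch_logits, labels)) / len(labels)
    return ce + dml_weight * triplet_loss_on_logits(batch_logits, labels, margin)
```

With well-separated logits such as `[[2.0, 0.0], [1.5, 0.0], [0.0, 2.0]]` and labels `[0, 0, 1]`, the triplet term vanishes and only the cross-entropy contributes; the term becomes positive when samples of different classes sit closer in logit space than samples of the same class.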

artificial intelligence, generalization, machine learning, (15 more...)

arXiv:2208.02803

Country: Europe > Portugal (0.16)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)