AITopics | Zhao, Songtao

Collaborating Authors

Zhao, Songtao

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength

Feng, Wanquan, Liu, Jiawei, Tu, Pengqi, Qi, Tianhao, Sun, Mingzhen, Ma, Tianxiang, Zhao, Songtao, Zhou, Siyu, He, Qian

arXiv.org Artificial IntelligenceNov-25-2024

Video generation technologies are developing rapidly and have broad potential applications. Among these technologies, camera control is crucial for generating professional-quality videos that accurately meet user expectations. However, existing camera control methods still suffer from several limitations, including control precision and the neglect of the control for subject motion dynamics. In this work, we propose I2VControl-Camera, a novel camera control method that significantly enhances controllability while providing adjustability over the strength of subject motion. To improve control precision, we employ point trajectory in the camera coordinate system instead of only extrinsic matrix information as our control signal. To accurately control and adjust the strength of subject motion, we explicitly model the higher-order components of the video trajectory expansion, not merely the linear terms, and design an operator that effectively represents the motion strength. We use an adapter architecture that is independent of the base model structure. Experiments on static and dynamic scenes show that our framework outperformances previous methods both quantitatively and qualitatively. The project page is: https://wanquanf.github.io/I2VControlCamera .

artificial intelligence, machine learning, motion strength, (12 more...)

arXiv.org Artificial Intelligence

2411.06525

Genre: Research Report (0.64)

Industry:

Media > Photography (0.84)
Media > Film (0.84)
Media > Television (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Graphics (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Augmentation-Aware Self-Supervision for Data-Efficient GAN Training

Hou, Liang, Cao, Qi, Yuan, Yige, Zhao, Songtao, Ma, Chongyang, Pan, Siyuan, Wan, Pengfei, Wang, Zhongyuan, Shen, Huawei, Cheng, Xueqi

arXiv.org Artificial IntelligenceDec-27-2023

Training generative adversarial networks (GANs) with limited data is challenging because the discriminator is prone to overfitting. Previously proposed differentiable augmentation demonstrates improved data efficiency of training GANs. However, the augmentation implicitly introduces undesired invariance to augmentation for the discriminator since it ignores the change of semantics in the label space caused by data transformation, which may limit the representation learning ability of the discriminator and ultimately affect the generative modeling performance of the generator. To mitigate the negative impact of invariance while inheriting the benefits of data augmentation, we propose a novel augmentation-aware self-supervised discriminator that predicts the augmentation parameter of the augmented data. Particularly, the prediction targets of real data and generated data are required to be distinguished since they are different during training. We further encourage the generator to adversarially learn from the self-supervised discriminator by generating augmentation-predictable real and not fake data. This formulation connects the learning objective of the generator and the arithmetic $-$ harmonic mean divergence under certain assumptions. We compare our method with state-of-the-art (SOTA) methods using the class-conditional BigGAN and unconditional StyleGAN2 architectures on data-limited CIFAR-10, CIFAR-100, FFHQ, LSUN-Cat, and five low-shot datasets. Experimental results demonstrate significant improvements of our method over SOTA methods in training data-efficient GANs.

artificial intelligence, discriminator, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2205.15677

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > China (0.14)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

Add feedback