AITopics | tpdm

Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation

Ye, Zilyu, Chen, Zhiyang, Li, Tiancheng, Huang, Zemin, Luo, Weijian, Qi, Guo-Jun

arXiv.org Artificial Intelligence, Dec-2-2024

Diffusion and flow models have achieved remarkable success in applications such as text-to-image generation. However, these models typically rely on the same predetermined denoising schedule for every prompt during inference, which potentially limits both inference efficiency and flexibility across different prompts. In this paper, we argue that the optimal noise schedule should adapt to each inference instance, and introduce the Time Prediction Diffusion Model (TPDM) to accomplish this. TPDM employs a plug-and-play Time Prediction Module (TPM) that predicts the next noise level based on current latent features at each denoising step. We train the TPM using reinforcement learning, aiming to maximize a reward that discounts the final image quality by the number of denoising steps. With such an adaptive scheduler, TPDM not only generates high-quality images that are closely aligned with human preferences but also adjusts the number and timing of denoising steps on the fly, enhancing both performance and efficiency. We train TPDMs on multiple diffusion model benchmarks. With the Stable Diffusion 3 Medium architecture, TPDM achieves an aesthetic score of 5.44 and a human preference score (HPS) of 29.59, reaching better performance with around 50% fewer denoising steps. We will release our best model alongside this paper.
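
The adaptive schedule described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`adaptive_denoise`, `discounted_reward`, `toy_denoise_step`, `toy_predictor`), the geometric form of the discount, the value `gamma = 0.97`, and the `min_delta` safeguard are all assumptions made for illustration. The sketch only shows the general shape of a loop in which a predictor picks each next noise level, plus one plausible way a reward could trade image quality against step count.

```python
from typing import Callable, List, Tuple

Latent = List[float]


def adaptive_denoise(
    latent: Latent,
    denoise_step: Callable[[Latent, float, float], Latent],
    predict_next_time: Callable[[Latent, float], float],
    t_start: float = 1.0,
    t_end: float = 0.0,
    max_steps: int = 50,
    min_delta: float = 1e-3,
) -> Tuple[Latent, List[float]]:
    """Denoise from t_start to t_end, asking a predictor for each next noise level.

    Returns the final latent and the list of noise levels visited.
    """
    t = t_start
    times = [t]
    for step in range(max_steps):
        if t <= t_end:
            break
        # Hypothetical stand-in for the TPM: propose the next noise level.
        t_next = predict_next_time(latent, t)
        # Force the noise level to decrease by at least min_delta.
        t_next = min(t_next, t - min_delta)
        # Snap to t_end when close enough, or when the step budget runs out.
        if t_next <= t_end + min_delta or step == max_steps - 1:
            t_next = t_end
        latent = denoise_step(latent, t, t_next)
        t = t_next
        times.append(t)
    return latent, times


def discounted_reward(quality: float, num_steps: int, gamma: float = 0.97) -> float:
    """Assumed reward shape: quality shrunk geometrically per step taken."""
    return (gamma ** num_steps) * quality


def toy_denoise_step(latent: Latent, t: float, t_next: float) -> Latent:
    # Toy Euler-style update with a made-up velocity equal to the latent itself.
    return [x + (t_next - t) * x for x in latent]


def toy_predictor(latent: Latent, t: float) -> float:
    # Toy rule: always halve the current noise level, ignoring the latent.
    return 0.5 * t


final_latent, times = adaptive_denoise([1.0, -2.0], toy_denoise_step, toy_predictor)
num_steps = len(times) - 1
print(times[0], times[-1], num_steps)
print(discounted_reward(quality=1.0, num_steps=num_steps))
```

In a real system the predictor would be a learned module reading latent features and the denoising step would call the diffusion model. Here both are toy functions so the loop runs on its own. Under the assumed reward, a schedule that reaches the same quality in fewer steps scores higher, which is the trade-off the abstract describes.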

arXiv:2412.01243

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > China (0.04)

Genre:

Research Report (0.64)
Workflow (0.47)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

TPDM: Selectively Removing Positional Information for Zero-shot Translation via Token-Level Position Disentangle Module

Chen, Xingran, Zhang, Ge, Fu, Jie

arXiv.org Artificial Intelligence, May-31-2023

Because Multilingual Neural Machine Translation (MNMT) is capable of zero-shot translation, many works have sought to fully exploit its potential in this setting. It is often hypothesized that positional information may hinder MNMT from producing a robust encoded representation for decoding. However, previous approaches treat all positional information equally and thus cannot selectively remove certain positional information. In sharp contrast, this paper investigates how to learn to selectively preserve useful positional information. We describe, from a linguistic perspective, the specific mechanism by which positional information influences MNMT at the token level. Based on this explanation, we design a token-level position disentangle module (TPDM) framework to disentangle positional information at the token level. Our experiments demonstrate that our framework improves zero-shot translation by a large margin while reducing the performance loss in the supervised direction compared to previous works.

arXiv:2305.19857

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Dominican Republic (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
