AITopics | Zhong, Tianqi

Zhong, Tianqi

Out-of-distribution Generalization for Total Variation based Invariant Risk Minimization

Wang, Yuanchao, Lai, Zhao-Rong, Zhong, Tianqi

arXiv.org Artificial Intelligence, Feb-28-2025

Invariant risk minimization is an important general machine learning framework that has recently been interpreted as a total variation model (IRM-TV). However, how to improve out-of-distribution (OOD) generalization in the IRM-TV setting remains unsolved. In this paper, we extend IRM-TV to a Lagrangian multiplier model named OOD-TV-IRM. We find that the autonomous TV penalty hyperparameter is exactly the Lagrangian multiplier. Thus OOD-TV-IRM is essentially a primal-dual optimization model, where the primal optimization minimizes the entire invariant risk and the dual optimization strengthens the TV penalty. The objective is to reach a semi-Nash equilibrium where the balance between the training loss and OOD generalization is maintained. We also develop a convergent primal-dual algorithm that facilitates an adversarial learning scheme. Experimental results show that OOD-TV-IRM outperforms IRM-TV in most situations. Traditional risk minimization methods such as Empirical Risk Minimization (ERM) are widely used in machine learning. ERM generally assumes that both training and test data come from the same distribution. Based on this assumption, ERM learns model parameters by minimizing the average loss on the training data.
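The ERM baseline described at the end of the abstract can be sketched in a few lines. This is only a minimal illustration of "minimize the average loss on the training data", assuming a one-dimensional linear model, squared loss, and synthetic data; the names average_loss and erm_fit are hypothetical, and nothing here reproduces IRM-TV or OOD-TV-IRM.

```python
import random


def average_loss(w, b, data):
    """Empirical risk: mean squared error of y = w * x + b over the data."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def erm_fit(data, lr=0.05, steps=2000):
    """Minimize the average training loss with plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic training set (an assumption for illustration): y = 3x + 1 plus noise.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
train = [(x, 3.0 * x + 1.0 + rng.gauss(0.0, 0.1)) for x in xs]

w, b = erm_fit(train)
print("fitted parameters:", w, b)
print("average training loss:", average_loss(w, b, train))
```

Because the fit depends only on the training sample, it carries no guarantee when the test data come from a different distribution, which is the gap that invariant risk minimization targets.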

artificial intelligence, generalization, machine learning, (18 more...)

2502.19665

Country:

Asia > China (0.14)
Africa > Rwanda (0.14)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation

Zhong, Tianqi, Li, Zhaoyi, Wang, Quan, Song, Linqi, Wei, Ying, Lian, Defu, Mao, Zhendong

arXiv.org Artificial Intelligence, Jun-3-2024

Compositional generalization, the model's ability to generate text with new attribute combinations obtained by recombining single attributes from the training data, is a crucial property for multi-aspect controllable text generation (MCTG) methods. Nonetheless, a comprehensive compositional generalization evaluation benchmark for MCTG is still lacking. We propose CompMCTG, a benchmark encompassing diverse multi-aspect labeled datasets and a crafted three-dimensional evaluation protocol, to holistically evaluate the compositional generalization of MCTG approaches. We observe that existing MCTG methods generally suffer a noticeable performance drop in compositional testing. To mitigate this issue, we introduce Meta-MCTG, a training framework incorporating meta-learning, where we enable models to learn how to generalize by simulating compositional generalization scenarios in the training phase. We demonstrate the effectiveness of Meta-MCTG by improving compositional testing performance (by at most 3.64%) in 94.4% of cases.
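The notion of a compositional test set can be made concrete with a small sketch. It assumes two hypothetical aspects (sentiment and topic) and a hypothetical helper compositional_split; it illustrates only the general idea of holding out attribute combinations whose individual attributes all appear in training, and is not the CompMCTG evaluation protocol.

```python
from itertools import product


def compositional_split(aspects, held_out):
    """Split all attribute combinations into seen (train) and unseen (test).

    aspects:  dict mapping an aspect name to its list of attribute values.
    held_out: iterable of combinations (tuples) reserved for testing.
    """
    names = list(aspects)
    combos = list(product(*(aspects[name] for name in names)))
    held = set(held_out)
    train = [c for c in combos if c not in held]
    test = [c for c in combos if c in held]

    # Every single attribute in a test combination must occur in training;
    # otherwise the test would probe unseen attributes, not recombination.
    for i, name in enumerate(names):
        seen_values = {c[i] for c in train}
        for c in test:
            if c[i] not in seen_values:
                raise ValueError(f"{name}={c[i]!r} never appears in training")
    return train, test


# Hypothetical aspects and held-out combination, for illustration only.
aspects = {
    "sentiment": ["positive", "negative"],
    "topic": ["sports", "tech"],
}
train, test = compositional_split(aspects, held_out=[("positive", "tech")])
print("train combinations:", train)
print("test combinations:", test)
```

The validity check is the important design choice: holding out every combination that contains a given attribute would turn the split into an unseen-attribute test rather than a recombination test.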

large language model, machine learning, natural language, (20 more...)

2404.04232

Country:

Europe (1.00)
Asia > China (0.28)
Asia > Middle East > UAE (0.14)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.45)

Industry:

Leisure & Entertainment (0.46)
Energy (0.45)
Consumer Products & Services (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Air-Decoding: Attribute Distribution Reconstruction for Decoding-Time Controllable Text Generation

Zhong, Tianqi, Wang, Quan, Han, Jingxuan, Zhang, Yongdong, Mao, Zhendong

arXiv.org Artificial Intelligence, Nov-1-2023

Controllable text generation (CTG) aims to generate text with desired attributes, and decoding-time-based methods have shown promising performance on this task. However, in this paper, we identify the phenomenon of Attribute Collapse for the first time. It causes the fluency of generated text to rapidly decrease when the control strength exceeds a critical value, rendering the text completely unusable. This limitation hinders the effectiveness of decoding methods in achieving high levels of controllability. To address this problem, we propose a novel lightweight decoding framework named Air-Decoding. Its main idea is to reconstruct the attribute distributions so that the weights of attribute words and non-attribute words are balanced, which yields more fluent text. Specifically, we train prefixes by prefix-tuning to obtain attribute distributions. Then we design a novel attribute distribution reconstruction method to balance the obtained distributions and use the reconstructed distributions to guide language models for generation, effectively avoiding the issue of Attribute Collapse. Experiments on multiple CTG tasks show that our method achieves a new state-of-the-art control performance.
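To give some intuition for why a large control strength can hurt fluency, the toy sketch below uses a generic weighted-decoding rule, p(token) proportional to p_lm(token) * p_attr(token)^strength. This rule, the four-word vocabulary, and all probabilities are assumptions made for illustration; the sketch is neither the Air-Decoding method nor the paper's formal definition of Attribute Collapse.

```python
def controlled_distribution(lm_probs, attr_probs, strength):
    """Reweight next-token probabilities by attribute scores.

    Generic weighted decoding: p(t) is proportional to
    lm_probs[t] * attr_probs[t] ** strength, renormalized over the vocabulary.
    """
    scores = {t: lm_probs[t] * attr_probs[t] ** strength for t in lm_probs}
    total = sum(scores.values())
    return {t: s / total for t, s in scores.items()}


def attribute_mass(dist, attribute_words):
    """Total probability assigned to attribute words."""
    return sum(p for t, p in dist.items() if t in attribute_words)


# Hypothetical toy vocabulary: language-model probabilities and the score
# each token receives from a hypothetical "positive sentiment" scorer.
lm_probs = {"the": 0.5, "movie": 0.3, "great": 0.15, "awesome": 0.05}
attr_probs = {"the": 0.5, "movie": 0.5, "great": 0.9, "awesome": 0.95}
attribute_words = {"great", "awesome"}

for strength in (0.0, 1.0, 5.0, 20.0):
    dist = controlled_distribution(lm_probs, attr_probs, strength)
    print(strength, round(attribute_mass(dist, attribute_words), 4))
```

As the strength grows, nearly all probability mass moves onto attribute words, so function words and other non-attribute words that fluent text needs become very unlikely. Balancing the two groups of words is the stated goal of the reconstruction step in the abstract.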

artificial intelligence, decoding-time controllable text generation, natural language, (2 more...)

2310.14892

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.89)
