Tang, Hongxuan
UGen: Unified Autoregressive Multimodal Model with Progressive Vocabulary Learning
Tang, Hongxuan, Liu, Hao, Xiao, Xinyan
We introduce UGen, a unified autoregressive multimodal model that simultaneously demonstrates strong performance across text processing, image understanding, and image generation tasks. UGen converts both texts and images into discrete token sequences and uses a single transformer to generate them uniformly in an autoregressive manner. To address the challenges of unified multimodal learning, UGen is trained with a novel mechanism, progressive vocabulary learning, in which visual token IDs are incrementally activated and integrated into training, ultimately enhancing the effectiveness of unified multimodal learning. Experiments on comprehensive text and image tasks show that UGen achieves a significant overall performance improvement of 13.3% over the vanilla unified autoregressive method, and it also delivers competitive results on all tasks compared with several task-specific models.
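As a rough, hypothetical sketch of the progressive vocabulary learning idea (the vocabulary sizes, the linear activation schedule, and the `mask_inactive_logits` helper below are our own assumptions, not details taken from the paper), one way to incrementally activate visual token IDs is to mask the logits of not-yet-activated IDs so they receive no probability mass during training:

```python
import torch

# Hypothetical layout: text token IDs in [0, TEXT_VOCAB), visual token IDs in
# [TEXT_VOCAB, TEXT_VOCAB + VISUAL_VOCAB); visual IDs are activated gradually.
TEXT_VOCAB, VISUAL_VOCAB = 32000, 8192
TOTAL_VOCAB = TEXT_VOCAB + VISUAL_VOCAB

def active_visual_tokens(step, total_steps, visual_vocab=VISUAL_VOCAB):
    """Assumed linear schedule: the number of activated visual IDs grows with training progress."""
    frac = min(1.0, step / float(total_steps))
    return int(frac * visual_vocab)

def mask_inactive_logits(logits, step, total_steps):
    """Set logits of not-yet-activated visual tokens to -inf so the softmax ignores them."""
    n_active = active_visual_tokens(step, total_steps)
    masked = logits.clone()
    masked[..., TEXT_VOCAB + n_active:] = float("-inf")
    return masked

# At the halfway point, only half of the visual vocabulary competes in the softmax.
logits = torch.randn(2, 16, TOTAL_VOCAB)                         # (batch, seq, vocab)
probs = torch.softmax(mask_inactive_logits(logits, 5000, 10000), dim=-1)
print(probs[..., TEXT_VOCAB + VISUAL_VOCAB // 2:].sum().item())  # 0.0: inactive IDs get no mass
```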
Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models
Wang, Shaobo, Tang, Hongxuan, Wang, Mingyang, Zhang, Hongrui, Liu, Xuyang, Li, Weiya, Hu, Xuming, Zhang, Linfeng
The debate between self-interpretable models and post-hoc explanations for black-box models is central to Explainable AI (XAI). Self-interpretable models, such as concept-based networks, offer insights by connecting decisions to human-understandable concepts but often struggle with performance and scalability. Conversely, post-hoc methods like Shapley values, while theoretically robust, are computationally expensive and resource-intensive. To bridge the gap between these two lines of research, we propose a novel method that combines their strengths, providing theoretically guaranteed self-interpretability for black-box models without compromising prediction accuracy. Specifically, we introduce a parameter-efficient pipeline, AutoGnothi, which integrates a small side network into the black-box model, allowing it to generate Shapley value explanations without changing the original network parameters. This side-tuning approach significantly reduces memory, training, and inference costs, outperforming traditional parameter-efficient methods, for which full fine-tuning serves as the optimal baseline. AutoGnothi enables the black-box model to predict and explain its predictions with minimal overhead. Extensive experiments show that AutoGnothi offers accurate explanations for both vision and language tasks, delivering superior computational efficiency with comparable interpretability.

Explainable AI (XAI) has gained increasing significance as AI systems are widely deployed in both vision (Dosovitskiy, 2020; Radford et al., 2021; Kirillov et al., 2023) and language domains (Devlin et al., 2019; Brown, 2020; Achiam et al., 2023). Ensuring interpretability in these systems is vital for fostering trust, ensuring fairness, and adhering to legal standards, particularly for complex models such as transformers. As illustrated in Figure 1(a), the ideal paradigm for XAI involves designing inherently transparent models that deliver superior performance.
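A minimal side-tuning sketch follows; the architecture, dimensions, and class name `SideExplainer` are assumptions for illustration rather than the released AutoGnothi design. It shows the core idea of the abstract: the backbone stays frozen while a small trainable side head maps its hidden states to per-token attribution scores.

```python
import torch
import torch.nn as nn

class SideExplainer(nn.Module):
    """Frozen backbone + small trainable side head that emits one attribution score per token."""
    def __init__(self, backbone, hidden_dim, side_dim=64):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():       # original network parameters stay untouched
            p.requires_grad_(False)
        self.side_head = nn.Sequential(            # the only trainable parameters
            nn.Linear(hidden_dim, side_dim),
            nn.GELU(),
            nn.Linear(side_dim, 1),
        )

    def forward(self, x):
        with torch.no_grad():
            hidden = self.backbone(x)               # (batch, seq, hidden_dim) token features
        return self.side_head(hidden).squeeze(-1)   # (batch, seq) attribution per token

# Toy usage with a stand-in backbone; in practice the backbone would be a pretrained
# transformer and the side head would be trained against Shapley-value targets.
backbone = nn.Sequential(nn.Linear(32, 128), nn.GELU())
model = SideExplainer(backbone, hidden_dim=128)
print(model(torch.randn(4, 10, 32)).shape)          # torch.Size([4, 10])
```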
Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation
Chen, Jingchang, Tang, Hongxuan, Chu, Zheng, Chen, Qianglong, Wang, Zekun, Liu, Ming, Qin, Bing
Despite recent progress made by large language models in code generation, they still struggle with programs that meet complex requirements. Recent work uses plan-and-solve decomposition to reduce complexity and leverages self-tests to refine the generated program. Yet, anticipating deeply nested requirements in advance can be challenging, and the tests must be accurate for self-improvement to succeed. To this end, we propose FunCoder, a code generation framework that incorporates a divide-and-conquer strategy with functional consensus. Specifically, FunCoder recursively branches off sub-functions as smaller goals during code generation, represented as a tree hierarchy. These sub-functions are then composed to attain more complex objectives. Additionally, we designate functions via a consensus formed by identifying similarities in program behavior, mitigating error propagation. FunCoder outperforms state-of-the-art methods by +9.8% on average on HumanEval, MBPP, xCodeEval, and MATH with GPT-3.5 and GPT-4. Moreover, our method demonstrates superiority on smaller models: with FunCoder, StableCode-3b surpasses GPT-3.5 by +18.6% and achieves 97.7% of GPT-4's performance on HumanEval. Further analysis reveals that our proposed dynamic function decomposition is capable of handling complex requirements, and that functional consensus prevails over self-testing in correctness evaluation.
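The sketch below illustrates one plausible reading of functional consensus (the `functional_consensus` helper, its agreement scoring, and the toy candidates are our own assumptions, not the FunCoder implementation): candidate implementations of the same sub-function are executed on shared inputs, and the candidate whose outputs agree most with the others is kept.

```python
def functional_consensus(candidates, test_inputs):
    """Pick the candidate whose behavior agrees most with the other candidates.

    candidates: list of callables implementing the same sub-function.
    test_inputs: list of argument tuples the candidates are executed on.
    """
    def behavior(fn):
        outputs = []
        for args in test_inputs:
            try:
                outputs.append(fn(*args))
            except Exception:
                outputs.append(None)                 # a crash counts as disagreement
        return outputs

    behaviors = [behavior(fn) for fn in candidates]
    scores = []
    for i, bi in enumerate(behaviors):
        agreement = sum(
            sum(a == b for a, b in zip(bi, bj))
            for j, bj in enumerate(behaviors) if j != i
        )
        scores.append(agreement)
    return candidates[max(range(len(candidates)), key=scores.__getitem__)]

# Toy example: two correct candidates out-vote one buggy one.
cands = [lambda x: x * 2, lambda x: x + x, lambda x: x ** 2]
best_fn = functional_consensus(cands, [(1,), (2,), (3,)])
print(best_fn(5))   # 10 -- the consensus implementation doubles its input
```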
A Fine-grained Interpretability Evaluation Benchmark for Neural NLP
Wang, Lijie, Shen, Yaozong, Peng, Shuyuan, Zhang, Shuai, Xiao, Xinyan, Liu, Hao, Tang, Hongxuan, Chen, Ying, Wu, Hua, Wang, Haifeng
While there is increasing concern about the interpretability of neural models, the evaluation of interpretability remains an open problem due to the lack of proper evaluation datasets and metrics. In this paper, we present a novel benchmark to evaluate the interpretability of both neural models and saliency methods. This benchmark covers three representative NLP tasks: sentiment analysis, textual similarity, and reading comprehension, each provided with both English and Chinese annotated data. In order to precisely evaluate interpretability, we provide token-level rationales that are carefully annotated to be sufficient, compact, and comprehensive. We also design a new metric, i.e., the consistency between the rationales before and after perturbations, to uniformly evaluate interpretability across different types of tasks. Based on this benchmark, we conduct experiments on three typical models with three saliency methods, and unveil their strengths and weaknesses in terms of interpretability. We will release this benchmark at https://www.luge.ai/#/luge/task/taskDetail?taskId=15 and hope it can facilitate research on building trustworthy systems.
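As a hedged sketch of how a perturbation-consistency metric of this kind could be computed (the top-k rationale extraction and F1-style overlap below are illustrative choices, not necessarily the benchmark's exact formula), one can compare the salient tokens selected before and after a perturbation:

```python
def top_k_rationale(saliency, k):
    """Indices of the k most salient tokens."""
    return set(sorted(range(len(saliency)), key=lambda i: saliency[i], reverse=True)[:k])

def consistency(saliency_before, saliency_after, k=3):
    """F1 overlap between rationales extracted before and after a perturbation."""
    before = top_k_rationale(saliency_before, k)
    after = top_k_rationale(saliency_after, k)
    overlap = len(before & after)
    if overlap == 0:
        return 0.0
    precision = overlap / len(after)
    recall = overlap / len(before)
    return 2 * precision * recall / (precision + recall)

# Toy saliency maps over a 6-token sentence, before and after a synonym substitution.
before = [0.9, 0.1, 0.7, 0.05, 0.8, 0.2]
after  = [0.85, 0.15, 0.1, 0.05, 0.75, 0.6]
print(round(consistency(before, after, k=3), 3))   # 0.667: two of three rationale tokens survive
```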