AITopics | Liao, Yi

Liao, Yi

Dynamic Accumulated Attention Map for Interpreting Evolution of Decision-Making in Vision Transformer

Liao, Yi, Gao, Yongsheng, Zhang, Weichuan

arXiv.org Artificial Intelligence | Mar-18-2025

Various Vision Transformer (ViT) models have been widely used for image recognition tasks. However, existing visual explanation methods cannot display the attention flow hidden inside the inner structure of ViT models, which explains how the final attention regions are formed inside a ViT for its decision-making. In this paper, a novel visual explanation approach, Dynamic Accumulated Attention Map (DAAM), is proposed to provide a tool that can visualize, for the first time, the attention flow from the top to the bottom through ViT networks. To this end, a novel decomposition module is proposed to construct and store the spatial feature information by unlocking the [class] token generated by the self-attention module of each ViT block. The module can also obtain the channel importance coefficients by decomposing the classification score for supervised ViT models. Because self-supervised ViT models lack a classification score, we propose dimension-wise importance weights to compute the channel importance coefficients. The spatial features are linearly combined with the corresponding channel importance coefficients, forming the attention map for each block. The dynamic attention flow is revealed by accumulating the attention maps block by block. The contribution of this work is the visualization of the evolution dynamics of the decision-making attention for any intermediate block inside a ViT model, achieved by proposing a novel decomposition module and dimension-wise importance weights. The quantitative and qualitative analyses consistently validate the effectiveness and superior capacity of the proposed DAAM for interpreting not only ViT models with fully-connected layers as the classifier but also self-supervised ViT models. The code is available at https://github.com/ly9802/DynamicAccumulatedAttentionMap.
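The two steps named in the abstract (a per-block linear combination of spatial features with channel importance coefficients, followed by block-wise accumulation) can be sketched roughly as below. This is only an illustration, not the authors' implementation: the function names `block_attention_map` and `accumulate` are hypothetical, the feature maps and coefficients are assumed to be given already, and the accumulation is assumed to be a plain running sum.

```python
from typing import List

Map = List[List[float]]  # one H x W spatial map


def block_attention_map(features: List[Map], coeffs: List[float]) -> Map:
    """Weighted sum of per-channel spatial maps (one coefficient per channel)."""
    if len(features) != len(coeffs):
        raise ValueError("need exactly one coefficient per channel")
    height, width = len(features[0]), len(features[0][0])
    out = [[0.0] * width for _ in range(height)]
    for fmap, coeff in zip(features, coeffs):
        for i in range(height):
            for j in range(width):
                out[i][j] += coeff * fmap[i][j]
    return out


def accumulate(block_maps: List[Map]) -> List[Map]:
    """Running sum over blocks (an assumption): entry k sums maps 0..k."""
    accumulated: List[Map] = []
    running: Map = []
    for block_map in block_maps:
        if not running:
            running = [row[:] for row in block_map]
        else:
            running = [
                [a + b for a, b in zip(run_row, new_row)]
                for run_row, new_row in zip(running, block_map)
            ]
        accumulated.append([row[:] for row in running])
    return accumulated


if __name__ == "__main__":
    # Toy input: two channels on a 2 x 2 grid, reused for two blocks.
    features = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 2.0], [2.0, 0.0]]]
    per_block = block_attention_map(features, [0.5, 0.25])
    for k, acc in enumerate(accumulate([per_block, per_block])):
        print(k, acc)
```

Viewing the accumulated maps in order gives one frame per block, which is the sense in which the attention flow is "dynamic".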

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2503.1464

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

TGEA: An error-annotated dataset and benchmark tasks for text generation from pretrained language models

He, Jie, Peng, Bo, Liao, Yi, Liu, Qun, Xiong, Deyi

arXiv.org Artificial Intelligence | Mar-6-2025

In order to deeply understand the capability of pretrained language models in text generation and conduct a diagnostic evaluation, we propose TGEA, an error-annotated dataset with multiple benchmark tasks for text generation from pretrained language models (PLMs). We use carefully selected prompt words to guide GPT-2 to generate candidate sentences, from which we select 47K for error annotation. Crowdsourced workers manually check each of these sentences and detect 12K erroneous sentences. We create an error taxonomy to cover 24 types of errors occurring in these erroneous sentences according to the nature of errors with respect to linguistics and knowledge (e.g., common sense). For each erroneous span in PLM-generated sentences, we also detect another span that is closely associated with it. Each error is hence manually labeled with comprehensive annotations, including the span of the error, the associated span, the minimal correction to the error, the type of the error, and the rationale behind the error. Apart from the fully annotated dataset, we also present a detailed description of the data collection procedure, statistics and analysis of the dataset. This is the first dataset with comprehensive annotations for PLM-generated texts, which facilitates the diagnostic evaluation of PLM-based text generation. Furthermore, we use TGEA as a benchmark dataset and propose a series of automatic diagnosis tasks, including error detection, error type classification, associated span detection, and error rationale generation, to further promote future study on automatic error detection and correction in texts generated by pretrained language models.
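The five annotation fields listed in the abstract could be held in a record like the one below. This is a hypothetical sketch, not the released dataset schema: the class name, field names, character-offset span convention, and the error-type label are all assumptions, and the example sentence is made up for illustration rather than taken from TGEA.

```python
from dataclasses import dataclass
from typing import Tuple

Span = Tuple[int, int]  # assumed convention: [start, end) character offsets


@dataclass(frozen=True)
class ErrorAnnotation:
    text: str              # the PLM-generated sentence
    error_span: Span       # span of the error
    associated_span: Span  # span closely associated with the error
    correction: str        # minimal correction to the error
    error_type: str        # one label from the error taxonomy (placeholder here)
    rationale: str         # why the span is wrong

    def apply_correction(self) -> str:
        """Return the sentence with the erroneous span replaced."""
        start, end = self.error_span
        return self.text[:start] + self.correction + self.text[end:]


if __name__ == "__main__":
    sentence = "The sun rises in the west."  # invented example
    err = sentence.index("west")
    assoc = sentence.index("rises")
    annotation = ErrorAnnotation(
        text=sentence,
        error_span=(err, err + len("west")),
        associated_span=(assoc, assoc + len("rises")),
        correction="east",
        error_type="commonsense",  # placeholder label, not from the taxonomy
        rationale="The sun rises in the east.",
    )
    print(annotation.apply_correction())
```

Keeping the spans as offsets into the original sentence lets the same record serve the error detection, associated span detection, and correction tasks.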

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2503.04232

Country:

Europe (1.00)
Asia > China (0.94)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Sports (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Neuron Abandoning Attention Flow: Visual Explanation of Dynamics inside CNN Models

Liao, Yi, Gao, Yongsheng, Zhang, Weichuan

arXiv.org Artificial Intelligence | Dec-2-2024

In this paper, we present a Neuron Abandoning Attention Flow (NAFlow) method to address the open problem of visually explaining the attention evolution dynamics inside CNNs when making their classification decisions. A novel cascading neuron abandoning back-propagation algorithm is designed to trace the neurons in all layers of a CNN that are involved in making its prediction, addressing the problem of significant interference from abandoned neurons. Firstly, a Neuron Abandoning Back-Propagation (NA-BP) module is proposed to generate Back-Propagated Feature Maps (BPFMs) by using the inverse function of the intermediate layers of CNN models, on which the neurons not used for decision-making are abandoned. Meanwhile, the cascading NA-BP modules calculate the tensors of importance coefficients, which are linearly combined with the tensors of BPFMs to form the NAFlow. Secondly, to be able to visualize attention flow for similarity metric-based CNN models, a new channel contribution weights module is proposed to calculate the importance coefficients via the Jacobian matrix. The effectiveness of the proposed NAFlow is validated on nine widely used CNN models for various tasks of general image classification, contrastive learning classification, few-shot image classification, and image retrieval.

artificial intelligence, cnn model, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2412.01202

Country:

Asia > China (0.68)
Oceania > Australia > Queensland > Brisbane (0.14)

Genre: Research Report (0.40)

Industry:

Education (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

SATA: Spatial Autocorrelation Token Analysis for Enhancing the Robustness of Vision Transformers

Nikzad, Nick, Liao, Yi, Gao, Yongsheng, Zhou, Jun

arXiv.org Artificial Intelligence | Sep-29-2024

Over the past few years, vision transformers (ViTs) have consistently demonstrated remarkable performance across various visual recognition tasks. However, attempts to enhance their robustness have yielded limited success, mainly focusing on different training strategies, input patch augmentation, or network structural enhancements. These approaches often involve extensive training and fine-tuning, which are time-consuming and resource-intensive. To tackle these obstacles, we introduce a novel approach named Spatial Autocorrelation Token Analysis (SATA). By harnessing spatial relationships between token features, SATA enhances both the representational capacity and robustness of ViT models. This is achieved through the analysis and grouping of tokens according to their spatial autocorrelation scores prior to their input into the Feed-Forward Network (FFN) block of the self-attention mechanism. Importantly, SATA seamlessly integrates into existing pre-trained ViT baselines without requiring retraining or additional fine-tuning, while concurrently improving efficiency by reducing the computational load of the FFN units. Experimental results show that the baseline ViTs enhanced with SATA not only achieve a new state-of-the-art top-1 accuracy on ImageNet-1K image classification (94.9%) but also establish new state-of-the-art performance across multiple robustness benchmarks, including ImageNet-A (top-1=63.6%), ImageNet-R (top-1=79.2%), and ImageNet-C (mCE=13.6%), all without requiring additional training or fine-tuning of baseline models.

artificial intelligence, machine learning, transformer, (15 more...)

arXiv.org Artificial Intelligence

2409.1985

Genre: Research Report > Promising Solution (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

SoftDedup: an Efficient Data Reweighting Method for Speeding Up Language Model Pre-training

He, Nan, Xiong, Weichen, Liu, Hanwen, Liao, Yi, Ding, Lei, Zhang, Kai, Tang, Guohua, Han, Xiao, Yang, Wei

arXiv.org Artificial Intelligence | Jul-9-2024

The effectiveness of large language models (LLMs) is often hindered by duplicated data in their extensive pre-training datasets. Current approaches primarily focus on detecting and removing duplicates, which risks the loss of valuable information and neglects the varying degrees of duplication. To address this, we propose a soft deduplication method that maintains dataset integrity while selectively reducing the sampling weight of data with high commonness. Central to our approach is the concept of "data commonness", a metric we introduce to quantify the degree of duplication by measuring the occurrence probabilities of samples using an n-gram model. Empirical analysis shows that this method significantly improves training efficiency, achieving comparable perplexity scores with at least a 26% reduction in required training steps. Additionally, it enhances average few-shot downstream accuracy by 1.77% when trained for an equivalent duration. Importantly, this approach consistently improves performance, even on rigorously deduplicated datasets, indicating its potential to complement existing methods and become a standard pre-training process for LLMs.
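A rough sketch of the idea, assuming a smoothed bigram model as the n-gram model: every sample is kept, and its sampling weight shrinks as its "data commonness" grows. The names `train_bigram`, `commonness`, and `sampling_weights`, the whitespace tokenizer, the add-one smoothing, the geometric-mean score, and the inverse-power weighting with `temperature` are all illustrative assumptions, not the paper's actual formulation.

```python
import math
from collections import Counter
from typing import List, Set, Tuple


def train_bigram(corpus: List[str]) -> Tuple[Counter, Counter, Set[str]]:
    """Count bigram contexts and bigrams over whitespace tokens."""
    contexts: Counter = Counter()
    bigrams: Counter = Counter()
    vocab: Set[str] = set()
    for doc in corpus:
        tokens = ["<s>"] + doc.split()
        vocab.update(tokens)
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, vocab


def commonness(doc: str, contexts: Counter, bigrams: Counter, vocab: Set[str]) -> float:
    """Geometric mean of add-one-smoothed bigram probabilities of the sample."""
    tokens = ["<s>"] + doc.split()
    log_prob = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        log_prob += math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + len(vocab)))
    return math.exp(log_prob / max(len(tokens) - 1, 1))


def sampling_weights(corpus: List[str], temperature: float = 1.0) -> List[float]:
    """Soft deduplication: down-weight common samples instead of deleting them."""
    contexts, bigrams, vocab = train_bigram(corpus)
    raw = [commonness(d, contexts, bigrams, vocab) ** (-temperature) for d in corpus]
    total = sum(raw)
    return [r / total for r in raw]


if __name__ == "__main__":
    corpus = ["the cat sat"] * 3 + ["a rare zebra sings"]
    for doc, weight in zip(corpus, sampling_weights(corpus)):
        print(f"{weight:.3f}  {doc}")
```

The weights could then be passed to a weighted sampler such as `random.choices(corpus, weights=...)`, so duplicated text is seen less often but never removed outright.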

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2407.06654

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)

Feature Activation Map: Visual Explanation of Deep Learning Models for Image Classification

Liao, Yi, Gao, Yongsheng, Zhang, Weichuan

arXiv.org Artificial Intelligence | Jul-11-2023

Decisions made by convolutional neural networks (CNNs) can be understood and explained by visualizing discriminative regions on images. To this end, Class Activation Map (CAM) based methods were proposed as powerful interpretation tools, making the predictions of deep learning models more explainable, transparent, and trustworthy. However, all the CAM-based methods (e.g., CAM, Grad-CAM, and Relevance-CAM) can only be used for interpreting CNN models with fully-connected (FC) layers as a classifier. It is worth noting that many deep learning models classify images without FC layers, e.g., in few-shot learning image classification, contrastive learning image classification, and image retrieval tasks. In this work, a post-hoc interpretation tool named feature activation map (FAM) is proposed, which can interpret deep learning models without FC layers as a classifier. In the proposed FAM algorithm, the channel-wise contribution weights are derived from the similarity scores between two image embeddings. The activation maps are linearly combined with the corresponding normalized contribution weights, forming the explanation map for visualization. The quantitative and qualitative experiments conducted on ten deep learning models for few-shot image classification, contrastive learning image classification and image retrieval tasks demonstrate the effectiveness of the proposed FAM algorithm.
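The weighting step described above can be sketched as follows, under several assumptions that are not taken from the paper: each embedding is the global average pool of the activation maps, the similarity score is a dot product so that channel c contributes `query[c] * support[c]`, weights are normalized by the sum of absolute contributions, and negative evidence is clipped at zero. All function names are hypothetical.

```python
from typing import List

Map = List[List[float]]  # one H x W activation map


def global_average_pool(maps: List[Map]) -> List[float]:
    """Embedding with one value per channel: the mean of that channel's map."""
    return [sum(sum(row) for row in m) / (len(m) * len(m[0])) for m in maps]


def contribution_weights(query_emb: List[float], support_emb: List[float]) -> List[float]:
    """Per-channel share of the dot-product similarity, normalized to sum(|w|) = 1."""
    contrib = [q * s for q, s in zip(query_emb, support_emb)]
    norm = sum(abs(c) for c in contrib)
    return [c / norm if norm else 0.0 for c in contrib]


def explanation_map(maps: List[Map], weights: List[float]) -> Map:
    """Weighted sum of activation maps, clipped at zero."""
    height, width = len(maps[0]), len(maps[0][0])
    out = [[0.0] * width for _ in range(height)]
    for m, weight in zip(maps, weights):
        for i in range(height):
            for j in range(width):
                out[i][j] += weight * m[i][j]
    return [[max(v, 0.0) for v in row] for row in out]


if __name__ == "__main__":
    # Toy query image with two channels; the support embedding favors channel 0.
    query_maps = [[[4.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 4.0]]]
    support_emb = [3.0, 1.0]
    weights = contribution_weights(global_average_pool(query_maps), support_emb)
    print(weights)
    print(explanation_map(query_maps, weights))
```

Because the weights come from a similarity score rather than a classifier's logits, the same recipe applies to retrieval or few-shot models that have no FC layer.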

artificial intelligence, explanation map, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2307.05017

Country:

Asia (0.46)
North America > United States (0.46)
Oceania > Australia > Queensland > Brisbane (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

GPT-based Generation for Classical Chinese Poetry

Liao, Yi, Wang, Yasheng, Liu, Qun, Jiang, Xin

arXiv.org Artificial Intelligence | Jul-12-2019

We present a simple yet effective method for generating high-quality classical Chinese poetry with a Generative Pre-trained Language Model (GPT). The method adopts a simple GPT model, without using any human-crafted rules or features, or designing any additional neural components. While the proposed model learns to generate various forms of classical Chinese poems, including Jueju, Lüshi, various Cipai and Couplets, the generated poems are of very high quality. We also propose and implement a method to fine-tune the model to generate acrostic poetry. To the best of our knowledge, this is the first work to employ GPT in developing a poetry generation system. We will release an online demonstration system in the near future to show the generation capability of the proposed method for classical Chinese poetry.

artificial intelligence, natural language, poem, (19 more...)

arXiv.org Artificial Intelligence

1907.00151

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Reader-Aware Multi-Document Summarization via Sparse Coding

Li, Piji (The Chinese University of Hong Kong) | Bing, Lidong (Carnegie Mellon University) | Lam, Wai (The Chinese University of Hong Kong) | Li, Hang (Huawei Technologies) | Liao, Yi (The Chinese University of Hong Kong)

AAAI Conferences | Jul-15-2015

We propose a new multi-document summarization (MDS) paradigm called reader-aware multi-document summarization (RA-MDS). Specifically, a set of reader comments associated with the news reports are also collected. The generated summaries from the reports for the event should be salient according to not only the reports but also the reader comments. To tackle this RA-MDS problem, we propose a sparse-coding-based method that is able to calculate the salience of the text units by jointly considering news reports and reader comments. Another reader-aware characteristic of our framework is to improve linguistic quality via entity rewriting. The rewriting consideration is jointly assessed together with other summarization requirements under a unified optimization model. To support the generation of compressive summaries via optimization, we explore a finer syntactic unit, namely, noun/verb phrase. In this work, we also generate a data set for conducting RA-MDS. Extensive experiments on this data set and some classical data sets demonstrate the effectiveness of our proposed approach.

deep learning, neural network, summarization, (22 more...)

AAAI Conferences

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

Asia (1.00)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
