Meng, Deyu
Exploring Imbalanced Annotations for Effective In-Context Learning
Gao, Hongfu, Zhang, Feipeng, Zeng, Hao, Meng, Deyu, Jing, Bingyi, Wei, Hongxin
Large language models (LLMs) have shown impressive performance on downstream tasks through in-context learning (ICL), which heavily relies on the demonstrations selected from annotated datasets. Existing selection methods often hinge on the distribution of the annotated dataset, which can be long-tailed in real-world scenarios. In this work, we show that imbalanced class distributions in annotated datasets significantly degrade the performance of ICL across various tasks and selection methods. Moreover, traditional rebalancing methods fail to mitigate the class-imbalance issue in ICL. Our method is motivated by decomposing the distributional difference between the annotated and test datasets into two-component weights: class-wise weights and conditional bias. The key idea is to estimate the conditional bias by minimizing the empirical error on a balanced validation dataset, and to employ the two-component weights to modify the original scoring functions during selection. Our approach prevents selecting too many demonstrations from a single class while preserving the effectiveness of the original selection methods. Extensive experiments demonstrate the effectiveness of our method, improving the average accuracy by up to 5.46 on common benchmarks with imbalanced datasets.
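To make the two-component reweighting concrete, here is a minimal sketch of how per-class weights and a per-class bias could adjust an existing selection score before top-k demonstration selection. The combination form, names, and toy numbers below are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def select_demonstrations(scores, labels, class_weights, cond_bias, k=8):
    # scores[i]: original score of candidate i (e.g., similarity to the query)
    # labels[i]: class of candidate i; weights/bias are indexed by class
    adjusted = class_weights[labels] * scores + cond_bias[labels]
    return np.argsort(-adjusted)[:k]   # indices of the k best candidates

# Toy example: 3 classes, class 0 heavily over-represented in the pool.
rng = np.random.default_rng(0)
labels = rng.choice(3, size=100, p=[0.8, 0.1, 0.1])
scores = rng.random(100)
class_weights = np.array([0.5, 1.5, 1.5])  # down-weight the head class
cond_bias = np.zeros(3)                    # would be fit on a balanced val set
print(select_demonstrations(scores, labels, class_weights, cond_bias))
```

With the head class down-weighted, the top-k set is far less likely to be dominated by a single class, while ranking within each class still follows the original scores.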
S-LoRA: Scalable Low-Rank Adaptation for Class Incremental Learning
Wu, Yichen, Piao, Hongming, Huang, Long-Kai, Wang, Renzhen, Li, Wanhua, Pfister, Hanspeter, Meng, Deyu, Ma, Kede, Wei, Ying
Continual Learning (CL) with foundation models has recently emerged as a promising approach to harnessing the power of pre-trained models for sequential tasks. Existing prompt-based methods generally use a prompt selection mechanism to select prompts relevant to the test query for further processing. However, the success of these methods largely depends on the precision of the selection mechanism, which also raises scalability issues, as the additional computational overhead grows with the number of tasks. To overcome these issues, we propose a Scalable Low-Rank Adaptation (S-LoRA) method for class incremental learning, which incrementally decouples the learning of the direction and magnitude of LoRA parameters. S-LoRA supports efficient inference by employing the last-stage trained model for direct testing, without the selection process. Our theoretical and empirical analysis demonstrates that S-LoRA tends to follow a low-loss trajectory that converges to an overlapping low-loss region, resulting in an excellent stability-plasticity trade-off in CL. Furthermore, based on our findings, we develop variants of S-LoRA with further improved scalability.
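The following is a minimal sketch of the direction/magnitude decoupling idea applied to a LoRA layer: the low-rank product supplies a normalized direction and a separate scalar supplies the magnitude. The class name, the unit-norm normalization, and the scalar parameterization are assumptions for illustration, not the paper's implementation:

```python
import torch
import torch.nn as nn

class DecoupledLoRALinear(nn.Module):
    """Hypothetical LoRA layer whose update is factored into a direction
    (normalized low-rank product B @ A) and a learnable magnitude."""
    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                # freeze pre-trained weights
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.magnitude = nn.Parameter(torch.zeros(1))  # learned separately

    def forward(self, x):
        delta = self.B @ self.A                    # low-rank direction
        delta = delta / (delta.norm() + 1e-8)      # unit-norm direction
        return self.base(x) + self.magnitude * (x @ delta.t())
```

Separating the two factors makes it possible, in principle, to update them on different schedules across incremental stages, which is the spirit of the decoupling described above.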
Is AI Robust Enough for Scientific Research?
Zhang, Jun-Jie, Song, Jiahao, Wang, Xiu-Cheng, Li, Fu-Peng, Liu, Zehan, Chen, Jian-Nan, Dang, Haoning, Wang, Shiyao, Zhang, Yiyan, Xu, Jianhui, Shi, Chunxiang, Wang, Fei, Pang, Long-Gang, Cheng, Nan, Zhang, Weiwei, Zhang, Duo, Meng, Deyu
Artificial Intelligence (AI) has become a transformative tool in scientific research, driving breakthroughs across numerous disciplines [5-11]. Despite these achievements, neural networks, which form the backbone of many AI systems, exhibit significant vulnerabilities. One of the most concerning issues is their susceptibility to adversarial attacks [1, 2, 12, 13]. These attacks make small, often imperceptible changes to the input data that cause AI systems to produce incorrect predictions (Figure 1). This highlights a critical weakness: AI systems can fail under minimal perturbations, a phenomenon unseen in classical methods. The impact of adversarial attacks has been extensively studied in the context of image classification [14-16].
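As a concrete illustration of the vulnerability discussed here (not a method from this paper), the classic one-step fast gradient sign method (FGSM) perturbs each input dimension by a tiny amount along the sign of the loss gradient, which is often enough to flip a classifier's prediction:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=0.01):
    """One-step FGSM: a per-pixel perturbation of magnitude eps that
    maximally increases the loss to first order."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + eps * x.grad.sign()    # move along the loss-gradient sign
    return x_adv.clamp(0, 1).detach()  # keep pixels in a valid [0, 1] range
```

Even with eps around 1/255, such perturbations are visually imperceptible yet can drastically change model outputs, which is exactly the failure mode absent from classical, non-learned methods.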
Chain-of-Restoration: Multi-Task Image Restoration Models are Zero-Shot Step-by-Step Universal Image Restorers
Cao, Jin, Meng, Deyu, Cao, Xiangyong
While previous image restoration (IR) methods have often concentrated on isolated degradations, recent research has increasingly focused on composite degradations that involve a complex combination of multiple isolated degradations. However, current IR methods for composite degradations require training data covering an exponential number of possible degradation combinations, which imposes a significant burden. To alleviate this issue, this paper proposes a new task setting, i.e., Universal Image Restoration (UIR). Specifically, UIR does not require training on all degradation combinations but only on a set of degradation bases, after which any degradation that these bases can compose is removed in a zero-shot manner. Inspired by Chain-of-Thought prompting, which guides large language models (LLMs) to address problems step by step, we propose the Chain-of-Restoration (CoR) mechanism, which instructs models to remove unknown composite degradations step by step. By integrating a simple Degradation Discriminator into pre-trained multi-task models, CoR enables the model to remove one degradation basis per step, continuing until the image is fully restored from the unknown composite degradation. Extensive experiments show that CoR significantly improves model performance in removing composite degradations, achieving results comparable to or better than state-of-the-art (SoTA) methods trained on all degradations.
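The step-by-step restoration loop reduces to a simple control flow: detect one degradation basis, remove it, repeat until the discriminator judges the image clean. A minimal sketch follows; `restorer` and `discriminator` are hypothetical callables standing in for the pre-trained multi-task model and the Degradation Discriminator:

```python
import torch

@torch.no_grad()
def chain_of_restoration(image, restorer, discriminator, max_steps=5):
    """Sketch of the CoR loop: remove one degradation basis per step
    until no degradation is detected (or a step budget is exhausted)."""
    for _ in range(max_steps):
        deg = discriminator(image)      # predicted degradation basis, or None
        if deg is None:                 # image considered fully restored
            break
        image = restorer(image, deg)    # remove exactly one basis this step
    return image
```

The step budget guards against oscillation when the discriminator keeps firing; the actual stopping criterion in the paper may differ.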
SpeGCL: Self-supervised Graph Spectrum Contrastive Learning without Positive Samples
Shou, Yuntao, Cao, Xiangyong, Meng, Deyu
Graph Contrastive Learning (GCL) excels at managing noise and fluctuations in input data, making it popular in various fields (e.g., social networks and knowledge graphs). Our study finds that the difference in high-frequency information between augmented graphs is greater than the difference in low-frequency information. However, most existing GCL methods focus mainly on the time domain (low-frequency information) for node feature representations and cannot make good use of high-frequency information to speed up model convergence. Furthermore, existing GCL paradigms optimize graph embeddings by pulling positive sample pairs closer and pushing positive-negative pairs farther apart, but our theoretical analysis shows that graph contrastive learning benefits from pushing negative pairs farther apart rather than pulling positive pairs closer. To address these problems, we propose a novel spectral GCL framework without positive samples, named SpeGCL. Specifically, to overcome the inability of existing GCL methods to utilize high-frequency information, SpeGCL uses a Fourier transform to extract the high- and low-frequency components of node features and constructs a contrastive learning mechanism in Fourier space to obtain better node feature representations. Furthermore, SpeGCL relies entirely on negative samples to refine the graph embedding. We also provide a theoretical justification for the efficacy of using only negative samples in SpeGCL. Extensive experiments on unsupervised learning, transfer learning, and semi-supervised learning validate the superiority of our SpeGCL framework over state-of-the-art GCL methods.
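A negative-only contrastive objective in Fourier space might look like the sketch below: node embeddings from two augmented views are mapped with an FFT, and only cross-view negatives are pushed apart, with no positive-pair attraction term. The exact transform, projection, and loss in SpeGCL may differ; this is purely illustrative:

```python
import torch
import torch.nn.functional as F

def negative_only_loss(z1, z2, tau=0.5):
    """Hypothetical negative-only loss: z1, z2 are (N, d) node embeddings
    from two augmented views; the diagonal (positive pairs) is discarded."""
    f1 = torch.fft.rfft(z1, dim=-1)      # frequency-domain features
    f2 = torch.fft.rfft(z2, dim=-1)
    h1 = F.normalize(torch.cat([f1.real, f1.imag], dim=-1), dim=-1)
    h2 = F.normalize(torch.cat([f2.real, f2.imag], dim=-1), dim=-1)
    sim = h1 @ h2.t() / tau              # (N, N) cross-view similarities
    n = sim.shape[0]
    mask = ~torch.eye(n, dtype=torch.bool)          # drop positive pairs
    negs = sim[mask].view(n, n - 1)                 # negatives only
    return torch.logsumexp(negs, dim=-1).mean()     # push negatives apart
```

Minimizing this objective spreads embeddings apart uniformly (a "uniformity"-style term), matching the claim that the benefit comes from repelling negatives rather than attracting positives.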
Component-based Sketching for Deep ReLU Nets
Wang, Di, Lin, Shao-Bo, Meng, Deyu, Cao, Feilong
Deep learning has had a profound impact on data mining and AI, distinguished by groundbreaking achievements in numerous real-world applications and an innovative algorithm design philosophy. However, it suffers from an inconsistency between optimization and generalization: achieving good generalization, guided by the bias-variance trade-off principle, favors under-parameterized networks, whereas ensuring effective convergence of gradient-based algorithms demands over-parameterized networks. To address this issue, we develop a novel sketching scheme based on deep net components for various tasks. Specifically, we use deep net components with specific efficacy to build a sketching basis that embodies the advantages of deep networks. We then transform deep net training into a linear empirical risk minimization problem over the constructed basis, avoiding the complicated convergence analysis of iterative algorithms. The efficacy of the proposed component-based sketching is validated through both theoretical analysis and numerical experiments. Theoretically, we show that the proposed component-based sketching provides almost optimal rates in approximating saturated functions with shallow nets and also achieves almost optimal generalization error bounds. Numerically, we demonstrate that, compared with existing gradient-based training methods, component-based sketching delivers superior generalization performance at reduced training cost.
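The overall recipe, fix a basis of network components and then fit only linear output weights by regularized least squares, can be sketched in a few lines. Here random ReLU features stand in for the paper's more principled component construction; the target function and hyperparameters are toy assumptions:

```python
import numpy as np

def relu_features(X, W, b):
    return np.maximum(X @ W + b, 0.0)   # (n, m) component responses

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2  # toy regression target

W = rng.normal(size=(2, 64))            # fixed (non-trained) components
b = rng.normal(size=64)
Phi = relu_features(X, W, b)            # the constructed "sketching basis"

lam = 1e-3                              # ridge regularization
theta = np.linalg.solve(Phi.T @ Phi + lam * np.eye(64), Phi.T @ y)
pred = Phi @ theta                      # linear ERM solution, no SGD needed
print(float(np.mean((pred - y) ** 2)))
```

Because only `theta` is fit, training reduces to one linear solve, which is what sidesteps the convergence analysis of iterative gradient-based algorithms.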
A Refreshed Similarity-based Upsampler for Direct High-Ratio Feature Upsampling
Zhou, Minghao, Wang, Hong, Zheng, Yefeng, Meng, Deyu
Feature upsampling is a fundamental and indispensable ingredient of almost all current network structures for image segmentation tasks. Recently, a popular similarity-based feature upsampling pipeline has been proposed, which utilizes a high-resolution (HR) feature as guidance to upsample the low-resolution (LR) deep feature based on their local similarity. Although it achieves promising performance, this pipeline has specific limitations: 1) the HR query and LR key features are not well aligned; 2) the query-key similarity is computed with a fixed inner-product form; 3) neighbor selection is coarsely operated on LR features, resulting in mosaic artifacts. These shortcomings make the existing methods along this pipeline primarily applicable to hierarchical network architectures with iterative features as guidance, and they do not readily extend to a broader range of structures, especially for direct high-ratio upsampling. To address these issues, we meticulously optimize every methodological design. Specifically, we first propose an explicitly controllable query-key feature alignment from both semantic-aware and detail-aware perspectives, and then construct a parameterized paired central difference convolution block for flexibly calculating the similarity between the well-aligned query-key features. Besides, we develop a fine-grained neighbor selection strategy on HR features, which is simple yet effective for alleviating mosaic artifacts. Based on these careful designs, we systematically construct a refreshed similarity-based feature upsampling framework named ReSFU. Extensive experiments substantiate that our proposed ReSFU applies well to various types of architectures in a direct high-ratio upsampling manner and consistently achieves satisfactory performance on different segmentation applications, showing superior generality and ease of deployment.
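For context, the baseline pipeline that ReSFU refines can be sketched as follows: each HR query attends over a k x k neighborhood of LR keys with the fixed inner-product similarity (limitation 2 above), and the softmax weights blend the LR values. This is a generic sketch of the pipeline, not ReSFU itself, and it assumes the HR guide has the same channel width as the LR feature:

```python
import torch
import torch.nn.functional as F

def similarity_upsample(lr_feat, hr_guide, ratio, k=3, tau=1.0):
    """Baseline similarity-based upsampling: lr_feat (B, C, h, w) is
    upsampled by `ratio`, guided by hr_guide (B, C, h*ratio, w*ratio)."""
    B, C, h, w = lr_feat.shape
    H, W = h * ratio, w * ratio
    # Gather the k*k LR key/value neighborhood of every LR position...
    neigh = F.unfold(lr_feat, k, padding=k // 2)         # (B, C*k*k, h*w)
    neigh = neigh.view(B, C * k * k, h, w)
    # ...and replicate it to HR resolution (coarse LR neighbor selection).
    neigh = F.interpolate(neigh, size=(H, W), mode="nearest")
    neigh = neigh.view(B, C, k * k, H, W)
    # Fixed inner-product similarity between HR queries and LR keys.
    attn = (hr_guide.unsqueeze(2) * neigh).sum(1) / tau  # (B, k*k, H, W)
    attn = attn.softmax(dim=1)
    return (attn.unsqueeze(1) * neigh).sum(2)            # (B, C, H, W)
```

The nearest-replicated neighborhoods are exactly what produce the mosaic artifacts noted in limitation 3, which motivates ReSFU's fine-grained neighbor selection on HR features.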
Rethinking the Graph Polynomial Filter via Positive and Negative Coupling Analysis
Wen, Haodong, Du, Bodong, Liu, Ruixun, Meng, Deyu, Cao, Xiangyong
Recently, the optimization of polynomial filters within Spectral Graph Neural Networks (GNNs) has emerged as a prominent research focus. Existing spectral GNNs mainly emphasize polynomial properties in filter design, which introduces computational overhead and neglects the integration of crucial graph structure information. We argue that incorporating graph information into basis construction can enhance the understanding of polynomial bases and further facilitate simplified polynomial filter design. Motivated by this, we first propose a Positive and Negative Coupling Analysis (PNCA) framework, in which the concepts of positive and negative activation are defined and their respective and mixed effects are analyzed. We then explore PNCA from the message-propagation perspective, revealing the subtle information hidden in the activation process. Subsequently, PNCA is used to analyze mainstream polynomial filters, and we design a novel simple basis that decouples positive and negative activation and fully utilizes graph structure information. Finally, we propose a simple GNN (called GSCNet) based on the new basis. Experimental results on benchmark node-classification datasets verify that GSCNet obtains better or comparable results compared with existing state-of-the-art GNNs while requiring relatively less computation time.
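For readers less familiar with polynomial filters, the object being analyzed here has a compact form: a filter is a learned polynomial in a propagation matrix, y = sum_k theta_k P^k x. The sketch below uses the plain monomial basis; mainstream bases (Chebyshev, Bernstein, etc.) differ only in how the terms are constructed, and the toy matrix is an assumption:

```python
import torch

def polynomial_filter(x, adj_norm, coeffs):
    """Monomial-basis spectral filter: y = sum_k coeffs[k] * P^k x,
    where P (adj_norm) is a normalized adjacency/propagation matrix."""
    out = coeffs[0] * x
    h = x
    for theta in coeffs[1:]:
        h = adj_norm @ h            # one more propagation step: P^k x
        out = out + theta * h
    return out

# Toy example: a K = 3 filter on a 3-node graph with 4-dim node features.
P = torch.tensor([[0., .5, .5], [.5, 0., .5], [.5, .5, 0.]])
x = torch.randn(3, 4)
print(polynomial_filter(x, P, coeffs=torch.tensor([1.0, 0.5, 0.25, 0.125])))
```

Each extra term costs one sparse propagation, which is why basis design directly determines the computational overhead the abstract refers to.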
Semi-supervised Counting via Pixel-by-pixel Density Distribution Modelling
Lin, Hui, Ma, Zhiheng, Ji, Rongrong, Wang, Yaowei, Su, Zhou, Hong, Xiaopeng, Meng, Deyu
This paper focuses on semi-supervised crowd counting, where only a small portion of the training data is labeled. Instead of regressing a single deterministic value, we formulate each pixel-wise density value as a probability distribution. On this basis, we propose a semi-supervised crowd-counting model. First, we design a pixel-wise distribution matching loss to measure the differences between the predicted and ground-truth pixel-wise density distributions; second, we enhance the transformer decoder with density tokens that specialize the decoder forwards with respect to different density intervals; third, we design an interleaving consistency self-supervised learning mechanism to learn from unlabeled data efficiently. Extensive experiments on four datasets show that our method outperforms the competitors by a large margin under various labeled-ratio settings. Code will be released at https://github.com/LoraLinH/Semi-supervised-Counting-via-Pixel-by-pixel-Density-Distribution-Modelling.
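One simple way to realize "a distribution per pixel" is to discretize density into bins and have the network predict a categorical distribution at every pixel. The sketch below supervises that prediction with a cross-entropy against the binned ground truth; the paper's actual distributional model and matching metric may well differ:

```python
import torch
import torch.nn.functional as F

def pixel_distribution_loss(pred_logits, gt_density, bin_edges):
    """Hypothetical pixel-wise distribution matching loss.
    pred_logits: (B, n_bins, H, W) with n_bins == len(bin_edges) + 1
    gt_density:  (B, H, W) ground-truth density values
    bin_edges:   sorted 1-D tensor of bin boundaries covering the range
    """
    target_bin = torch.bucketize(gt_density, bin_edges)   # (B, H, W) indices
    return F.cross_entropy(pred_logits, target_bin)
```

Compared with regressing a single value per pixel, the distributional target gives the model calibrated uncertainty at each pixel, which is what the consistency-based learning on unlabeled data can then exploit.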
Rotation Equivariant Proximal Operator for Deep Unfolding Methods in Image Restoration
Fu, Jiahong, Xie, Qi, Meng, Deyu, Xu, Zongben
The deep unfolding approach has attracted significant attention in computer vision, as it connects conventional image processing models with more recent deep learning techniques. Specifically, by establishing a direct correspondence between the algorithm operators at each implementation step and the network modules within each layer, one can rationally construct an almost ``white box'' network architecture with high interpretability. In this architecture, only the predefined component of the proximal operator, known as a proximal network, needs manual configuration, enabling the network to automatically extract intrinsic image priors in a data-driven manner. In current deep unfolding methods, such a proximal network is generally designed as a CNN architecture, whose necessity has been proven by a recent theory: the CNN structure substantially delivers the translation-invariant image prior, which is the structural prior most universally possessed across various types of images. However, standard CNN-based proximal networks have essential limitations in capturing the rotation symmetry prior, another universal structural prior underlying general images. This leaves large room for further performance improvement in deep unfolding approaches. To address this issue, this study proposes a high-accuracy rotation-equivariant proximal network that effectively embeds rotation symmetry priors into the deep unfolding framework. In particular, we derive, for the first time, the theoretical equivariance error for such a proximal network with arbitrary layers under arbitrary rotation degrees. To our knowledge, this analysis is the most refined theoretical result for such error evaluation to date, and it is indispensable for supporting the rationale behind such networks with intrinsic interpretability requirements.
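To illustrate the weight-sharing principle behind rotation equivariance (for the coarse C4 group of 90-degree rotations only; the paper's proximal network handles finer rotations with a rigorous equivariance-error analysis), here is a minimal sketch of a convolution whose outputs rotate consistently when the input rotates:

```python
import torch
import torch.nn.functional as F

class C4Conv(torch.nn.Module):
    """Hypothetical C4-equivariant (lifting) convolution: one kernel is
    applied in 4 rotated copies, so a 90-degree rotation of the input
    corresponds to a rotation-plus-shift of the stacked outputs."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.1)

    def forward(self, x):
        # Convolve with the kernel rotated by 0, 90, 180, 270 degrees.
        outs = [F.conv2d(x, torch.rot90(self.weight, r, dims=(2, 3)),
                         padding=self.weight.shape[-1] // 2)
                for r in range(4)]
        return torch.stack(outs, dim=2)   # (B, out_ch, 4, H, W)
```

In a deep unfolding layer, a proximal network built from such equivariant blocks would replace the plain CNN in the update x_{k+1} = proxnet(x_k - eta * grad), embedding the rotation symmetry prior directly into the architecture.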