He, Yu
Towards Label-Only Membership Inference Attack against Pre-trained Large Language Models
He, Yu, Li, Boheng, Liu, Liu, Ba, Zhongjie, Dong, Wei, Li, Yiming, Qin, Zhan, Ren, Kui, Chen, Chun
Membership Inference Attacks (MIAs) aim to predict whether a data sample belongs to a model's training set. Although prior research has extensively explored MIAs against Large Language Models (LLMs), these attacks typically require access to the complete output logits (i.e., logits-based attacks), which are usually unavailable in practice. In this paper, we study the vulnerability of pre-trained LLMs to MIAs in the label-only setting, where the adversary can only access the generated tokens (text). We first show that existing label-only MIAs are largely ineffective against pre-trained LLMs, even though they are highly effective at inferring the fine-tuning datasets used for personalized LLMs. Their failure stems from two main causes: better generalization and overly coarse perturbation. Because pre-training corpora are vast and each sample is seen only a few times, LLMs exhibit minimal robustness differences between members and non-members, making token-level perturbations too coarse to capture such differences. To alleviate these problems, we propose PETAL: a label-only membership inference attack based on PEr-Token semAntic simiLarity. PETAL leverages token-level semantic similarity to approximate output probabilities and subsequently calculates the perplexity. It exposes membership based on the common assumption that members are 'better' memorized and therefore have lower perplexity. We conduct extensive experiments on the WikiMIA benchmark and the more challenging MIMIR benchmark. Empirically, PETAL outperforms extensions of existing label-only attacks against personalized LLMs and performs on par with other advanced logits-based attacks across all metrics on five prevalent open-source LLMs.
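As a concrete illustration of the scoring step, the following minimal sketch computes a PETAL-style pseudo-perplexity; the `similarity` callable (e.g., a cosine similarity between token embeddings rescaled into (0, 1]) and all names are illustrative assumptions, not the authors' implementation.

```python
import math

def petal_score(target_tokens, generated_tokens, similarity):
    """Pseudo-perplexity from per-token semantic similarity (sketch).

    `similarity(a, b)` is assumed to map a token pair into (0, 1];
    it stands in for the output probability that a label-only
    attacker cannot observe.
    """
    log_probs = []
    for target, generated in zip(target_tokens, generated_tokens):
        # Use the similarity between the token the model actually
        # generated and the ground-truth next token as a proxy for
        # p(target | prefix).
        p = max(similarity(target, generated), 1e-8)
        log_probs.append(math.log(p))
    # Members are assumed to be better memorized, hence lower score.
    return math.exp(-sum(log_probs) / len(log_probs))
```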
ResoFilter: Fine-grained Synthetic Data Filtering for Large Language Models through Data-Parameter Resonance Analysis
Tu, Zeao, Meng, Xiangdi, He, Yu, Yao, Zihan, Qi, Tianyu, Liu, Jun, Li, Ming
Large language models (LLMs) have shown remarkable effectiveness across various domains, and data augmentation methods that use GPT to generate synthetic data have become prevalent. However, the quality and utility of the augmented data remain questionable, and current methods lack clear metrics for evaluating data characteristics. To address these challenges, we propose ResoFilter, a novel method that integrates models, data, and tasks to refine datasets. ResoFilter leverages the fine-tuning process to obtain Data-Parameter features for data selection, offering improved interpretability by representing data characteristics through model weights. Our experiments demonstrate that ResoFilter matches full-dataset fine-tuning on mathematical tasks while using only half the data, and that it generalizes well across different models and domains. The method provides valuable insights for constructing synthetic datasets and evaluating high-quality data, offering a promising way to enhance data augmentation techniques and improve the quality of training datasets for LLMs. For reproducibility, we will release our code and data upon acceptance.
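A minimal sketch of the general idea, assuming a one-step "data-parameter" feature given by the parameter shift that a single fine-tuning step on each sample would induce; the actual ResoFilter feature construction may differ.

```python
import torch

def resonance_scores(model, loss_fn, samples, lr=1e-4):
    """Score each sample by the parameter shift one fine-tuning step
    on it would induce (a sketch of a 'Data-Parameter' feature)."""
    scores = []
    for x, y in samples:
        model.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        # One-step SGD weight delta is -lr * grad; its norm measures
        # how strongly this sample 'resonates' with the model.
        delta = sum((lr * p.grad).norm() ** 2
                    for p in model.parameters() if p.grad is not None)
        scores.append(delta.sqrt().item())
    return scores  # e.g., keep only the top-scoring half of the data
```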
FedCAda: Adaptive Client-Side Optimization for Accelerated and Stable Federated Learning
Zhou, Liuzhi, He, Yu, Zhai, Kun, Liu, Xiang, Liu, Sen, Ma, Xingjun, Ye, Guangnan, Jiang, Yu-Gang, Chai, Hongfeng
Federated learning (FL) has emerged as a prominent approach for collaboratively training machine learning models across distributed clients while preserving data privacy. However, balancing acceleration and stability is a significant challenge in FL, especially on the client side. In this paper, we introduce FedCAda, an innovative federated client-adaptive algorithm designed to tackle this challenge. FedCAda builds on Adam, adjusting the correction of the first moment estimate $m$ and the second moment estimate $v$ on the client side and aggregating the adaptive algorithm's parameters on the server side, with the aim of accelerating convergence and improving communication efficiency while ensuring stability and performance. We additionally investigate several variants that incorporate different adjustment functions. This comparative analysis reveals that, because client models contain limited information from other clients during the initial stages of federated learning, stronger constraints must be imposed on the adaptive algorithm's parameters; as training progresses and clients gather more global information, FedCAda gradually relaxes its influence on the adaptive parameters. These findings offer insights for improving the robustness and efficiency of adaptive algorithms. Through extensive experiments on computer vision (CV) and natural language processing (NLP) datasets, we demonstrate that FedCAda outperforms state-of-the-art methods in adaptability, convergence, stability, and overall performance. This work contributes to adaptive algorithms for federated learning and encourages further exploration.
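The following sketch illustrates the flavor of a client-side Adam update with a round-dependent constraint; the specific `damping` schedule and blending are assumptions for illustration, not the paper's exact adjustment functions.

```python
def fedcada_client_step(params, grads, m, v, t, round_idx,
                        lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One client-side Adam-style update with a round-dependent
    correction (a sketch; FedCAda's adjustment functions differ)."""
    # Early rounds carry little global information, so the adaptive
    # statistics are constrained strongly; the constraint decays as
    # federated training progresses.
    damping = 1.0 / (1.0 + round_idx)
    for i, (p, g) in enumerate(zip(params, grads)):
        m[i] = b1 * m[i] + (1 - b1) * g
        v[i] = b2 * v[i] + (1 - b2) * g * g
        # Bias-corrected first moment, blended with the raw gradient
        # according to the decaying constraint.
        m_hat = m[i] / (1 - b1 ** t) * (1 - damping) + g * damping
        v_hat = v[i] / (1 - b2 ** t)
        p -= lr * m_hat / (v_hat.sqrt() + eps)
    return params, m, v
```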
Scalable Model Editing via Customized Expert Networks
Yao, Zihan, He, Yu, Qi, Tianyu, Li, Ming
Addressing hallucinations and outdated knowledge in large language models is critical for their reliable application. Model editing offers a promising, cost-effective avenue for mitigating these challenges. However, existing methods often suffer from unsatisfactory generalization and unintended effects on unrelated samples. To overcome these limitations, we introduce Scalable Model Editing via Customized Expert Networks (SCEN), a two-stage continuous training paradigm. In the first stage, we train a lightweight expert network individually for each piece of knowledge that needs to be updated. In the second stage, we train a corresponding neuron for each expert to control that expert's activation state. Our experiments on two sizes of open-source large language models, Llama2 7B and 13B, achieve state-of-the-art results compared to existing mainstream model editing methods. Our code is available at https://github.com/TAL-auroraX/SCEN.
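A minimal sketch of the two-stage idea, assuming one lightweight linear expert per edited fact and one sigmoid-gated "indexing neuron" per expert; shapes and names are illustrative, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class SCENLayer(nn.Module):
    """Sketch of an edited layer: one expert per updated fact, each
    gated by its own scalar neuron (illustrative, not SCEN's exact
    design)."""
    def __init__(self, base_layer, hidden_dim, num_experts):
        super().__init__()
        self.base = base_layer
        self.experts = nn.ModuleList(
            nn.Linear(hidden_dim, hidden_dim) for _ in range(num_experts))
        # One gate per expert decides whether the input concerns its fact.
        self.gates = nn.ModuleList(
            nn.Linear(hidden_dim, 1) for _ in range(num_experts))

    def forward(self, h):
        out = self.base(h)
        for expert, gate in zip(self.experts, self.gates):
            # Gate is ~1 for inputs about the edited fact and ~0
            # otherwise, leaving unrelated samples untouched.
            out = out + torch.sigmoid(gate(h)) * expert(h)
        return out
```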
Facility Location Games with Scaling Effects
He, Yu, Lam, Alexander, Li, Minming
We take the classic facility location problem and consider a variation in which each agent's individual cost equals their distance from the facility multiplied by a scaling factor determined by the facility's placement. In addition to the general class of continuous scaling functions, we also provide results for piecewise linear scaling functions, which can effectively approximate or model the scaling behavior of many real-world scenarios. We focus on the total and maximum cost objectives, describing how to compute the optimal solution. We then move to the approximate mechanism design setting, observing that agents' preferences may no longer be single-peaked. Consequently, we characterize the conditions on scaling functions under which agents have single-peaked preferences. Under these conditions, we establish the total and maximum cost approximation ratios achievable by strategyproof and anonymous mechanisms.
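For intuition, here is a brute-force sketch of the optimal-solution computation under the stated cost model, cost_i(y) = |x_i - y| * s(y); the paper derives the optimum analytically rather than by grid search, and the grid resolution here is arbitrary.

```python
import numpy as np

def optimal_facility(agents, scaling, objective="total", grid=10001):
    """Brute-force the optimal facility location on [0, 1] when each
    agent's cost is |x_i - y| * s(y) for a location-dependent scaling
    function s (a sketch)."""
    best_y, best_cost = None, float("inf")
    for y in np.linspace(0.0, 1.0, grid):
        costs = [abs(x - y) * scaling(y) for x in agents]
        cost = sum(costs) if objective == "total" else max(costs)
        if cost < best_cost:
            best_y, best_cost = y, cost
    return best_y, best_cost

# Example: scaling grows linearly away from a town center at 0.5.
print(optimal_facility([0.1, 0.4, 0.9], lambda y: 1 + abs(y - 0.5)))
```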
VFLAIR: A Research Library and Benchmark for Vertical Federated Learning
Zou, Tianyuan, Gu, Zixuan, He, Yu, Takahashi, Hideaki, Liu, Yang, Ye, Guangnan, Zhang, Ya-Qin
Vertical Federated Learning (VFL) has emerged as a collaborative training paradigm that allows participants holding different features of the same group of users to train cooperatively without exposing their raw data or model parameters. VFL has attracted significant attention for its research potential and real-world applications in recent years, but it still faces substantial challenges, such as defending against various kinds of data inference and backdoor attacks. Moreover, most existing VFL projects are industry-facing and are not well suited to tracking current research progress. To address this need, we present VFLAIR, an extensible and lightweight VFL framework (available at https://github.com/FLAIR-THU/VFLAIR) that supports VFL training with a variety of models, datasets, and protocols, along with standardized modules for comprehensive evaluation of attacks and defense strategies. We also benchmark the performance of 11 attacks and 8 defenses under different communication and model partition settings, and we draw concrete insights and recommendations on the choice of defense strategies for different practical VFL deployment scenarios.
MER 2023: Multi-label Learning, Modality Robustness, and Semi-Supervised Learning
Lian, Zheng, Sun, Haiyang, Sun, Licai, Chen, Kang, Xu, Mingyu, Wang, Kexin, Xu, Ke, He, Yu, Li, Ying, Zhao, Jinming, Liu, Ye, Liu, Bin, Yi, Jiangyan, Wang, Meng, Cambria, Erik, Zhao, Guoying, Schuller, Björn W., Tao, Jianhua
The first Multimodal Emotion Recognition Challenge (MER 2023) was successfully held at ACM Multimedia. The challenge focuses on system robustness and consists of three distinct tracks: (1) MER-MULTI, in which participants are required to recognize both discrete and dimensional emotions; (2) MER-NOISE, in which noise is added to the test videos to evaluate modality robustness; and (3) MER-SEMI, which provides a large number of unlabeled samples for semi-supervised learning. In this paper, we introduce the motivation behind the challenge, describe the benchmark dataset, and provide statistics about the participants. To continue using this dataset after MER 2023, please sign a new End User License Agreement and send it to our official email address, merchallenge.contact@gmail.com. We believe this high-quality dataset can become a new benchmark in multimodal emotion recognition, especially for the Chinese research community.
Unsupervised Seismic Footprint Removal With Physical Prior Augmented Deep Autoencoder
Qian, Feng, Yue, Yuehua, He, Yu, Yu, Hongtao, Zhou, Yingjie, Tang, Jinliang, Hu, Guangmin
Seismic acquisition footprints appear as faint, dim, and fully spatially coherent structures, so suppressing them inevitably damages useful signals. Various footprint removal methods, including filtering and sparse representation (SR), have been reported to attain promising results on this challenge. However, these methods, e.g., SR, rely solely on handcrafted image priors of the useful signals, which can be an unreasonable demand when the given seismic data contain complex geological structures. As an alternative, this article proposes a footprint removal network (dubbed FR-Net) for unsupervised suppression of acquisition footprints without any assumptions about the valuable signals. The key to FR-Net is a unidirectional total variation (UTV) model of the acquisition footprint that exploits the intrinsically directional property of the noise. By strongly regularizing a deep convolutional autoencoder (DCAE) with the UTV model, FR-Net transforms the DCAE from an entirely data-driven model into a prior-augmented approach, inheriting the strengths of both the DCAE and our footprint model. The complete separation of footprint noise from useful signals is then achieved in an unsupervised manner by optimizing FR-Net via the backpropagation (BP) algorithm. Qualitative and quantitative evaluations on three synthetic and field datasets demonstrate that FR-Net surpasses previous state-of-the-art (SOTA) methods.
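A sketch of what a UTV-style regularizer might look like in PyTorch, assuming the footprint is spatially coherent along one image axis; the axis choice and loss weighting are illustrative assumptions, not the paper's formulation.

```python
import torch

def utv_loss(footprint, coherent_axis=0):
    """Unidirectional total variation (sketch): penalize variation
    only along the axis where the footprint is spatially coherent,
    forcing the extracted noise to be directionally smooth."""
    if coherent_axis == 0:
        # Coherent along rows: values should change little row-to-row.
        diff = footprint[1:, :] - footprint[:-1, :]
    else:
        diff = footprint[:, 1:] - footprint[:, :-1]
    return diff.abs().mean()

# Training objective (sketch): reconstruct the data with a DCAE while
# regularizing the residual to look like a directional footprint:
#   loss = mse(dcae(x), x) + lam * utv_loss(x - dcae(x))
```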
Continuous Neural Algorithmic Planners
He, Yu, Veličković, Petar, Liò, Pietro, Deac, Andreea
Neural algorithmic reasoning studies the problem of learning algorithms with neural networks, especially with graph architectures. A recent proposal, XLVIN, reaps the benefits of using a graph neural network that simulates the value iteration algorithm in deep reinforcement learning agents. It allows model-free planning without access to privileged information about the environment, which is usually unavailable. However, XLVIN only supports discrete action spaces, and hence cannot be directly applied to most tasks of real-world interest. We expand XLVIN to continuous action spaces by discretization, and we evaluate several selective expansion policies to cope with the resulting large planning graphs. Our proposal, CNAP, demonstrates that neural algorithmic reasoning can make a measurable impact in higher-dimensional continuous control settings such as MuJoCo, bringing gains in low-data regimes and outperforming model-free baselines.
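A small sketch of the discretization step described above: turning a continuous action box into a finite grid of actions that a value-iteration-style planner can expand; the bin count and any pruning policy are illustrative.

```python
import numpy as np

def discretize_actions(low, high, bins_per_dim):
    """Turn a continuous action box into a finite action set so a
    value-iteration-style planner (as in XLVIN) can expand it."""
    axes = [np.linspace(lo, hi, bins_per_dim) for lo, hi in zip(low, high)]
    grid = np.meshgrid(*axes, indexing="ij")
    return np.stack([g.ravel() for g in grid], axis=-1)

# A 2-D action space with 5 bins per dimension yields 25 discrete
# actions; selective expansion policies then prune the planning graph.
actions = discretize_actions([-1.0, -1.0], [1.0, 1.0], 5)
print(actions.shape)  # (25, 2)
```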
HeteSpaceyWalk: A Heterogeneous Spacey Random Walk for Heterogeneous Information Network Embedding
He, Yu, Song, Yangqiu, Li, Jianxin, Ji, Cheng, Peng, Jian, Peng, Hao
Heterogeneous information network (HIN) embedding has gained increasing interest recently. However, existing random-walk-based HIN embedding methods have paid little attention to the higher-order Markov chain nature of meta-path-guided random walks, especially to the stationarity issue. In this paper, we systematically formalize the meta-path-guided random walk as a higher-order Markov chain process and present a heterogeneous personalized spacey random walk that efficiently and effectively attains the expected stationary distribution over nodes. We then propose a generalized, scalable framework that leverages this spacey random walk to learn embeddings for multiple types of nodes in an HIN guided by a meta-path, a meta-graph, or a meta-schema. Extensive experiments on several heterogeneous networks demonstrate that our methods substantially outperform existing state-of-the-art network embedding algorithms.
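For intuition, a toy sketch of a spacey random walk on a typed graph, assuming a hypothetical `graph` interface with `node_type` and `neighbors` methods; the paper's heterogeneous personalized formulation differs in detail.

```python
import random
from collections import Counter

def spacey_walk(graph, start, type_sequence, length):
    """Sketch of a spacey random walk: instead of tracking the full
    higher-order meta-path state, the walker sometimes 'spaces out'
    and draws the next node type from its empirical type occupancy."""
    occupancy = Counter()
    walk, node = [start], start
    for step in range(length):
        occupancy[graph.node_type(node)] += 1
        if random.random() < 0.5:
            # Space out: forget the exact state; sample a type from
            # the occupancy observed so far.
            types, weights = zip(*occupancy.items())
            wanted = random.choices(types, weights=weights)[0]
        else:
            # Follow the meta-path's prescribed next type.
            wanted = type_sequence[(step + 1) % len(type_sequence)]
        neighbors = [v for v in graph.neighbors(node)
                     if graph.node_type(v) == wanted]
        if not neighbors:
            continue  # no neighbor of that type; stay and retry
        node = random.choice(neighbors)
        walk.append(node)
    return walk
```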