Zheng, Zaiyi
Beyond the Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion
Zhang, Binchi, Zheng, Zaiyi, Chen, Zhengzhang, Li, Jundong
Symmetry in the parameter space of deep neural networks (DNNs) has proven beneficial for various deep learning applications. A well-known example is the permutation symmetry in Multi-Layer Perceptrons (MLPs), where permuting the rows of weight matrices in one layer and applying the inverse permutation to adjacent layers yields a functionally equivalent model. While permutation symmetry fully characterizes the equivalence set for MLPs, its discrete nature limits its utility for transformers. In this paper, we introduce rotation symmetry, a novel form of parameter space symmetry for transformers that generalizes permutation symmetry.
For instance, in a two-layer MLP, permuting the rows of the weight matrix in the first layer and applying the corresponding inverse permutation to the second layer results in a functionally equivalent model, i.e., the outputs of the original and permuted models remain identical for any given input (Ainsworth et al., 2023). All functionally equivalent models corresponding to weight permutations form an equivalence set, which provides theoretical insights into neural network optimization, such as the linear mode connectivity of loss landscapes (Entezari et al., 2022; Zhou et al., 2023). In addition, permutation symmetry has also proven helpful in advancing neural network applications, such as model fusion (Singh & Jaggi, 2020; Ainsworth et al., 2023) and optimization (Zhao et al., 2024).
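The permutation symmetry described above can be checked numerically. The sketch below (toy weights, not the paper's code) verifies the two-layer MLP case, and also hints at why rotations require a linear pairing, as between the value and output projections of an attention layer, where no elementwise nonlinearity sits in between:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer MLP: f(x) = W2 @ relu(W1 @ x)
W1 = rng.standard_normal((8, 4))
W2 = rng.standard_normal((3, 8))

def mlp(W1, W2, x):
    return W2 @ np.maximum(W1 @ x, 0.0)

# Permute the hidden units: rows of W1, matching columns of W2.
P = np.eye(8)[rng.permutation(8)]   # permutation matrix
W1_perm = P @ W1                    # permute layer-1 rows
W2_perm = W2 @ P.T                  # apply the inverse permutation to layer 2

x = rng.standard_normal(4)
assert np.allclose(mlp(W1, W2, x), mlp(W1_perm, W2_perm, x))

# A full rotation works when the two matrices compose linearly:
# for any orthogonal R, (W2 @ R.T) @ (R @ W1) = W2 @ W1.
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))  # random orthogonal matrix
assert np.allclose((W2 @ Q.T) @ (Q @ W1 @ x), W2 @ (W1 @ x))
```

The first assertion holds because a permutation commutes with the elementwise ReLU; an arbitrary rotation does not, which is why rotation symmetry targets linearly paired transformer weights.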
Resolving Editing-Unlearning Conflicts: A Knowledge Codebook Framework for Large Language Model Updating
Zhang, Binchi, Chen, Zhengzhang, Zheng, Zaiyi, Li, Jundong, Chen, Haifeng
Large Language Models (LLMs) excel in natural language processing by encoding extensive human knowledge, but their utility relies on timely updates as knowledge evolves. Updating LLMs involves two key tasks simultaneously: unlearning to remove unwanted knowledge and editing to incorporate new information. Existing methods face two major challenges: ineffective knowledge storage (either too sparse or too dense) and task conflicts between editing and unlearning, as validated through our theoretical and experimental results. To address these issues, we propose LOKA, a conflict-free framework for LLM updating based on a knowledge codebook. During training, updated knowledge is stored in multiple codebook memories. To optimize knowledge storage, a similarity-aware knowledge mapping ensures that related knowledge pieces are clustered and allocated to the same memory. Additionally, LOKA resolves task conflicts by employing task-specific and multi-task memories guided by a conflict score. In the inference stage, LOKA retrieves the most relevant memory from the codebook and plugs it into the original LLM to apply the updated knowledge. A learning-based router controls codebook activation to further improve knowledge utilization. Extensive experiments demonstrate the effectiveness of LOKA in LLM knowledge updating tasks.
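As a rough illustration of the retrieve-and-plug idea only, the toy sketch below performs similarity-based memory lookup with a rejection threshold standing in for the router; all names, shapes, and the scoring rule are our own assumptions, not LOKA's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical codebook: each memory pairs a key embedding with an
# update payload; related knowledge pieces would share nearby keys.
codebook = [
    {"key": rng.standard_normal(16), "patch": f"memory_{i}"}
    for i in range(4)
]

def retrieve(query_emb, codebook, threshold=0.0):
    """Return the memory most cosine-similar to the query, or None
    if no memory clears the threshold (router declines to activate)."""
    sims = [
        float(query_emb @ m["key"])
        / (np.linalg.norm(query_emb) * np.linalg.norm(m["key"]))
        for m in codebook
    ]
    best = int(np.argmax(sims))
    return codebook[best] if sims[best] > threshold else None

# A query matching a stored key retrieves that memory; an unreachable
# threshold makes the router fall back to the original LLM (None).
hit = retrieve(codebook[2]["key"], codebook)
miss = retrieve(codebook[2]["key"], codebook, threshold=1.5)
```

The threshold models the learned router's decision of whether to activate the codebook at all; the real framework trains this decision rather than hard-coding it.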
KG-CF: Knowledge Graph Completion with Context Filtering under the Guidance of Large Language Models
Zheng, Zaiyi, Dong, Yushun, Wang, Song, Liu, Haochen, Wang, Qi, Li, Jundong
Large Language Models (LLMs) have shown impressive performance in various tasks, including knowledge graph completion (KGC). However, current studies mostly apply LLMs to classification tasks, like identifying missing triplets, rather than ranking-based tasks, where the model ranks candidate entities based on plausibility. This focus limits the practical use of LLMs in KGC, as real-world applications prioritize highly plausible triplets. Additionally, while graph paths can help infer the existence of missing triplets and improve completion accuracy, they often contain redundant information. To address these issues, we propose KG-CF, a framework tailored for ranking-based KGC tasks. KG-CF leverages LLMs' reasoning abilities to filter out irrelevant contexts, achieving superior results on real-world datasets. The code and datasets are available at https://anonymous.4open.science/r/KG-CF.
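For context, ranking-based KGC scores every candidate entity for a query like (head, relation, ?) and sorts by plausibility. The minimal sketch below uses a stand-in scorer; KG-CF's contribution is filtering the context paths fed to the scorer, not this loop itself:

```python
# Toy ranking-based KG completion for the query (Paris, capital_of, ?).
triples = {("Paris", "capital_of", "France"),
           ("Berlin", "capital_of", "Germany")}
candidates = ["Spain", "Germany", "France"]

def plausibility(h, r, t):
    # Placeholder scorer: 1.0 for known triples, 0.0 otherwise. A real
    # system would score with an embedding model or an LLM conditioned
    # on (filtered) graph paths.
    return 1.0 if (h, r, t) in triples else 0.0

ranked = sorted(candidates,
                key=lambda t: plausibility("Paris", "capital_of", t),
                reverse=True)
print(ranked[0])  # "France" ranks first
```

Evaluation metrics for this setting (Hits@k, mean reciprocal rank) are computed from the position of the true entity in `ranked`, which is why ranking rather than binary classification matters in practice.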
Explaining Graph Neural Networks with Large Language Models: A Counterfactual Perspective for Molecular Property Prediction
He, Yinhan, Zheng, Zaiyi, Soga, Patrick, Zhu, Yaozhen, Dong, Yushun, Li, Jundong
In recent years, Graph Neural Networks (GNNs) have become successful in molecular property prediction tasks such as toxicity analysis. However, due to the black-box nature of GNNs, their outputs can be concerning in high-stakes decision-making scenarios, e.g., drug discovery. Facing such an issue, Graph Counterfactual Explanation (GCE) has emerged as a promising approach to improve GNN transparency. However, current GCE methods usually fail to take domain-specific knowledge into consideration, which can result in outputs that are not easily comprehensible by humans. To address this challenge, we propose a novel GCE method, LLM-GCE, to unleash the power of large language models (LLMs) in explaining GNNs for molecular property prediction. Specifically, we utilize an autoencoder to generate the counterfactual graph topology from a set of counterfactual text pairs (CTPs) based on an input graph. Meanwhile, we also incorporate a CTP dynamic feedback module to mitigate LLM hallucination, which provides intermediate feedback derived from the generated counterfactuals as an attempt to give more faithful guidance. Extensive experiments demonstrate the superior performance of LLM-GCE. Our code is released on https://github.com/YinhanHe123/new_LLM4GNNExplanation.
A Benchmark for Fairness-Aware Graph Learning
Dong, Yushun, Wang, Song, Lei, Zhenyu, Zheng, Zaiyi, Ma, Jing, Chen, Chen, Li, Jundong
Fairness-aware graph learning has gained increasing attention in recent years. Nevertheless, a comprehensive benchmark to evaluate and compare different fairness-aware graph learning methods is still lacking, which prevents practitioners from choosing appropriate ones for broader real-world applications. In this paper, we present an extensive benchmark on ten representative fairness-aware graph learning methods. Specifically, we design a systematic evaluation protocol and conduct experiments on seven real-world datasets to evaluate these methods from multiple perspectives, including group fairness, individual fairness, the balance between different fairness criteria, and computational efficiency. Our in-depth analysis reveals key insights into the strengths and limitations of existing methods. Additionally, we provide practical guidance for applying fairness-aware graph learning methods in real-world scenarios. To the best of our knowledge, this work serves as an initial step toward a comprehensive understanding of representative fairness-aware graph learning methods, facilitating future advancements in this area.
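Group fairness in such benchmarks is typically quantified with metrics like the statistical parity difference and the equal opportunity difference. The sketch below implements their standard definitions on toy labels; it is a generic illustration, not code from the benchmark:

```python
import numpy as np

def statistical_parity_diff(y_pred, sensitive):
    """|P(y_hat=1 | s=0) - P(y_hat=1 | s=1)| -- lower is fairer."""
    y_pred, sensitive = np.asarray(y_pred), np.asarray(sensitive)
    return abs(y_pred[sensitive == 0].mean() - y_pred[sensitive == 1].mean())

def equal_opportunity_diff(y_pred, y_true, sensitive):
    """Same gap restricted to truly positive examples (a TPR gap)."""
    y_pred, y_true, sensitive = map(np.asarray, (y_pred, y_true, sensitive))
    pos = y_true == 1
    return abs(y_pred[pos & (sensitive == 0)].mean()
               - y_pred[pos & (sensitive == 1)].mean())

y_pred    = [1, 0, 1, 1, 0, 1]
y_true    = [1, 0, 1, 0, 1, 1]
sensitive = [0, 0, 0, 1, 1, 1]
print(statistical_parity_diff(y_pred, sensitive))          # 0.0
print(equal_opportunity_diff(y_pred, y_true, sensitive))   # 0.5
```

The example shows why multiple perspectives matter: these predictions are perfectly fair under statistical parity (both groups receive positives at the same rate) yet unfair under equal opportunity (true-positive rates of 1.0 vs. 0.5).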
Knowledge Editing for Large Language Models: A Survey
Wang, Song, Zhu, Yaochen, Liu, Haochen, Zheng, Zaiyi, Chen, Chen, Li, Jundong
Large language models (LLMs) have recently transformed both the academic and industrial landscapes due to their remarkable capacity to understand, analyze, and generate texts based on their vast knowledge and reasoning ability. Nevertheless, one major drawback of LLMs is the substantial computational cost of pre-training, owing to their unprecedented number of parameters. This disadvantage is exacerbated when new knowledge frequently needs to be introduced into the pre-trained model. Therefore, it is imperative to develop effective and efficient techniques to update pre-trained LLMs. Traditional methods encode new knowledge in pre-trained LLMs through direct fine-tuning. However, naively re-training LLMs can be computationally intensive and risks degrading valuable pre-trained knowledge in the model that is irrelevant to the update. Recently, Knowledge-based Model Editing (KME) has attracted increasing attention; it aims to precisely modify LLMs to incorporate specific knowledge without negatively influencing other irrelevant knowledge. In this survey, we aim to provide a comprehensive and in-depth overview of recent advances in the field of KME. We first introduce a general formulation of KME to encompass different KME strategies. Afterward, we provide an innovative taxonomy of KME techniques based on how the new knowledge is introduced into pre-trained LLMs, and investigate existing KME strategies while analyzing key insights, advantages, and limitations of methods from each category. Moreover, representative metrics, datasets, and applications of KME are introduced accordingly. Finally, we provide an in-depth analysis regarding the practicality and remaining challenges of KME and suggest promising research directions for further advancement in this field.