AITopics | Zhang, Chengqi

Collaborating Authors

Zhang, Chengqi

Biologically Plausible Brain Graph Transformer

Peng, Ciyuan, Huang, Yuelong, Dong, Qichao, Yu, Shuo, Xia, Feng, Zhang, Chengqi, Jin, Yaochu

arXiv.org Artificial Intelligence, Feb-12-2025

State-of-the-art brain graph analysis methods fail to fully encode the small-world architecture of brain graphs (accompanied by the presence of hubs and functional modules), and therefore lack biological plausibility to some extent. This limitation hinders their ability to accurately represent the brain's structural and functional properties, thereby restricting the effectiveness of machine learning models in tasks such as brain disorder detection. In this work, we propose a novel Biologically Plausible Brain Graph Transformer (BioBGT) that encodes the small-world architecture inherent in brain graphs. Specifically, we present a network entanglement-based node importance encoding technique that captures the structural importance of nodes in global information propagation during brain graph communication, highlighting the biological properties of the brain structure. Furthermore, we introduce a functional module-aware self-attention to preserve the functional segregation and integration characteristics of brain graphs in the learned representations. Our code is available at https://github.com/pcyyyy/BioBGT.

Figure 1: Small-world architecture of brain graphs. (a) Hubs play essential roles. (b) Functional modules in the brain.

One of the most important characteristics of brain graphs is their small-world architecture, with scientific evidence supporting the presence of hubs and functional modules in brain graphs (Liao et al., 2017; Swanson et al., 2024). First, it is demonstrated that nodes in brain graphs exhibit a high degree of difference in their importance, with certain nodes having more central roles in information propagation (Lynn & Bassett, 2019; Betzel et al., 2024). These nodes are perceived as hubs, as shown in Figure 1 (a) (the visualization is based on findings by Seguin et al. (2023)), which are usually highly connected so as to support efficient communication within the brain. Second, the human brain consists of various functional modules (e.g., visual cortex), where ROIs within the same module exhibit high functional coherence, termed functional integration, while ROIs from different modules show lower functional coherence, termed functional segregation (Rubinov & Sporns, 2010; Seguin et al., 2022). Therefore, brain graphs are characterized by community structure, reflecting functional modules. ROIs in the same module have strong connections (high temporal correlations), while those from different modules show weaker connections. With the significant ability of graph transformers in capturing interactions between nodes (Ma et al., 2023a; Shehzad et al., 2024; Yi et al., 2024), Transformer-based brain graph learning methods have gained prominence (Kan et al., 2022; Bannadabhavi et al., 2023).

artificial intelligence, graph, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2502.08958

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Federated Foundation Models on Heterogeneous Time Series

Chen, Shengchao, Long, Guodong, Jiang, Jing, Zhang, Chengqi

arXiv.org Artificial Intelligence, Dec-11-2024

Training general-purpose time series foundation models with robust generalization capabilities across diverse applications from scratch is still an open challenge. Efforts are primarily focused on fusing cross-domain time series datasets to extract shared subsequences as tokens for training models on the Transformer architecture. However, due to significant statistical heterogeneity across domains, this cross-domain fusing approach does not work as effectively as it does for text and images. To tackle this challenge, this paper proposes a novel federated learning approach, namely FFTS, to address the heterogeneity in training time series foundation models. Specifically, each data-holding organization is treated as an independent client in a collaborative learning framework with federated settings, and many client-specific local models are trained to preserve the unique characteristics of each dataset. Moreover, a new regularization mechanism is applied on both the client side and the server side to align the shared knowledge across heterogeneous datasets from different domains. Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed federated learning approach. The newly learned time series foundation models achieve superior generalization capabilities on cross-domain time series analysis tasks, including forecasting, imputation, and anomaly detection.

data mining, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2412.08906

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Mind the Gap Between Prototypes and Images in Cross-domain Finetuning

Tian, Hongduan, Liu, Feng, Zhou, Zhanke, Liu, Tongliang, Zhang, Chengqi, Han, Bo

arXiv.org Artificial Intelligence, Oct-20-2024

In cross-domain few-shot classification (CFC), recent works mainly focus on adapting a simple transformation head on top of a frozen pre-trained backbone with few labeled data to project embeddings into a task-specific metric space, where classification can be performed by measuring similarities between image instance and prototype representations. Technically, an assumption implicitly adopted in such a framework is that the prototype and image instance embeddings share the same representation transformation. However, in this paper, we find that there naturally exists a gap, which resembles the modality gap, between the prototype and image instance embeddings extracted from the frozen pre-trained backbone, and simply applying the same transformation during the adaptation phase constrains the exploration of optimal representations and shrinks the gap between prototype and image representations. To solve this problem, we propose a simple yet effective method, contrastive prototype-image adaptation (CoPA), which adapts different transformations for prototypes and images, similarly to CLIP, by treating prototypes as text prompts. Extensive experiments on Meta-Dataset demonstrate that CoPA achieves state-of-the-art performance more efficiently. Meanwhile, further analyses also indicate that CoPA can learn better representation clusters, enlarge the gap, and achieve minimal validation loss at the enlarged gap.
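
The idea of giving prototypes and images their own transformations can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the loss, so the sketch assumes a temperature-scaled cross-entropy over cosine similarities, and every name in it (`prototype_image_loss`, `w_image`, `w_proto`, `temperature`) is hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def linear(weight, v):
    # Apply a linear head given as a list of rows.
    return [dot(row, v) for row in weight]

def cross_entropy(logits, target):
    # Numerically stable log-sum-exp minus the target logit.
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[target]

def prototype_image_loss(image_embs, proto_embs, labels,
                         w_image, w_proto, temperature=0.1):
    # Assumption: images and prototypes pass through SEPARATE heads
    # (w_image, w_proto) before similarities are computed.
    images = [normalize(linear(w_image, e)) for e in image_embs]
    protos = [normalize(linear(w_proto, p)) for p in proto_embs]
    losses = []
    for image, label in zip(images, labels):
        logits = [dot(image, proto) / temperature for proto in protos]
        losses.append(cross_entropy(logits, label))
    return sum(losses) / len(losses)

# Toy data: two classes, identity heads as a starting point.
identity = [[1.0, 0.0], [0.0, 1.0]]
image_embs = [[1.0, 0.1], [0.1, 1.0]]
proto_embs = [[1.0, 0.0], [0.0, 1.0]]
loss_matched = prototype_image_loss(image_embs, proto_embs, [0, 1], identity, identity)
loss_swapped = prototype_image_loss(image_embs, proto_embs, [1, 0], identity, identity)
```

In this toy setup the loss is lower when each image is paired with its own class prototype than when the labels are swapped; in practice the two heads would be trained separately rather than fixed to the identity.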

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2410.12474

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Personalized Item Representations in Federated Multimodal Recommendation

Li, Zhiwei, Long, Guodong, Jiang, Jing, Zhang, Chengqi

arXiv.org Artificial Intelligence, Oct-14-2024

Federated recommendation systems are essential for providing personalized recommendations while protecting user privacy. However, current methods mainly rely on ID-based item embeddings, neglecting the rich multimodal information of items. To address this, we propose a Federated Multimodal Recommendation System, called FedMR. FedMR uses a foundation model on the server to encode multimodal item data, such as images and text. To handle data heterogeneity caused by user preference differences, FedMR introduces a Mixing Feature Fusion Module on each client, which adjusts fusion strategy weights based on user interaction history to generate personalized item representations that capture users' fine-grained preferences. FedMR is compatible with existing ID-based federated recommendation systems, improving performance without modifying the original framework. Experiments on four real-world multimodal datasets demonstrate FedMR's effectiveness. The code is available at https://anonymous.4open.science/r/FedMR.

artificial intelligence, machine learning, recommendation, (17 more...)

arXiv.org Artificial Intelligence

2410.08478

Country:

Oceania > Australia (0.14)
North America > United States (0.14)
Asia (0.14)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents

Zhou, Siyu, Zhou, Tianyi, Yang, Yijun, Long, Guodong, Ye, Deheng, Jiang, Jing, Zhang, Chengqi

arXiv.org Artificial Intelligence, Oct-11-2024

Figure caption: Step 1-2: the agent makes a plan via MPC with the initial unaligned world model, resulting in a failed action for mining iron ore. Step 3: by comparing real trajectories with the world model predictions, WALL-E learns a critical rule that if the tool is not proper to the material being mined, the action will fail. Step 4-5: the learned rule helps the world model make accurate predictions for transitions that were predicted mistakenly in MPC. Step 6: the agent accordingly modifies its plan and replaces the stone pickaxe with an iron pickaxe toward completing the task.

Can large language models (LLMs) directly serve as powerful world models for model-based agents? While the gaps between the prior knowledge of LLMs and the specified environment's dynamics do exist, our study reveals that the gaps can be bridged by aligning an LLM with its deployed environment and such "world alignment" can be efficiently achieved by rule learning on LLMs. Given the rich prior knowledge of LLMs, only a few additional rules suffice to align LLM predictions with the specified environment dynamics. To this end, we propose a neurosymbolic approach to learn these rules gradient-free through LLMs, by inducing, updating, and pruning rules based on comparisons of agent-explored trajectories and world model predictions. Our embodied LLM agent "WALL-E" is built upon model-predictive control (MPC). By optimizing look-ahead actions based on the precise world model, MPC significantly improves exploration and learning efficiency. Compared to existing LLM agents, WALL-E's reasoning only requires a few principal rules rather than verbose buffered trajectories being included in the LLM input. On open-world challenges in Minecraft and ALFWorld, WALL-E achieves higher success rates than existing methods, with lower costs on replanning time and the number of tokens used for reasoning.
In Minecraft, WALL-E exceeds baselines by 15-30% in success rate while costing 8-20 fewer replanning rounds and only 60-80% of tokens. [...] This leads to safety risks and suboptimality of generated trajectories.
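
The loop of comparing real transitions with world-model predictions and keeping only helpful rules can be sketched in a few lines. This is a toy illustration under strong assumptions, not the WALL-E method: the paper induces rules through an LLM, whereas here the candidate rules are hand-written, and `TOY_REQUIRED_TOOL`, `base_model`, `tool_rule`, `always_fail_rule`, and `keep_useful_rules` are all hypothetical names.

```python
# Assumption: a toy environment where each material has one proper tool.
TOY_REQUIRED_TOOL = {"iron_ore": "stone_pickaxe"}

def base_model(step):
    # Unaligned world model: predicts that every action succeeds.
    return True

def tool_rule(step):
    # Candidate rule: the action fails if the tool is not proper for the material.
    required = TOY_REQUIRED_TOOL.get(step["material"])
    return required is not None and step["tool"] != required

def always_fail_rule(step):
    # A deliberately bad candidate, used to show pruning.
    return True

def predict(step, rules):
    # A firing rule overrides the base model with a predicted failure.
    if any(rule(step) for rule in rules):
        return False
    return base_model(step)

def mismatches(trajectory, rules):
    # Transitions where the prediction disagrees with the real outcome.
    return [s for s in trajectory if predict(s, rules) != s["success"]]

def keep_useful_rules(trajectory, candidates):
    # Keep a candidate only if it reduces the number of mispredictions.
    kept = []
    for rule in candidates:
        if len(mismatches(trajectory, kept + [rule])) < len(mismatches(trajectory, kept)):
            kept.append(rule)
    return kept

trajectory = [
    {"material": "iron_ore", "tool": "wooden_pickaxe", "success": False},
    {"material": "iron_ore", "tool": "stone_pickaxe", "success": True},
    {"material": "wood", "tool": "hand", "success": True},
]
rules = keep_useful_rules(trajectory, [always_fail_rule, tool_rule])
```

Here the unaligned model mispredicts the first transition, `always_fail_rule` is discarded because it makes predictions worse, and `tool_rule` is kept because it removes the mismatch.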

large language model, natural language, world model, (17 more...)

arXiv.org Artificial Intelligence

2410.07484

Genre:

Workflow (1.00)
Research Report (1.00)

Industry: Energy > Oil & Gas > Upstream (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Towards Next-Generation LLM-based Recommender Systems: A Survey and Beyond

Wang, Qi, Li, Jindong, Wang, Shiqi, Xing, Qianli, Niu, Runliang, Kong, He, Li, Rui, Long, Guodong, Chang, Yi, Zhang, Chengqi

arXiv.org Artificial Intelligence, Oct-10-2024

Large language models (LLMs) have not only revolutionized the field of natural language processing (NLP) but also have the potential to bring a paradigm shift in many other fields due to their remarkable abilities of language understanding, as well as impressive generalization capabilities and reasoning skills. As a result, recent studies have actively attempted to harness the power of LLMs to improve recommender systems, and it is imperative to thoroughly review the recent advances and challenges of LLM-based recommender systems. Unlike existing work, this survey does not merely analyze the classifications of LLM-based recommendation systems according to the technical framework of LLMs. Instead, it investigates how LLMs can better serve recommendation tasks from the perspective of the recommender system community, thus enhancing the integration of large language models into research on recommender systems and their practical application. In addition, the long-standing gap between academic research and industrial applications related to recommender systems has not been well discussed, especially in the era of large language models. In this review, we introduce a novel taxonomy that originates from the intrinsic essence of recommendation, delving into the application of large language model-based recommendation systems and their industrial implementation. Specifically, we propose a three-tier structure that more accurately reflects the developmental progression of recommendation systems from research to practical implementation, including representing and understanding, scheming and utilizing, and industrial deployment. Furthermore, we discuss critical challenges and opportunities in this emerging field. A more up-to-date version of the papers is maintained at: https://github.com/jindongli-Ai/Next-Generation-LLM-based-Recommender-Systems-Survey.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.19744

Country:

North America > United States (0.45)
Asia > China (0.28)
North America > Mexico (0.28)
Oceania > Australia > New South Wales (0.27)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Influence-oriented Personalized Federated Learning

Tan, Yue, Long, Guodong, Jiang, Jing, Zhang, Chengqi

arXiv.org Artificial Intelligence, Oct-4-2024

Traditional federated learning (FL) methods often rely on fixed weighting for parameter aggregation, neglecting the mutual influence among clients. Hence, their effectiveness in heterogeneous data contexts is limited. To address this problem, we propose an influence-oriented federated learning framework, namely FedC^2I, which quantitatively measures Client-level and Class-level Influence to realize adaptive parameter aggregation for each client. Our core idea is to explicitly model the inter-client influence within an FL system via a well-crafted influence vector and influence matrix. The influence vector quantifies client-level influence, enables clients to selectively acquire knowledge from others, and guides the aggregation of feature representation layers. Meanwhile, the influence matrix captures class-level influence in a more fine-grained manner to achieve personalized classifier aggregation. We evaluate the performance of FedC^2I against existing federated learning methods under non-IID settings, and the results demonstrate the superiority of our method.
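
Adaptive, per-client aggregation guided by an influence vector can be sketched as a weighted average. The abstract does not say how influence is measured, so the weights below are supplied by hand, and `personalized_aggregate` and `influence_row` are hypothetical names used only for illustration.

```python
def personalized_aggregate(client_params, influence_row):
    """Weighted average of all clients' parameter vectors for one target client.

    influence_row[j] is the (assumed, non-negative) influence of client j
    on the target client; weights are normalized to sum to one.
    """
    total = sum(influence_row)
    if total <= 0:
        raise ValueError("influence weights must sum to a positive value")
    dim = len(client_params[0])
    return [
        sum(w * params[k] for w, params in zip(influence_row, client_params)) / total
        for k in range(dim)
    ]

# Three clients with 2-dimensional parameter vectors.
client_params = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Assumed influence of clients 0, 1, 2 on client 0.
influence_on_client_0 = [2.0, 1.0, 1.0]
aggregated = personalized_aggregate(client_params, influence_on_client_0)
```

Each client would receive a different aggregate because each has its own influence row, in contrast to a single fixed weighting shared by all clients.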

artificial intelligence, fedc 2, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2410.03315

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Multi-Level Additive Modeling for Structured Non-IID Federated Learning

Chen, Shutong, Zhou, Tianyi, Long, Guodong, Ma, Jie, Jiang, Jing, Zhang, Chengqi

arXiv.org Artificial Intelligence, May-26-2024

The primary challenge in Federated Learning (FL) is to model non-IID distributions across clients, whose fine-grained structure is important to improve knowledge sharing. For example, some knowledge is globally shared across all clients, some is only transferable within a subgroup of clients, and some is client-specific. To capture and exploit this structure, we train models organized in a multi-level structure, called "Multi-level Additive Models (MAM)", for better knowledge-sharing across heterogeneous clients and their personalization. In federated MAM (FeMAM), each client is assigned to at most one model per level and its personalized prediction sums up the outputs of models assigned to it across all levels. For the top level, FeMAM trains one global model shared by all clients, as in FedAvg. For every mid-level, it learns multiple models, each assigned to a subgroup of clients, as in clustered FL. Every bottom-level model is trained for one client only. In the training objective, each model aims to minimize the residual of the additive predictions by the other models assigned to each client. To approximate the arbitrary structure of non-IID across clients, FeMAM introduces more flexibility and adaptivity to FL by incrementally adding new models to the prediction of each client and reassigning another if necessary, automatically optimizing the knowledge-sharing structure. Extensive experiments show that FeMAM surpasses existing clustered FL and personalized FL methods in various non-IID settings. Our code is available at https://github.com/shutong043/FeMAM.
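
The additive prediction and the residual objective described above can be sketched as follows. This is a minimal illustration with toy scalar models, not the released FeMAM code; `additive_predict`, `residual_target`, and the model names are hypothetical.

```python
def additive_predict(x, levels, assignment):
    """Sum the outputs of the models assigned to one client.

    levels[i] maps model ids to callables; assignment[i] is the id of the
    model used at level i, or None if the client has no model at that level.
    """
    total = 0.0
    for level_models, model_id in zip(levels, assignment):
        if model_id is None:
            continue
        total += level_models[model_id](x)
    return total

def residual_target(x, y, levels, assignment, level_idx):
    # What the model at level_idx should fit: the label minus the
    # additive prediction of the client's models at all other levels.
    others = list(assignment)
    others[level_idx] = None
    return y - additive_predict(x, levels, others)

levels = [
    {"global": lambda x: 2.0 * x},                               # top level: shared by all clients
    {"cluster_a": lambda x: 1.0, "cluster_b": lambda x: -1.0},   # mid level: one model per subgroup
    {"client_0": lambda x: 0.5},                                 # bottom level: one client only
]
assignment_client_0 = ["global", "cluster_a", "client_0"]

prediction = additive_predict(3.0, levels, assignment_client_0)
target = residual_target(3.0, 10.0, levels, assignment_client_0, 2)
```

For input 3.0 the three levels contribute 6.0, 1.0, and 0.5, so the personalized prediction is 7.5; with label 10.0, the bottom-level model is asked to fit the remaining residual of 3.0.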

artificial intelligence, femam, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2405.16472

Country:

North America > United States (0.14)
Oceania > Australia (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Personalized Adapter for Large Meteorology Model on Devices: Towards Weather Foundation Models

Chen, Shengchao, Long, Guodong, Jiang, Jing, Zhang, Chengqi

arXiv.org Artificial Intelligence, May-24-2024

This paper demonstrates that pre-trained language models (PLMs) are strong foundation models for on-device meteorological variables modeling. We present LM-Weather, a generic approach to taming PLMs, which have learned massive sequential knowledge from the universe of natural language databases, to acquire an immediate capability to obtain highly customized models for heterogeneous meteorological data on devices while keeping high efficiency. Concretely, we introduce a lightweight personalized adapter into PLMs and endow it with weather pattern awareness. During communication between clients and the server, low-rank-based transmission is performed to effectively fuse the global knowledge among devices while maintaining high communication efficiency and ensuring privacy. Experiments on real-world datasets show that LM-Weather outperforms the state-of-the-art results by a large margin across various tasks (e.g., forecasting and imputation at different scales). We provide extensive and in-depth analysis experiments, which verify that LM-Weather can (1) indeed leverage sequential knowledge from natural language to accurately handle meteorological sequences, (2) allow each device to obtain highly customized models under significant heterogeneity, and (3) generalize under data-limited and out-of-distribution (OOD) scenarios.
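
Why low-rank transmission saves communication can be shown with a small worked example. The abstract does not specify the scheme, so this sketch assumes a LoRA-style factorization in which a client sends two small factors `B` and `A` instead of a full weight update; `matmul`, `transmitted_values`, and the toy matrices are hypothetical.

```python
def matmul(a, b):
    # Plain-Python matrix product of a (m x k) and b (k x n).
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def transmitted_values(d_out, d_in, rank):
    # Number of values sent when only the two low-rank factors are exchanged.
    return rank * (d_out + d_in)

# Assumed rank-1 update for a 3 x 4 weight matrix.
B = [[1.0], [2.0], [3.0]]        # d_out x rank
A = [[0.5, 0.0, -0.5, 1.0]]      # rank x d_in

delta_w = matmul(B, A)           # full 3 x 4 update, reconstructed on receipt
full_cost = 3 * 4                # values in the full update
low_rank_cost = transmitted_values(3, 4, 1)
```

In this toy case 7 values are sent instead of 12; the saving grows with the matrix size, since the low-rank cost scales with `rank * (d_out + d_in)` rather than `d_out * d_in`.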

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2405.20348

Country:

North America > United States (1.00)
Asia > Middle East > Israel > Tel Aviv District (0.14)
Asia > Middle East > Israel > Southern District (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Education (0.67)
Information Technology > Security & Privacy (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

Gradient Transformation: Towards Efficient and Model-Agnostic Unlearning for Dynamic Graph Neural Networks

Zhang, He, Wu, Bang, Yang, Xiangwen, Yuan, Xingliang, Zhang, Chengqi, Pan, Shirui

arXiv.org Artificial Intelligence, May-23-2024

Graph unlearning has emerged as an essential tool for safeguarding user privacy and mitigating the negative impacts of undesirable data. Meanwhile, the advent of dynamic graph neural networks (DGNNs) marks a significant advancement due to their superior capability in learning from dynamic graphs, which encapsulate spatial-temporal variations in diverse real-world applications (e.g., traffic forecasting). With the increasing prevalence of DGNNs, it becomes imperative to investigate the implementation of dynamic graph unlearning. However, current graph unlearning methodologies are designed for GNNs operating on static graphs and exhibit limitations, including operating in a pre-processing manner and imposing impractical resource demands. Furthermore, the adaptation of these methods to DGNNs presents non-trivial challenges, owing to the distinctive nature of dynamic graphs. To this end, we propose an effective, efficient, model-agnostic, and post-processing method to implement DGNN unlearning. Specifically, we first define the unlearning requests and formulate dynamic graph unlearning in the context of continuous-time dynamic graphs. After conducting a role analysis on the unlearning data, the remaining data, and the target DGNN model, we propose a method called Gradient Transformation and a loss function to map the unlearning request to the desired parameter update. Evaluations on six real-world datasets and state-of-the-art DGNN backbones demonstrate its effectiveness (e.g., a limited performance drop or even an obvious improvement), its efficiency (e.g., up to 7.23× speed-up), and its potential advantages in handling future unlearning requests (e.g., up to 32.59× speed-up).

artificial intelligence, graph, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2405.14407

Country:

Asia > Singapore (0.14)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)