
Collaborating Authors: Feng, Yan


Uncertainty-Aware Graph Structure Learning

arXiv.org Artificial Intelligence

Graph Neural Networks (GNNs) have become a prominent approach for learning from graph-structured data. However, their effectiveness can be significantly compromised when the graph structure is suboptimal. To address this issue, Graph Structure Learning (GSL) has emerged as a promising technique that refines node connections adaptively. Nevertheless, we identify two key limitations in existing GSL methods: 1) Most methods primarily focus on node similarity to construct relationships, while overlooking the quality of node information. Blindly connecting low-quality nodes and aggregating their ambiguous information can degrade the performance of other nodes. 2) The constructed graph structures are often constrained to be symmetric, which may limit the model's flexibility and effectiveness. To overcome these limitations, we propose an Uncertainty-aware Graph Structure Learning (UnGSL) strategy. UnGSL estimates the uncertainty of node information and utilizes it to adjust the strength of directional connections, where the influence of nodes with high uncertainty is adaptively reduced. Importantly, UnGSL serves as a plug-in module that can be seamlessly integrated into existing GSL methods with minimal additional computational cost. In our experiments, we integrate UnGSL into six representative GSL methods, demonstrating consistent performance improvements. The code is available at https://github.com/UnHans/UnGSL.
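
To make the core idea concrete, here is a minimal sketch of uncertainty-gated message passing in the spirit of UnGSL. The entropy-based uncertainty estimate and the linear confidence gate are illustrative assumptions, not the paper's exact formulation; the point is that scaling adjacency columns by per-sender confidence naturally yields an asymmetric structure.

```python
import math
import torch
import torch.nn.functional as F

def uncertainty_gated_adjacency(adj: torch.Tensor, logits: torch.Tensor) -> torch.Tensor:
    """Attenuate edges whose *source* node carries high-uncertainty information.

    adj:    (N, N) adjacency from any GSL method, adj[i, j] = edge j -> i.
    logits: (N, C) per-node class logits used to estimate uncertainty.
    """
    probs = F.softmax(logits, dim=-1)
    # Normalized predictive entropy as a per-node uncertainty score in [0, 1].
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)
    uncertainty = entropy / math.log(logits.size(-1))
    gate = 1.0 - uncertainty  # confidence of each sender node
    # Column scaling weakens edges *out of* uncertain nodes while leaving
    # edges *into* them untouched, so a symmetric input becomes asymmetric.
    return adj * gate.unsqueeze(0)

adj = torch.rand(5, 5)
logits = torch.randn(5, 3)
gated = uncertainty_gated_adjacency(adj, logits)
```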


Efficiently Achieving Secure Model Training and Secure Aggregation to Ensure Bidirectional Privacy-Preservation in Federated Learning

arXiv.org Artificial Intelligence

Bidirectional privacy-preserving federated learning is crucial as both local gradients and the global model may leak privacy. However, only a few works attempt to achieve it, and they often face challenges such as excessive communication and computational overheads, or significant degradation of model accuracy, which hinders their practical applications. In this paper, we design an efficient and high-accuracy bidirectional privacy-preserving scheme for federated learning to complete secure model training and secure aggregation. To efficiently achieve bidirectional privacy, we design an efficient and accuracy-lossless model perturbation method on the server side (called $\mathbf{MP\_Server}$) that can be combined with local differential privacy (LDP) to prevent clients from accessing the model, while ensuring that the local gradients obtained on the server side satisfy LDP. Furthermore, to ensure model accuracy, we customize a distributed differential privacy mechanism on the client side (called $\mathbf{DDP\_Client}$). When combined with $\mathbf{MP\_Server}$, it ensures LDP of the local gradients, while ensuring that the aggregated result matches the accuracy of central differential privacy (CDP). Extensive experiments demonstrate that our scheme significantly outperforms state-of-the-art bidirectional privacy-preserving baselines (SOTAs) in terms of computational cost, model accuracy, and defense ability against privacy attacks. Particularly, given a target accuracy, the training time of the SOTAs is approximately $200$ times, or even over $1000$ times, longer than that of our scheme. When the privacy budget is set relatively small, our scheme incurs less than $6\%$ accuracy loss compared to the privacy-ignoring method, while the SOTAs suffer up to $20\%$ accuracy loss. Experimental results also show that the defense capability of our scheme surpasses that of the SOTAs.
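
As a rough illustration of the distributed-noise idea behind $\mathbf{DDP\_Client}$, the sketch below has each of $K$ clients clip its gradient and add Gaussian noise whose variance is $1/K$ of the central-DP budget, so the aggregated sum carries the same total noise as a single central mechanism. The clipping bound, the noise multiplier, and the omission of the $\mathbf{MP\_Server}$ perturbation are all simplifying assumptions.

```python
import numpy as np

def ddp_client_gradient(grad, clip, sigma, num_clients):
    """Clip a local gradient and add per-client Gaussian noise."""
    norm = np.linalg.norm(grad)
    clipped = grad * min(1.0, clip / (norm + 1e-12))
    # Each client adds N(0, (sigma * clip)^2 / K); the sum over K clients
    # then has variance (sigma * clip)^2, matching one central-DP mechanism.
    noise = np.random.normal(0.0, sigma * clip / np.sqrt(num_clients), grad.shape)
    return clipped + noise

K = 10
grads = [np.random.randn(100) for _ in range(K)]
noisy = [ddp_client_gradient(g, clip=1.0, sigma=2.0, num_clients=K) for g in grads]
aggregate = np.sum(noisy, axis=0)  # total noise comparable to central DP
```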


PSL: Rethinking and Improving Softmax Loss from Pairwise Perspective for Recommendation

arXiv.org Artificial Intelligence

Softmax Loss (SL) is widely applied in recommender systems (RS) and has demonstrated effectiveness. This work analyzes SL from a pairwise perspective, revealing two significant limitations: 1) the relationship between SL and conventional ranking metrics like DCG is not sufficiently tight; 2) SL is highly sensitive to false negative instances. Our analysis indicates that these limitations are primarily due to the use of the exponential function. To address these issues, this work extends SL to a new family of loss functions, termed Pairwise Softmax Loss (PSL), which replaces the exponential function in SL with other appropriate activation functions. While the revision is minimal, we highlight three merits of PSL: 1) it serves as a tighter surrogate for DCG with suitable activation functions; 2) it better balances data contributions; and 3) it acts as a specific BPR loss enhanced by Distributionally Robust Optimization (DRO).
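
In the pairwise view, SL can be rewritten as $\log(1 + \sum_j \exp(s_j - s_+))$ over score gaps between negatives and the positive, and PSL swaps the inner exponential for another activation. The sketch below uses ReLU with a unit margin purely to illustrate the family, not as the paper's recommended instantiation.

```python
import torch

def softmax_loss(pos, negs):
    # pos: (B,) positive scores; negs: (B, M) negative scores.
    # Equivalent pairwise form of SL: log(1 + sum_j exp(s_j - s_pos)).
    return torch.log1p(torch.exp(negs - pos.unsqueeze(-1)).sum(-1)).mean()

def pairwise_softmax_loss(pos, negs, act=torch.relu, margin=1.0):
    # PSL-style variant: the exponential on each score gap is replaced by a
    # milder activation, damping the influence of hard (often false) negatives.
    return torch.log1p(act(negs - pos.unsqueeze(-1) + margin).sum(-1)).mean()

pos, negs = torch.randn(4), torch.randn(4, 8)
print(softmax_loss(pos, negs).item(), pairwise_softmax_loss(pos, negs).item())
```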


Sparse Prototype Network for Explainable Pedestrian Behavior Prediction

arXiv.org Artificial Intelligence

Predicting pedestrian behavior is challenging yet crucial for applications such as autonomous driving and smart cities. Recent deep learning models have achieved remarkable performance in making accurate predictions, but they fail to provide explanations of their inner workings. One reason for this opacity is the multi-modal nature of the inputs. To bridge this gap, we present Sparse Prototype Network (SPN), an explainable method designed to simultaneously predict a pedestrian's future action, trajectory, and pose. SPN leverages an intermediate prototype bottleneck layer to provide sample-based explanations for its predictions. The prototypes are modality-independent, meaning that they can correspond to any modality from the input. Therefore, SPN can extend to arbitrary combinations of modalities. Regularized by mono-semanticity and clustering constraints, the prototypes learn consistent and human-understandable features and achieve state-of-the-art performance on action, trajectory, and pose prediction on TITAN and PIE. Finally, we propose a metric named Top-K Mono-semanticity Scale to quantitatively evaluate explainability. Qualitative results show a positive correlation between sparsity and explainability. Code is available at https://github.com/Equinoxxxxx/SPN.
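
A prototype bottleneck of the kind SPN builds on can be sketched in a few lines: logits are computed from similarities to learned prototype vectors, so every prediction decomposes into prototype activations that can be traced back to training samples. The cosine similarity and the dimensions here are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrototypeBottleneck(nn.Module):
    def __init__(self, feat_dim, num_prototypes, num_classes):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, feat_dim))
        self.head = nn.Linear(num_prototypes, num_classes)

    def forward(self, feats):
        # Cosine similarity between each input feature and each prototype;
        # the similarity vector is the model's entire evidence for its logits.
        sims = F.normalize(feats, dim=-1) @ F.normalize(self.prototypes, dim=-1).T
        return self.head(sims)

layer = PrototypeBottleneck(feat_dim=64, num_prototypes=16, num_classes=2)
logits = layer(torch.randn(8, 64))  # 8 samples, explained by 16 prototypes
```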


Towards Dynamic Graph Neural Networks with Provably High-Order Expressive Power

arXiv.org Artificial Intelligence

Dynamic Graph Neural Networks (DyGNNs) have garnered increasing research attention for learning representations on evolving graphs. Despite their effectiveness, the limited expressive power of existing DyGNNs hinders them from capturing important evolving patterns of dynamic graphs. Although some works attempt to enhance expressive capability with heuristic features, there remains a lack of DyGNN frameworks with provable and quantifiable high-order expressive power. To address this research gap, we first propose the k-dimensional Dynamic WL tests (k-DWL) as reference algorithms to quantify the expressive power of DyGNNs. We demonstrate that the expressive power of existing DyGNNs is upper bounded by the 1-DWL test. To enhance the expressive power, we propose the Dynamic Graph Neural Network with High-order expressive power (HopeDGN), which updates the representation of a central node pair by aggregating its interaction history with neighboring node pairs. Our theoretical results demonstrate that HopeDGN can achieve expressive power equivalent to the 2-DWL test. We then present a Transformer-based implementation for the local variant of HopeDGN. Experimental results show that HopeDGN achieves performance improvements of up to 3.12%, demonstrating its effectiveness.
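
Very loosely, the pair-centric update can be pictured as follows: the embedding of a central node pair is refreshed by aggregating pairs that share an endpoint with it, weighted by how recently they interacted. The tensor shapes and the exponential time decay below are assumptions for illustration only, not HopeDGN's actual architecture.

```python
import torch

def update_pair(pair_emb, neighbor_pair_embs, interaction_ages, tau=1.0):
    # pair_emb: (D,) central pair; neighbor_pair_embs: (K, D) pairs sharing
    # a node with it; interaction_ages: (K,) time since each interaction.
    weights = torch.softmax(-interaction_ages / tau, dim=0)  # recency weighting
    aggregated = (weights.unsqueeze(-1) * neighbor_pair_embs).sum(0)
    return pair_emb + aggregated  # residual update of the central pair

updated = update_pair(torch.zeros(32), torch.randn(6, 32), torch.rand(6))
```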


Motif-driven Subgraph Structure Learning for Graph Classification

arXiv.org Artificial Intelligence

To mitigate the suboptimal nature of graph structure, Graph Structure Learning (GSL) has emerged as a promising approach to improve graph structure and boost performance in downstream tasks. Despite the proposal of numerous GSL methods, progress in this field has mostly concentrated on node-level tasks, while graph-level tasks (e.g., graph classification) remain largely unexplored. Notably, applying node-level GSL to graph classification is non-trivial due to the lack of fine-grained guidance for intricate structure learning. Inspired by the vital role of subgraphs in graph classification, in this paper we explore the potential of subgraph structure learning for graph classification by tackling the challenges of key subgraph selection and structure optimization. We propose a novel Motif-driven Subgraph Structure Learning method for Graph Classification (MOSGSL). Specifically, MOSGSL incorporates a subgraph structure learning module which can adaptively select important subgraphs. A motif-driven structure guidance module is further introduced to capture key subgraph-level structural patterns (motifs) and facilitate personalized structure learning. Extensive experiments demonstrate a significant and consistent improvement over baselines, as well as MOSGSL's flexibility and generalizability across various backbones and learning procedures.
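
A rough sketch of the two coupled steps: score candidate subgraphs and keep the most salient ones, then steer each kept subgraph toward its best-matching motif prototype. Mean-pooled node importances and cosine motif matching are stand-ins for the paper's learned modules.

```python
import torch
import torch.nn.functional as F

def select_subgraphs(node_scores, subgraph_masks, k=2):
    # subgraph_masks: (S, N) binary node membership; node_scores: (N,).
    sizes = subgraph_masks.sum(-1).clamp_min(1)
    sub_scores = (subgraph_masks * node_scores).sum(-1) / sizes
    return torch.topk(sub_scores, k=k).indices  # indices of salient subgraphs

def nearest_motif(sub_emb, motif_bank):
    # motif_bank: (M, D) learned motif prototypes guiding structure learning.
    return F.cosine_similarity(sub_emb.unsqueeze(0), motif_bank).argmax()

masks = (torch.rand(4, 10) > 0.5).float()
chosen = select_subgraphs(torch.rand(10), masks, k=2)
motif_id = nearest_motif(torch.randn(32), torch.randn(5, 32))
```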


How Do Recommendation Models Amplify Popularity Bias? An Analysis from the Spectral Perspective

arXiv.org Artificial Intelligence

Recommendation Systems (RS) are often plagued by popularity bias. When training a recommendation model on a typically long-tailed dataset, the model tends to not only inherit this bias but often exacerbate it, resulting in over-representation of popular items in the recommendation lists. This study conducts comprehensive empirical and theoretical analyses to expose the root causes of this phenomenon, yielding two core insights: 1) Item popularity is memorized in the principal spectrum of the score matrix predicted by the recommendation model; 2) The dimension collapse phenomenon amplifies the relative prominence of the principal spectrum, thereby intensifying the popularity bias. Building on these insights, we propose a novel debiasing strategy that leverages a spectral norm regularizer to penalize the magnitude of the principal singular value. We have developed an efficient algorithm to expedite the calculation of the spectral norm by exploiting the spectral property of the score matrix. Extensive experiments across seven real-world datasets and three testing paradigms have been conducted to validate the superiority of the proposed method.
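
The regularizer itself is easy to picture: estimate the principal singular value of the predicted score matrix with a few steps of power iteration (avoiding a full SVD) and penalize it. The iteration count and penalty weight below are illustrative, and the paper's exact accelerated algorithm is not reproduced here.

```python
import torch
import torch.nn.functional as F

def principal_singular_value(scores, n_iter=10):
    # Power iteration: v converges to the top right-singular vector, so
    # u^T S v converges to sigma_max. Differentiable, hence usable as a loss.
    v = torch.randn(scores.size(1))
    for _ in range(n_iter):
        u = F.normalize(scores @ v, dim=0)
        v = F.normalize(scores.T @ u, dim=0)
    return u @ scores @ v

scores = torch.randn(100, 50, requires_grad=True)  # user-item score matrix
reg = 0.1 * principal_singular_value(scores)       # add to the training loss
reg.backward()
```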


Confidence-aware Self-Semantic Distillation on Knowledge Graph Embedding

arXiv.org Artificial Intelligence

Knowledge Graph Embedding (KGE), which projects entities and relations into continuous vector spaces, has garnered significant attention. Although high-dimensional KGE methods offer better performance, they come at the expense of significant computation and memory overheads, while decreasing embedding dimensions significantly deteriorates model performance. While several recent efforts utilize knowledge distillation or non-Euclidean representation learning to augment the effectiveness of low-dimensional KGE, they either necessitate a pre-trained high-dimensional teacher model or involve complex non-Euclidean operations, thereby incurring considerable additional computational costs. To address this, this work proposes Confidence-aware Self-Knowledge Distillation (CSD), which learns from the model itself to enhance KGE in a low-dimensional space. Specifically, CSD extracts knowledge from embeddings in previous iterations, which is then utilized to supervise the learning of the model in subsequent iterations. Moreover, a dedicated semantic module is developed to filter reliable knowledge by estimating the confidence of previously learned embeddings. This straightforward strategy bypasses the need for time-consuming pre-training of teacher models and can be integrated into various KGE methods to improve their performance. Our comprehensive experiments on six KGE backbones and four datasets underscore the effectiveness of the proposed CSD.
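
The self-distillation loop can be compressed into a few lines: embeddings from earlier iterations serve as a frozen teacher, and their supervision is scaled by a per-entity confidence estimate. The EMA teacher and the externally supplied confidence scores are simplifying assumptions rather than CSD's exact modules.

```python
import torch

class ConfidenceAwareSelfDistiller:
    def __init__(self, num_entities, dim, momentum=0.99):
        self.teacher = torch.zeros(num_entities, dim)  # past-iteration embeddings
        self.momentum = momentum

    def loss(self, student_emb, confidence):
        # confidence: (num_entities,) in [0, 1], e.g. derived from triple
        # scores; unreliable past knowledge is filtered out by low weights.
        distill = (confidence.unsqueeze(-1) * (student_emb - self.teacher) ** 2).mean()
        with torch.no_grad():  # refresh the teacher after computing the loss
            self.teacher.mul_(self.momentum).add_((1 - self.momentum) * student_emb.detach())
        return distill

csd = ConfidenceAwareSelfDistiller(num_entities=1000, dim=64)
emb = torch.randn(1000, 64, requires_grad=True)
aux = csd.loss(emb, confidence=torch.rand(1000))
```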


WildfireGPT: Tailored Large Language Model for Wildfire Analysis

arXiv.org Artificial Intelligence

Understanding and adapting to climate change is paramount for professionals such as urban planners, emergency managers, and infrastructure operators, as it directly influences urban development, disaster response, and the maintenance of essential services. Nonetheless, this task presents a complex challenge that necessitates the integration of advanced technology and scientific insights. Recent advances in large language models (LLMs) present an innovative solution, particularly in democratizing climate science. They possess the unique capability to interpret and explain technical aspects of climate change through conversation, making this crucial information accessible to people from all backgrounds (Rillig et al. [2023]; Bulian et al. [2023]; Chen et al. [2023]). However, given that LLMs are generalized models, their performance can be improved by providing additional domain-specific information. Recent research has therefore focused on augmenting LLMs with external tools and data sources to ensure that the information provided is scientifically accurate: for example, leveraging authoritative data sources such as ClimateWatch (Kraus et al. [2023]) and findings from the IPCC AR6 reports (Vaghefi et al. [2023]) helps refine the LLM's outputs, ensuring that the information is grounded in the latest research.
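
The grounding pattern described here reduces to a simple loop: retrieve passages from an authoritative corpus and prepend them to the user's question before querying the model. The keyword-overlap retriever below is a deliberately naive placeholder, not WildfireGPT's actual pipeline.

```python
def build_grounded_prompt(question: str, corpus: list[str], top_k: int = 3) -> str:
    # Naive keyword-overlap retrieval as a stand-in for a real retriever
    # over sources such as ClimateWatch or the IPCC AR6 reports.
    words = set(question.lower().split())
    ranked = sorted(corpus, key=lambda doc: -len(words & set(doc.lower().split())))
    context = "\n".join(ranked[:top_k])
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
```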


Knowledge Translation: A New Pathway for Model Compression

arXiv.org Artificial Intelligence

Deep learning has witnessed significant advancements in recent years at the cost of increasing training, inference, and model storage overhead. While existing model compression methods strive to reduce the number of model parameters while maintaining high accuracy, they inevitably necessitate re-training of the compressed model or impose architectural constraints. To overcome these limitations, this paper presents a novel framework, termed \textbf{K}nowledge \textbf{T}ranslation (KT), wherein a ``translation'' model is trained to receive the parameters of a larger model and generate compressed parameters. The concept of KT draws inspiration from language translation, which employs neural networks to convert text between languages while preserving its meaning. Accordingly, we explore the potential of neural networks to convert models of disparate sizes while preserving their functionality. We propose a comprehensive framework for KT, introduce data augmentation strategies to enhance model performance despite restricted training data, and successfully demonstrate the feasibility of KT on the MNIST dataset. Code is available at \url{https://github.com/zju-SWJ/KT}.
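
At its simplest, the idea can be sketched as a network that consumes the flattened parameters of a large model and emits parameters for a smaller one, with supervision coming from pairs of independently trained large and small models. The MLP translator and the parameter counts below are illustrative assumptions.

```python
import torch
import torch.nn as nn

LARGE, SMALL = 10_000, 1_000  # flattened parameter counts (illustrative)

translator = nn.Sequential(   # the "translation" model of KT
    nn.Linear(LARGE, 2048),
    nn.ReLU(),
    nn.Linear(2048, SMALL),
)

large_params = torch.randn(LARGE)        # weights of a trained large model
small_params = translator(large_params)  # predicted compressed weights
# Training would regress small_params onto weights of small models trained
# on the same task, with data augmentation easing the scarcity of such pairs.
```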