Zhang, Hongwei
CoKV: Optimizing KV Cache Allocation via Cooperative Game
Sun, Qiheng, Zhang, Hongwei, Xia, Haocheng, Zhang, Jiayao, Liu, Jinfei, Ren, Kui
Large language models (LLMs) have achieved remarkable success in many aspects of human life. However, one of the major challenges in deploying these models is the substantial memory consumption required to store key-value (KV) pairs, which imposes significant resource demands. Recent research has focused on KV cache budget allocation, with several approaches proposing head-level budget distribution by evaluating the importance of individual attention heads. These methods, however, assess the importance of heads independently, overlooking their cooperative contributions within the model, which may result in a deviation from their true impact on model performance. In light of this limitation, we propose CoKV, a novel method that models the cooperation between heads during model inference as a cooperative game. By evaluating the contribution of each head within the cooperative game, CoKV can allocate the cache budget more effectively. Extensive experiments show that CoKV achieves state-of-the-art performance on the LongBench benchmark using the Llama-3-8B-Instruct and Mistral-7B models.
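As a rough illustration of the cooperative-game view, the sketch below scores each head by a Monte Carlo estimate of its Shapley value (average marginal contribution across random head orderings) and splits a KV cache budget proportionally. The `evaluate_model` valuation and the proportional split are hypothetical stand-ins for illustration, not CoKV's exact procedure.

```python
import random

def shapley_head_scores(heads, evaluate_model, num_permutations=100):
    """Monte Carlo estimate of each attention head's Shapley value.

    `heads` is a list of head identifiers; `evaluate_model(coalition)`
    returns a task score (e.g., accuracy) when only the heads in
    `coalition` receive a full KV cache budget. Both are hypothetical
    stand-ins for the paper's actual valuation procedure."""
    scores = {h: 0.0 for h in heads}
    for _ in range(num_permutations):
        order = random.sample(heads, len(heads))  # one random ordering
        coalition, prev_value = [], evaluate_model([])
        for h in order:
            coalition.append(h)
            value = evaluate_model(coalition)
            scores[h] += value - prev_value  # marginal contribution of h
            prev_value = value
    return {h: s / num_permutations for h, s in scores.items()}

def allocate_budget(scores, total_budget):
    """Split a total KV cache budget proportionally to head scores."""
    positive = {h: max(s, 0.0) for h, s in scores.items()}
    norm = sum(positive.values()) or 1.0
    return {h: int(total_budget * s / norm) for h, s in positive.items()}
```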
Learning-Enhanced Safeguard Control for High-Relative-Degree Systems: Robust Optimization under Disturbances and Faults
Wang, Xinyang, Zhang, Hongwei, Wang, Shimin, Xiao, Wei, Guay, Martin
Merely pursuing performance may adversely affect safety, while an overly conservative policy for safe exploration degrades performance. How to balance safety and performance in learning-based control problems is an interesting yet challenging issue. This paper aims to enhance system performance with a safety guarantee when solving reinforcement learning (RL)-based optimal control problems for nonlinear systems subject to high-relative-degree state constraints and unknown time-varying disturbances/actuator faults. First, to combine control barrier functions (CBFs) with RL, a new type of CBF, termed the high-order reciprocal control barrier function (HO-RCBF), is proposed to handle high-relative-degree constraints during the learning process. Then, the concept of gradient similarity is proposed to quantify the relationship between the gradient of safety and the gradient of performance. Finally, gradient manipulation and adaptive mechanisms are introduced into the safe RL framework to enhance performance with a safety guarantee. Two simulation examples illustrate that the proposed safe RL framework can handle high-relative-degree constraints, enhance safety robustness, and improve system performance.
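To make the gradient-similarity idea concrete, here is a minimal sketch: cosine similarity between the safety and performance gradients, plus one plausible manipulation rule that projects away the conflicting component when the two objectives oppose each other. The projection rule is an assumption for illustration, not necessarily the paper's mechanism.

```python
import numpy as np

def gradient_similarity(g_perf, g_safe):
    """Cosine similarity between performance and safety gradients."""
    denom = np.linalg.norm(g_perf) * np.linalg.norm(g_safe) + 1e-12
    return float(g_perf @ g_safe) / denom

def manipulate_gradient(g_perf, g_safe):
    """Illustrative gradient manipulation: when the two objectives
    conflict (negative similarity), remove from the performance
    gradient its component opposing safety. This projection is an
    assumption for illustration, not the paper's exact mechanism."""
    if gradient_similarity(g_perf, g_safe) < 0.0:
        g_perf = g_perf - (g_perf @ g_safe) / (g_safe @ g_safe + 1e-12) * g_safe
    return g_perf
```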
How Collective Intelligence Emerges in a Crowd of People Through Learned Division of Labor: A Case Study
Wang, Dekun, Zhang, Hongwei
This paper investigates the factors fostering collective intelligence (CI) through a case study of LinYi's Experiment, in which over 2000 human players collectively controlled an avatar car. By conducting theoretical analysis and replicating observed behaviors through numerical simulations, we demonstrate how self-organized division of labor (DOL) among individuals fosters the emergence of CI, and we identify two essential conditions for fostering CI by formulating the problem as a stability problem for a Markov Jump Linear System (MJLS). These conditions, independent of external stimuli, emphasize the importance of both elite and common players in fostering CI. Additionally, we propose an index for the emergence of CI and a distributed method for estimating joint actions, enabling individuals to learn their optimal social roles without global action information about the whole crowd.
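For background on the MJLS formulation, the sketch below implements the textbook mean-square stability test for a discrete-time MJLS (Costa, Fragoso, and Marques): the system is mean-square stable iff the spectral radius of (P^T kron I) blkdiag(A_i kron A_i) is below one. This is the standard criterion, not the paper's specific emergence conditions.

```python
import numpy as np
from scipy.linalg import block_diag

def mjls_mean_square_stable(A_modes, P):
    """Mean-square stability test for a discrete-time Markov Jump
    Linear System x_{k+1} = A_{theta_k} x_k, with mode matrices
    A_modes and mode transition matrix P (rows summing to 1).

    Standard criterion: mean-square stable iff the spectral radius of
    (P^T kron I) @ blkdiag(A_i kron A_i) is strictly less than 1."""
    n = A_modes[0].shape[0]
    D = block_diag(*[np.kron(A, A) for A in A_modes])
    M = np.kron(P.T, np.eye(n * n)) @ D
    return max(abs(np.linalg.eigvals(M))) < 1.0
```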
SNR-EQ-JSCC: Joint Source-Channel Coding with SNR-Based Embedding and Query
Zhang, Hongwei, Tao, Meixia
Coping with the impact of dynamic channels is a critical issue in joint source-channel coding (JSCC)-based semantic communication systems. In this paper, we propose a lightweight channel-adaptive semantic coding architecture called SNR-EQ-JSCC. It is built upon the generic Transformer model and achieves channel adaptation (CA) by Embedding the signal-to-noise ratio (SNR) into the attention blocks and dynamically adjusting attention scores through channel-adaptive Queries. Meanwhile, penalty terms are introduced into the loss function to stabilize the training process. Considering that instantaneous SNR feedback may be imperfect, we propose an alternative method that uses only the average SNR and requires no retraining of SNR-EQ-JSCC. Simulation results on image transmission demonstrate that the proposed SNR-EQ-JSCC outperforms the state-of-the-art SwinJSCC in peak signal-to-noise ratio (PSNR) and perception metrics while requiring only 0.05% of the storage overhead and 6.38% of the computational complexity for CA. Moreover, the channel-adaptive query method yields significant improvements in perception metrics. When instantaneous SNR feedback is imperfect, SNR-EQ-JSCC using only the average SNR still surpasses the baseline schemes.
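A minimal sketch of the embedding-and-query idea: a scalar SNR is embedded by a small network and added to the attention queries, so attention scores adapt to the channel. The layer sizes and the additive query bias are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class SNRAdaptiveAttention(nn.Module):
    """Illustrative multi-head attention block in which an embedded
    SNR value biases the queries, making attention scores channel
    adaptive. The design here is a sketch, not SNR-EQ-JSCC itself."""

    def __init__(self, dim, num_heads):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.snr_embed = nn.Sequential(  # maps scalar SNR (dB) to a query bias
            nn.Linear(1, dim), nn.ReLU(), nn.Linear(dim, dim)
        )

    def forward(self, x, snr_db):
        # x: (batch, seq, dim); snr_db: (batch,) channel SNR in dB
        q_bias = self.snr_embed(snr_db.unsqueeze(-1)).unsqueeze(1)  # (batch, 1, dim)
        q = x + q_bias  # channel-adaptive queries
        out, _ = self.attn(q, x, x, need_weights=False)
        return out
```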
Learning to Rank for Maps at Airbnb
Haldar, Malay, Zhang, Hongwei, Bellare, Kedar, Chen, Sherry, Banerjee, Soumyadip, Wang, Xiaotang, Abdool, Mustafa, Gao, Huiji, Tapadia, Pavan, He, Liwei, Katariya, Sanjeev
As a two-sided marketplace, Airbnb brings together hosts who own listings for rent with prospective guests from around the globe. Results from a guest's search for listings are displayed primarily through two interfaces: (1) as a list of rectangular cards showing the listing image, price, rating, and other details, referred to as list-results, and (2) as oval pins on a map showing the listing price, called map-results. Since their inception, both interfaces have used the same ranking algorithm, which orders listings by their booking probabilities and selects the top listings for display. But some of the basic assumptions underlying ranking, built for a world where search results are presented as lists, simply break down for maps. This paper describes how we rebuilt ranking for maps by revising the mathematical foundations of how users interact with search results. Our iterative and experiment-driven approach led us through a path full of twists and turns, ending in a unified theory for the two interfaces. Our journey shows how assumptions taken for granted when designing machine learning algorithms may not apply equally across all user interfaces, and how they can be adapted. The net impact was one of the largest improvements in user experience for Airbnb, which we discuss as a series of experimental validations.
StructComp: Substituting Propagation with Structural Compression in Training Graph Contrastive Learning
Zhang, Shengzhong, Yang, Wenjie, Cao, Xinyuan, Zhang, Hongwei, Huang, Zengfeng
Graph contrastive learning (GCL) has become a powerful tool for learning graph data, but its scalability remains a significant challenge. In this work, we propose a simple yet effective training framework called Structural Compression (StructComp) to address this issue. Inspired by a sparse low-rank approximation of the diffusion matrix, StructComp trains the encoder on compressed nodes. This spares the encoder from performing any message passing during the training stage and significantly reduces the number of sample pairs in the contrastive loss. We theoretically prove that the original GCL loss can be approximated by the contrastive loss computed by StructComp. Moreover, StructComp can be regarded as an additional regularization term for GCL models, resulting in a more robust encoder. Empirical studies on seven benchmark datasets show that StructComp greatly reduces time and memory consumption while improving model performance compared with vanilla GCL models and scalable training methods.
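A minimal sketch of the compression step, assuming a graph partition is given: node features are mean-pooled into cluster-level "compressed nodes", on which a message-passing-free encoder (e.g., an MLP) can then be trained. Mean pooling is one natural choice of compression matrix; the paper's exact construction may differ.

```python
import numpy as np

def compress_graph(X, assignment, num_clusters):
    """Mean-pool node features into cluster-level compressed nodes.

    X: (num_nodes, d) feature matrix; assignment: (num_nodes,) array
    mapping each node to a cluster index in [0, num_clusters). The
    encoder is trained on the returned (num_clusters, d) matrix, so no
    message passing over the original graph is needed during training."""
    X_c = np.zeros((num_clusters, X.shape[1]))
    counts = np.bincount(assignment, minlength=num_clusters)
    np.add.at(X_c, assignment, X)          # sum features per cluster
    return X_c / np.maximum(counts, 1)[:, None]  # mean per cluster
```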
Understanding Community Bias Amplification in Graph Representation Learning
Zhang, Shengzhong, Yang, Wenjie, Zhang, Yimin, Zhang, Hongwei, Yan, Divin, Huang, Zengfeng
In this work, we discover a phenomenon of community bias amplification in graph representation learning, which refers to the exacerbation of performance bias between different classes by graph representation learning. We conduct an in-depth theoretical study of this phenomenon from a novel spectral perspective. Our analysis suggests that structural bias between communities results in varying local convergence speeds for node embeddings, which leads to bias amplification in the classification results of downstream tasks. Based on these theoretical insights, we propose random graph coarsening, which is proven effective in dealing with this issue. Finally, we propose a novel graph contrastive learning model called Random Graph Coarsening Contrastive Learning (RGCCL), which uses random coarsening as data augmentation and mitigates community bias by contrasting the coarsened graph with the original graph. Extensive experiments on various datasets demonstrate the advantage of our method in dealing with community bias amplification.
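As a sketch of random coarsening used as augmentation, the snippet below contracts randomly chosen edges with a union-find structure until a target number of super-nodes remains, producing a node-to-super-node mapping that defines the coarsened view. This simple edge-contraction scheme is an illustration; RGCCL's coarsening may differ in detail.

```python
import random

def random_coarsen(edges, num_nodes, ratio=0.5):
    """Randomly contract edges until roughly ratio * num_nodes
    super-nodes remain; returns a node -> super-node mapping that can
    serve as a coarsened view for contrastive learning."""
    parent = list(range(num_nodes))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    remaining = num_nodes
    for u, v in random.sample(edges, len(edges)):
        if remaining <= ratio * num_nodes:
            break
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv  # contract this edge into one super-node
            remaining -= 1
    return [find(u) for u in range(num_nodes)]
```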
Learning nonlinear dynamics in synchronization of knowledge-based leader-following networks
Wang, Shimin, Meng, Xiangyu, Zhang, Hongwei, Lewis, Frank L.
The knowledge-based leader-following synchronization problem for heterogeneous nonlinear multi-agent systems is challenging because the leader's dynamics are unknown to all follower nodes. This paper proposes a learning-based fully distributed observer for a class of nonlinear leader systems that can simultaneously learn the leader's dynamics and states. The class of leader dynamics considered here does not require a bounded Jacobian matrix. Based on this learning-based distributed observer, we further synthesize an adaptive distributed control law that solves the leader-following synchronization problem for multiple Euler-Lagrange systems subject to an uncertain nonlinear leader system. The results are illustrated by a simulation example.
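A rough sketch of one observer update, assuming each follower combines a learned dynamics model with consensus and pinning terms; the gains, basis functions, and update structure here are illustrative assumptions, not the paper's design.

```python
import numpy as np

def observer_step(eta, W_hat, phi, x0, adj, pin, mu, dt):
    """One Euler step of an illustrative learning-based distributed
    observer: node i propagates its leader-state estimate with a
    learned model W_hat[i] @ phi(eta[i]) plus consensus/pinning
    coupling. All structure and gains are assumptions.

    eta:   (N, n) follower estimates of the leader state
    W_hat: (N, n, m) per-node learned weights for the basis phi
    x0:    (n,) leader state, visible only to pinned nodes
    adj:   (N, N) adjacency matrix; pin: (N,) pinning gains"""
    N = eta.shape[0]
    eta_next = eta.copy()
    for i in range(N):
        consensus = sum(adj[i, j] * (eta[j] - eta[i]) for j in range(N))
        pinning = pin[i] * (x0 - eta[i])
        eta_next[i] += dt * (W_hat[i] @ phi(eta[i]) + mu * (consensus + pinning))
    return eta_next
```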
AdaL: Adaptive Gradient Transformation Contributes to Convergences and Generalizations
Zhang, Hongwei, Zou, Weidong, Zhao, Hongbo, Ming, Qi, Yan, Tijin, Xia, Yuanqing, Cao, Weipeng
Adaptive optimization methods have been widely used in deep learning. They scale the learning rate adaptively according to past gradients, which has been shown to be effective in accelerating convergence. However, they suffer from poor generalization performance compared with SGD. Recent studies point out that smoothing the exponential gradient noise leads to a generalization degeneration phenomenon. Inspired by this, we propose AdaL, which applies a transformation to the original gradient. AdaL accelerates convergence by amplifying the gradient in the early stage, and dampens oscillation and stabilizes the optimization by shrinking the gradient later. Such a modification alleviates the smoothness of the gradient noise, which yields better generalization performance. We theoretically prove the convergence of AdaL and demonstrate its effectiveness on several benchmarks.
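As an illustration of the amplify-early/shrink-late idea, the sketch below applies a decaying time-dependent scale to the raw gradient before an Adam-style update. The schedule (t0/t)**alpha, which amplifies gradients while t < t0 and shrinks them afterwards, is a stand-in for illustration, not the transformation defined in the paper.

```python
import numpy as np

def adal_like_step(param, grad, state, t, lr=1e-3, t0=100, alpha=0.5,
                   beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam-style step on a transformed gradient. `state` holds
    the first/second moment arrays 'm' and 'v' (initialized to zeros);
    t is the 1-based step count. The (t0/t)**alpha schedule is an
    illustrative assumption, not AdaL's exact transformation."""
    g = grad * (t0 / t) ** alpha              # transform the raw gradient
    state['m'] = beta1 * state['m'] + (1 - beta1) * g
    state['v'] = beta2 * state['v'] + (1 - beta2) * g * g
    m_hat = state['m'] / (1 - beta1 ** t)     # bias-corrected first moment
    v_hat = state['v'] / (1 - beta2 ** t)     # bias-corrected second moment
    return param - lr * m_hat / (np.sqrt(v_hat) + eps)
```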
ScoreGrad: Multivariate Probabilistic Time Series Forecasting with Continuous Energy-based Generative Models
Yan, Tijin, Zhang, Hongwei, Zhou, Tong, Zhan, Yufeng, Xia, Yuanqing
Multivariate time series prediction has attracted much attention because of its wide range of applications, such as intelligent transportation and AIOps. Generative models have achieved impressive results in time series modeling because they can model the data distribution and take noise into consideration. However, many existing works cannot be widely used because of constraints on the functional form of the generative model or sensitivity to hyperparameters. In this paper, we propose ScoreGrad, a multivariate probabilistic time series forecasting framework based on continuous energy-based generative models. ScoreGrad is composed of a time series feature extraction module and a conditional stochastic differential equation (SDE)-based score matching module. Prediction is achieved by iteratively solving the reverse-time SDE. To the best of our knowledge, ScoreGrad is the first continuous energy-based generative model used for time series forecasting. Furthermore, ScoreGrad achieves state-of-the-art results on six real-world datasets. The impact of hyperparameters and sampler types on performance is also explored. Code is available at https://github.com/yantijin/ScoreGradPred.
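A minimal sketch of the sampling stage, assuming a VP-type SDE: prediction draws samples by integrating the reverse-time SDE from t = 1 down to t = 0 with Euler-Maruyama steps driven by a trained score network. The VP-SDE choice, `score_fn`, `beta_fn`, and the step count are illustrative assumptions.

```python
import numpy as np

def reverse_sde_sample(score_fn, x_T, beta_fn, num_steps=1000):
    """Euler-Maruyama sampler for the reverse-time VP-SDE
        dx = [-beta(t)/2 * x - beta(t) * score(x, t)] dt + sqrt(beta(t)) dw,
    integrated backward from t = 1 to t = 0. `score_fn(x, t)` stands in
    for the trained conditional score network."""
    x = x_T                       # start from Gaussian noise at t = 1
    dt = 1.0 / num_steps
    for i in range(num_steps, 0, -1):
        t = i / num_steps
        beta = beta_fn(t)
        drift = -0.5 * beta * x - beta * score_fn(x, t)
        x = x - drift * dt + np.sqrt(beta * dt) * np.random.randn(*x.shape)
    return x
```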