Wang, Qiang
Learn by Reasoning: Analogical Weight Generation for Few-Shot Class-Incremental Learning
Han, Jizhou, Ding, Chenhao, He, Yuhang, Dong, Songlin, Wang, Qiang, Gao, Xinyuan, Gong, Yihong
Few-shot class-incremental learning (FSCIL) enables models to learn new classes from limited data while retaining performance on previously learned classes. Traditional FSCIL methods often require fine-tuning parameters with limited new class data and suffer from a separation between learning new classes and utilizing old knowledge. Inspired by the analogical learning mechanisms of the human brain, we propose a novel analogical generative method. Our approach includes the Brain-Inspired Analogical Generator (BiAG), which derives new class weights from existing classes without parameter fine-tuning during incremental stages. BiAG consists of three components: a Weight Self-Attention Module (WSA), a Weight & Prototype Analogical Attention Module (WPAA), and a Semantic Conversion Module (SCM). SCM uses Neural Collapse theory for semantic conversion, WSA supplements new class weights, and WPAA computes analogies to generate new class weights. Experiments on the miniImageNet, CUB-200, and CIFAR-100 datasets demonstrate that our method achieves higher final and average accuracy than SOTA methods.
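The abstract describes generating new-class classifier weights by attending over existing-class weights and prototypes rather than by fine-tuning. The sketch below is a rough, hypothetical illustration of that general idea only; the module layout, dimensions, and the simple query/key/value design are assumptions, not the actual WSA/WPAA/SCM components.

```python
# Hypothetical sketch: new-class prototypes attend over frozen old-class weights to
# produce new classifier weights without fine-tuning. Illustrative only, not BiAG.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AnalogicalWeightGenerator(nn.Module):
    def __init__(self, dim: int = 512):
        super().__init__()
        self.q = nn.Linear(dim, dim)   # projects new-class prototypes to queries
        self.k = nn.Linear(dim, dim)   # projects old-class weights to keys
        self.v = nn.Linear(dim, dim)   # projects old-class weights to values

    def forward(self, old_weights: torch.Tensor, new_prototypes: torch.Tensor) -> torch.Tensor:
        # old_weights: (num_old_classes, dim), new_prototypes: (num_new_classes, dim)
        attn = torch.softmax(
            self.q(new_prototypes) @ self.k(old_weights).T / old_weights.shape[1] ** 0.5, dim=-1
        )
        analogy = attn @ self.v(old_weights)                  # analogical component from old classes
        return F.normalize(new_prototypes + analogy, dim=-1)  # fused new-class weights

# Usage: derive weights for 5 new classes from 60 frozen old-class weights.
gen = AnalogicalWeightGenerator(dim=512)
w_new = gen(torch.randn(60, 512), torch.randn(5, 512))
print(w_new.shape)  # torch.Size([5, 512])
```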
CPRM: A LLM-based Continual Pre-training Framework for Relevance Modeling in Commercial Search
Wu, Kaixin, Ji, Yixin, Chen, Zeyuan, Wang, Qiang, Wang, Cunxiang, Liu, Hong, Ji, Baijun, Xu, Jia, Liu, Zhongyi, Gu, Jinjie, Zhou, Yuan, Mo, Linjian
Relevance modeling between queries and items stands as a pivotal component in commercial search engines, directly affecting the user experience. Given the remarkable achievements of large language models (LLMs) in various natural language processing (NLP) tasks, LLM-based relevance modeling is gradually being adopted within industrial search systems. Nevertheless, foundational LLMs lack domain-specific knowledge and do not fully exploit the potential of in-context learning. Furthermore, structured item text remains underutilized, and there is a shortage of corresponding queries and background knowledge. We thereby propose CPRM (Continual Pre-training for Relevance Modeling), a framework designed for the continual pre-training of LLMs to address these issues. Our CPRM framework includes three modules: 1) employing both queries and multi-field item texts for joint pre-training to enhance domain knowledge, 2) applying in-context pre-training, a novel approach where LLMs are pre-trained on a sequence of related queries or items, and 3) conducting reading comprehension on items to produce associated domain knowledge and background information (e.g., generating summaries and corresponding queries) to further strengthen LLMs. Results from offline experiments and online A/B testing demonstrate that our model achieves convincing performance compared to strong baselines.
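As a loose illustration of module 2), pre-training on sequences of related queries or items, the sketch below groups related texts and packs them into training sequences. The grouping key, separator, and length budget are hypothetical assumptions, not CPRM's actual data-construction recipe.

```python
# Hypothetical sketch of "in-context pre-training" data construction: related queries or
# items are concatenated into one sequence so the LM sees their co-occurrence.
from collections import defaultdict

def build_in_context_sequences(records, max_len=2048, sep=" [SEP] "):
    """records: list of dicts like {"group": category_id, "text": query_or_item_text}."""
    groups = defaultdict(list)
    for r in records:
        groups[r["group"]].append(r["text"])
    sequences = []
    for texts in groups.values():
        buf = ""
        for t in texts:
            if buf and len(buf) + len(sep) + len(t) > max_len:
                sequences.append(buf)
                buf = ""
            buf = (buf + sep + t) if buf else t
        if buf:
            sequences.append(buf)
    return sequences

print(build_in_context_sequences([
    {"group": "shoes", "text": "men running shoes"},
    {"group": "shoes", "text": "lightweight trail sneaker, size 42"},
    {"group": "phones", "text": "budget smartphone 128GB"},
]))
```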
SPTTE: A Spatiotemporal Probabilistic Framework for Travel Time Estimation
Xu, Chen, Wang, Qiang, Sun, Lijun
Accurate travel time estimation is essential for navigation and itinerary planning. While existing research employs probabilistic modeling to assess travel time uncertainty and account for correlations between multiple trips, modeling the temporal variability of multi-trip travel time distributions remains a significant challenge. Capturing the evolution of joint distributions requires large, well-organized datasets; however, real-world trip data are often temporally sparse and spatially unevenly distributed. To address this issue, we propose SPTTE, a spatiotemporal probabilistic framework that models the evolving joint distribution of multi-trip travel times by formulating the estimation task as a spatiotemporal stochastic process regression problem with fragmented observations. SPTTE incorporates an RNN-based temporal Gaussian process parameterization to regularize sparse observations and capture temporal dependencies. Additionally, it employs a prior-based heterogeneity smoothing strategy to correct unreliable learning caused by unevenly distributed trips, effectively modeling temporal variability under sparse and uneven data distributions. Evaluations on real-world datasets demonstrate that SPTTE outperforms state-of-the-art deterministic and probabilistic methods by over 10.13%. Ablation studies and visualizations further confirm the effectiveness of the model components.
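To make the idea of an RNN-based temporal Gaussian parameterization concrete, the sketch below lets a GRU evolve per-link states over time steps and decodes each state into a mean and a low-rank covariance factor. The architecture, dimensions, and names are illustrative assumptions, not SPTTE's actual design.

```python
# Hypothetical sketch: a GRU parameterizes a time-varying Gaussian over link travel times.
import torch
import torch.nn as nn

class TemporalLinkGaussian(nn.Module):
    def __init__(self, num_links: int, dim: int = 32, rank: int = 4):
        super().__init__()
        self.embed = nn.Embedding(num_links, dim)
        self.gru = nn.GRU(dim, dim, batch_first=True)
        self.mean_head = nn.Linear(dim, 1)
        self.factor_head = nn.Linear(dim, rank)

    def forward(self, steps: int):
        x = self.embed.weight.unsqueeze(1).repeat(1, steps, 1)  # (links, steps, dim)
        h, _ = self.gru(x)
        mean = self.mean_head(h).squeeze(-1)   # (links, steps): per-link mean travel time
        factor = self.factor_head(h)           # (links, steps, rank): low-rank covariance factors
        return mean, factor

model = TemporalLinkGaussian(num_links=100)
mean, factor = model(steps=12)
# Covariance among links at the last step: F F^T (low-rank, before adding a diagonal term).
cov_last = factor[:, -1] @ factor[:, -1].T
print(mean.shape, cov_last.shape)
```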
Domain Consistency Representation Learning for Lifelong Person Re-Identification
Liu, Shiben, Wang, Qiang, Fan, Huijie, Ren, Weihong, Fan, Baojie, Tang, Yandong
Lifelong person re-identification (LReID) exhibits a contradictory relationship between intra-domain discrimination and inter-domain gaps when learning from continuous data. Intra-domain discrimination focuses on individual nuances (e.g., clothing type and accessories), while inter-domain gaps emphasize domain consistency. Achieving a trade-off between maximizing intra-domain discrimination and minimizing inter-domain gaps is a crucial challenge for improving LReID performance. Most existing methods aim to reduce inter-domain gaps through knowledge distillation to maintain domain consistency, but they often ignore intra-domain discrimination. To address this challenge, we propose a novel domain consistency representation learning (DCR) model that explores global and attribute-wise representations as a bridge to balance intra-domain discrimination and inter-domain gaps. At the intra-domain level, we explore the complementary relationship between global and attribute-wise representations to improve discrimination among similar identities. However, excessive learning of intra-domain discrimination can lead to catastrophic forgetting. We therefore further develop an attribute-oriented anti-forgetting (AF) strategy that explores attribute-wise representations to enhance inter-domain consistency, and propose a knowledge consolidation (KC) strategy to facilitate knowledge transfer. Extensive experiments show that our DCR model achieves superior performance compared to state-of-the-art LReID methods. Our code will be available soon.
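As a heavily simplified, hypothetical sketch of the general idea of pairing a global embedding with attribute-wise embeddings while distilling against a frozen old model to limit forgetting, see the toy loss below. The attribute heads, the instance-contrastive term, and the loss weighting are illustrative assumptions, not DCR's actual objectives.

```python
# Toy sketch: combine global + attribute-wise embeddings for discrimination and add a
# distillation-style anti-forgetting term against a frozen old model. Illustrative only.
import torch
import torch.nn.functional as F

def dcr_style_loss(global_new, attr_new, global_old, attr_old, labels, weight=1.0):
    # global_*: (B, D); attr_*: (B, A, D) with A attribute heads (e.g. clothing, bag, hair).
    fused = torch.cat([global_new, attr_new.flatten(1)], dim=1)  # joint representation
    logits = fused @ fused.T                                     # toy instance-level similarity
    id_loss = F.cross_entropy(logits, labels)                    # discriminate instances in the batch
    # Anti-forgetting: keep new representations close to the frozen old model's.
    af_loss = F.mse_loss(attr_new, attr_old) + F.mse_loss(global_new, global_old)
    return id_loss + weight * af_loss

B, A, D = 8, 3, 128
loss = dcr_style_loss(torch.randn(B, D), torch.randn(B, A, D),
                      torch.randn(B, D), torch.randn(B, A, D),
                      labels=torch.arange(B))
print(loss.item())
```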
LOBG: Less Overfitting for Better Generalization in Vision-Language Model
Ding, Chenhao, Gao, Xinyuan, Dong, Songlin, He, Yuhang, Wang, Qiang, Kot, Alex, Gong, Yihong
Existing prompt learning methods in Vision-Language Models (VLMs) have effectively enhanced the transfer capability of VLMs to downstream tasks, but they suffer from a significant decline in generalization due to severe overfitting. To address this issue, we propose a framework named LOBG for vision-language models. Specifically, we use CLIP to filter out fine-grained foreground information that might cause overfitting, thereby guiding prompts with basic visual concepts. To further mitigate overfitting, we develop a structural topology preservation (STP) loss at the feature level, which endows the feature space with overall plasticity, allowing effective reshaping of the feature space during optimization. Additionally, we employ hierarchical logit distillation (HLD) to constrain outputs at the output level, complementing STP. Extensive experimental results demonstrate that our method significantly improves generalization capability and alleviates overfitting compared to state-of-the-art approaches.
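For intuition, the sketch below shows one common way a topology-preservation style loss and a logit-distillation term can be written: match the pairwise similarity structure of student features to a reference feature space, and distill temperature-softened logits. The exact STP and HLD formulations in the paper may differ; this is only an illustration of the general idea.

```python
# Hypothetical sketch of a topology-preservation loss plus logit distillation.
import torch
import torch.nn.functional as F

def topology_preservation_loss(student_feat, reference_feat):
    # Compare pairwise cosine-similarity matrices across the batch (relational structure).
    s = F.normalize(student_feat, dim=-1)
    r = F.normalize(reference_feat, dim=-1)
    return F.mse_loss(s @ s.T, r @ r.T)

def logit_distillation_loss(student_logits, teacher_logits, tau=2.0):
    # Standard KL divergence between temperature-softened distributions.
    return F.kl_div(F.log_softmax(student_logits / tau, dim=-1),
                    F.softmax(teacher_logits / tau, dim=-1),
                    reduction="batchmean") * tau * tau

feat_s, feat_t = torch.randn(16, 512), torch.randn(16, 512)
logit_s, logit_t = torch.randn(16, 100), torch.randn(16, 100)
total = topology_preservation_loss(feat_s, feat_t) + logit_distillation_loss(logit_s, logit_t)
print(total.item())
```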
FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression
Tang, Zhenheng, Kang, Xueze, Yin, Yiming, Pan, Xinglin, Wang, Yuxin, He, Xin, Wang, Qiang, Zeng, Rongfei, Zhao, Kaiyong, Shi, Shaohuai, Zhou, Amelie Chi, Li, Bo, He, Bingsheng, Chu, Xiaowen
To alleviate hardware scarcity in training large deep neural networks (DNNs), particularly large language models (LLMs), we present FusionLLM, a decentralized training system designed and implemented for training DNNs using geo-distributed GPUs across different computing clusters or individual devices. Decentralized training faces significant challenges regarding system design and efficiency, including: 1) the need for remote automatic differentiation (RAD), 2) support for flexible model definitions and heterogeneous software, 3) heterogeneous hardware leading to low resource utilization or the straggler problem, and 4) slow network communication. To address these challenges, in the system design, we represent the model as a directed acyclic graph of operators (OP-DAG). Each node in the DAG represents an operator in the DNN, while each edge represents the data dependency between operators. Based on this design, 1) users can customize any DNN without caring about low-level operator implementations; 2) we enable task scheduling with more fine-grained sub-tasks, offering more optimization space; and 3) a DAG runtime executor can implement RAD without requiring consistent low-level ML framework versions. To enhance system efficiency, we implement a workload estimator and design an OP-Fence scheduler to cluster devices with similar bandwidths together and partition the DAG to increase throughput. Additionally, we propose an AdaTopK compressor to adaptively compress intermediate activations and gradients at the slowest communication links. To evaluate the convergence and efficiency of our system and algorithms, we train ResNet-101 and GPT-2 on three real-world testbeds using 48 GPUs connected by networks ranging from 8 Mbps to 10 Gbps. Experimental results demonstrate that our system and method can achieve a 1.45-9.39x speedup compared to baseline methods while ensuring convergence.
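To illustrate the compression idea in isolation, the sketch below implements plain top-k sparsification of a tensor with a compression ratio chosen from link bandwidth. The bandwidth-to-ratio heuristic is an illustrative assumption, not FusionLLM's actual AdaTopK policy.

```python
# Hypothetical sketch of top-k compression for activations/gradients on slow links:
# keep only the k largest-magnitude entries and their indices.
import math
import torch

def topk_compress(tensor: torch.Tensor, ratio: float):
    flat = tensor.flatten()
    k = max(1, int(flat.numel() * ratio))
    _, indices = torch.topk(flat.abs(), k)
    return flat[indices], indices, tensor.shape  # send values + indices instead of the dense tensor

def topk_decompress(values, indices, shape):
    flat = torch.zeros(math.prod(shape), dtype=values.dtype)
    flat[indices] = values
    return flat.reshape(shape)

def adaptive_ratio(bandwidth_mbps: float, slowest_mbps: float = 8.0, base: float = 0.01):
    # Illustrative heuristic: compress most aggressively on the slowest links.
    return min(1.0, base * max(1.0, bandwidth_mbps / slowest_mbps))

grad = torch.randn(1024, 1024)
vals, idx, shape = topk_compress(grad, adaptive_ratio(bandwidth_mbps=8.0))
print(topk_decompress(vals, idx, shape).shape)  # torch.Size([1024, 1024])
```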
TAS: Distilling Arbitrary Teacher and Student via a Hybrid Assistant
Li, Guopeng, Wang, Qiang, Yan, Ke, Ding, Shouhong, Gao, Yuan, Xia, Gui-Song
Most knowledge distillation (KD) methodologies predominantly focus on teacher-student pairs with similar architectures, such as both being convolutional neural networks (CNNs). However, the potential and flexibility of KD can be greatly improved by expanding it to novel Cross-Architecture KD (CAKD), where the knowledge of homogeneous and heterogeneous teachers can be transferred flexibly to a given student. The primary challenge in CAKD lies in the substantial feature gaps between heterogeneous models, originating from the distinction of their inherent inductive biases and module functions. To this end, we introduce an assistant model as a bridge to facilitate smooth feature knowledge transfer between heterogeneous teachers and students. More importantly, within our proposed design principle, the assistant model combines the advantages of cross-architecture inductive biases and module functions by merging convolution and attention modules derived from both student and teacher module functions. Furthermore, we observe that heterogeneous features exhibit diverse spatial distributions in CAKD, hindering the effectiveness of conventional pixel-wise mean squared error (MSE) loss. Therefore, we leverage a spatial-agnostic InfoNCE loss to align features after spatial smoothing, thereby improving the feature alignment in CAKD. Our proposed method is evaluated across homogeneous model pairs and arbitrary heterogeneous combinations of CNNs, ViTs, and MLPs, achieving state-of-the-art performance for distilled models with a maximum gain of 11.47% on CIFAR-100 and 3.67% on ImageNet-1K. Our code and models will be released. Knowledge Distillation (KD) (Hinton et al., 2015; Romero et al., 2015) has been demonstrated to be a powerful method for transferring knowledge from a pre-trained and cumbersome teacher model to a compact and efficient student model. Compared to a model trained from scratch, the performance of a student model distilled by appropriate teachers usually improves significantly.
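A minimal sketch of what a spatial-agnostic InfoNCE alignment can look like is given below: feature maps are spatially pooled first, then matching (student, teacher) samples act as positives and the other samples in the batch as negatives. The pooling choice and temperature are assumptions, and the sketch assumes features are already projected to a common channel dimension; it is not the paper's exact loss.

```python
# Hypothetical sketch of spatial-agnostic InfoNCE alignment between feature maps.
import torch
import torch.nn.functional as F

def spatial_agnostic_infonce(student_map, teacher_map, tau=0.1):
    # student_map / teacher_map: (B, C, H, W); global average pooling removes spatial layout.
    # Assumes both maps share the channel dimension C (e.g. via a projector).
    s = F.normalize(student_map.mean(dim=(2, 3)), dim=-1)  # (B, C)
    t = F.normalize(teacher_map.mean(dim=(2, 3)), dim=-1)  # (B, C)
    logits = s @ t.T / tau                                   # (B, B) similarity matrix
    targets = torch.arange(s.shape[0])                       # i-th student matches i-th teacher
    return F.cross_entropy(logits, targets)

loss = spatial_agnostic_infonce(torch.randn(8, 256, 7, 7), torch.randn(8, 256, 7, 7))
print(loss.item())
```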
LPZero: Language Model Zero-cost Proxy Search from Zero
Dong, Peijie, Li, Lujun, Liu, Xiang, Tang, Zhenheng, Liu, Xuebo, Wang, Qiang, Chu, Xiaowen
Despite its outstanding performance, Neural Architecture Search (NAS) is criticized for its massive computational cost. Recently, Zero-shot NAS has emerged as a promising approach by exploiting Zero-cost (ZC) proxies, which markedly reduce computational demands. Despite this, existing ZC proxies heavily rely on expert knowledge and incur significant trial-and-error costs. Particularly in NLP tasks, most existing ZC proxies fail to surpass the performance of the naive baseline. To address these challenges, we introduce a novel framework, LPZero, which is the first to automatically design ZC proxies for various tasks, achieving higher ranking consistency than human-designed proxies. Specifically, we model the ZC proxy as a symbolic equation and incorporate a unified proxy search space that encompasses existing ZC proxies, which are composed of a predefined set of mathematical symbols. To heuristically search for the best ZC proxy, LPZero incorporates genetic programming to find the optimal symbolic composition. We propose a Rule-based Pruning Strategy (RPS), which preemptively eliminates unpromising proxies, thereby mitigating the risk of proxy degradation. Extensive experiments on FlexiBERT, GPT-2, and LLaMA-7B demonstrate LPZero's superior ranking ability and performance on downstream tasks compared to current approaches.
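To give a feel for what a "symbolic" zero-cost proxy looks like, the sketch below scores a network with a single forward/backward pass using an expression built from primitive statistics of weights and gradients. The primitive set and the example expression (sum of |w * g|, in the spirit of snip/synflow-style proxies) are illustrative assumptions, not LPZero's actual search space or operators.

```python
# Hypothetical sketch: evaluate a candidate symbolic zero-cost proxy without training.
import torch
import torch.nn as nn

PRIMITIVES = {
    "abs": torch.abs,
    "log1p": torch.log1p,
    "sum": torch.sum,
    "mean": torch.mean,
}

def score_proxy(model, loss_fn, batch, expression):
    """expression: callable mapping (weight, grad) tensors to a scalar contribution."""
    model.zero_grad()
    x, y = batch
    loss_fn(model(x), y).backward()  # one forward/backward pass, no training
    total = 0.0
    for p in model.parameters():
        if p.grad is not None:
            total += expression(p.detach(), p.grad.detach()).item()
    return total

# Example candidate expression a search procedure might propose: sum(|w * g|).
candidate = lambda w, g: PRIMITIVES["sum"](PRIMITIVES["abs"](w * g))

net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
batch = (torch.randn(8, 16), torch.randint(0, 4, (8,)))
print(score_proxy(net, nn.CrossEntropyLoss(), batch, candidate))
```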
Link Representation Learning for Probabilistic Travel Time Estimation
Xu, Chen, Wang, Qiang, Sun, Lijun
Travel time estimation is a crucial application in navigation apps and web mapping services. Current deterministic and probabilistic methods primarily focus on modeling individual trips, assuming independence among trips. However, in real-world scenarios, we often observe strong inter-trip correlations due to factors such as weather conditions, traffic management, and road works. In this paper, we propose to model trip-level link travel time using a Gaussian hierarchical model, which can characterize both inter-trip and intra-trip correlations. The joint distribution of travel time of multiple trips becomes a multivariate Gaussian parameterized by learnable link representations. To effectively use the sparse GPS trajectories, we also propose a data augmentation method based on trip sub-sampling, which allows for fine-grained gradient backpropagation in learning link representations. During inference, we estimate the probability distribution of the travel time of a queried trip conditional on the completed trips that are spatiotemporally adjacent. We refer to the overall framework as ProbTTE. We evaluate ProbTTE on two real-world GPS trajectory datasets, and the results demonstrate its superior performance compared to state-of-the-art deterministic and probabilistic baselines. Additionally, we find that the learned link representations align well with the physical geometry of the network, making them suitable as input for other applications.
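The following sketch illustrates, in a highly simplified and hypothetical form, the core idea of a Gaussian model over link travel times parameterized by learnable link representations: embeddings yield per-link means and low-rank covariance factors, and a trip's travel time, as the sum of its link times, is again Gaussian. The embedding sizes and low-rank construction are assumptions, not ProbTTE's actual model.

```python
# Hypothetical sketch: link-level Gaussian travel times from learnable link embeddings.
import torch
import torch.nn as nn

class GaussianLinkTimes(nn.Module):
    def __init__(self, num_links: int, dim: int = 16, rank: int = 4):
        super().__init__()
        self.emb = nn.Embedding(num_links, dim)
        self.mean_head = nn.Linear(dim, 1)
        self.factor_head = nn.Linear(dim, rank)

    def trip_distribution(self, link_ids: torch.Tensor):
        e = self.emb(link_ids)                          # (L, dim) for the L links of a trip
        mu = self.mean_head(e).squeeze(-1)              # (L,) per-link mean travel time
        f = self.factor_head(e)                         # (L, rank) covariance factors
        cov = f @ f.T + 0.1 * torch.eye(len(link_ids))  # inter-link covariance (low-rank + diag)
        # Trip time = sum of link times -> univariate Gaussian (mean and variance below).
        return mu.sum(), cov.sum()

model = GaussianLinkTimes(num_links=500)
mean, var = model.trip_distribution(torch.tensor([3, 17, 42, 41]))
print(float(mean), float(var))
```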
Is Your HD Map Constructor Reliable under Sensor Corruptions?
Hao, Xiaoshuai, Wei, Mengchuan, Yang, Yifan, Zhao, Haimei, Zhang, Hui, Zhou, Yi, Wang, Qiang, Li, Weiming, Kong, Lingdong, Zhang, Jing
Driving systems often rely on high-definition (HD) maps for precise environmental information, which is crucial for planning and navigation. While current HD map constructors perform well under ideal conditions, their resilience to real-world challenges, e.g., adverse weather and sensor failures, is not well understood, raising safety concerns. This work introduces MapBench, the first comprehensive benchmark designed to evaluate the robustness of HD map construction methods against various sensor corruptions. Our benchmark encompasses a total of 29 types of corruptions affecting camera and LiDAR sensors. Extensive evaluations across 31 HD map constructors reveal significant performance degradation of existing methods under adverse weather conditions and sensor failures, underscoring critical safety concerns. We identify effective strategies for enhancing robustness, including innovative approaches that leverage multi-modal fusion, advanced data augmentation, and architectural techniques. These insights provide a pathway for developing more reliable HD map construction methods, which are essential for the advancement of autonomous driving technology. The benchmark toolkit and affiliated code and model checkpoints have been made publicly accessible.