AITopics | Wang, Dequan

Collaborating Authors

Wang, Dequan

MedForge: Building Medical Foundation Models Like Open Source Software Development

Tan, Zheling, Ding, Kexin, Gao, Jin, Zhou, Mu, Metaxas, Dimitris, Zhang, Shaoting, Wang, Dequan

arXiv.org Artificial Intelligence Feb-21-2025

Foundation models (FMs) have made significant strides in the healthcare domain. Yet data silos and privacy concerns persist in healthcare systems, hindering safe medical data sharing and collaborative model development among institutions. The collection and curation of scalable clinical datasets has increasingly become the bottleneck for training strong FMs. In this study, we propose Medical Foundation Models Merging (MedForge), a cooperative framework that enables community-driven medical foundation model development while preventing leakage of raw patient data and mitigating synchronization issues in model development across clinical institutions. MedForge offers a bottom-up model construction mechanism by flexibly merging task-specific Low-Rank Adaptation (LoRA) modules, which can adapt to downstream tasks while retaining original model parameters. Through an asynchronous LoRA module integration scheme, the resulting composite model can progressively enhance its comprehensive performance on various clinical tasks. MedForge shows strong performance on multiple clinical datasets (e.g., breast cancer, lung cancer, and colon cancer) collected from different institutions. Our major findings highlight the value of collaborative foundation models in advancing multi-center clinical collaboration effectively and cohesively. Our code is publicly available at https://github.com/TanZheling/MedForge.
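
The idea of merging task-specific LoRA modules into a shared base model can be illustrated with a minimal sketch. It assumes each module is a pair of low-rank matrices (B, A) and that merging is a plain weighted sum of the products B·A added to a frozen base weight. The function names `matmul` and `merge_lora` and the mixing weights are hypothetical, and this is not claimed to be the merging scheme the MedForge authors use.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply an (m x k) matrix by a (k x n) matrix."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(a[i][t] * b[t][j] for t in range(inner)) for j in range(cols)]
        for i in range(len(a))
    ]


def merge_lora(
    base: Matrix,
    modules: Sequence[Tuple[Matrix, Matrix]],
    weights: Sequence[float],
) -> Matrix:
    """Return base + sum_i weights[i] * (B_i @ A_i).

    Assumption: a simple weighted sum of low-rank updates.
    The base matrix itself is never modified.
    """
    merged = [row[:] for row in base]  # copy so the base stays frozen
    for (b_mat, a_mat), w in zip(modules, weights):
        delta = matmul(b_mat, a_mat)  # full-size update from a rank-r pair
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += w * value
    return merged


# Two hypothetical rank-1 modules, e.g. contributed by different institutions.
base = [[1.0, 0.0], [0.0, 1.0]]
module_1 = ([[1.0], [0.0]], [[0.0, 2.0]])  # (B, A) with B: 2x1, A: 1x2
module_2 = ([[0.0], [1.0]], [[4.0, 0.0]])
merged = merge_lora(base, [module_1, module_2], [0.5, 0.5])
```

Because only the (B, A) pairs are exchanged in this sketch, a new module can be folded in later by calling `merge_lora` again on the same base, without any contributor sharing raw training data.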

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2502.16055

Country:

Asia (0.28)
North America > United States (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Software (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Security & Privacy (1.00)
(5 more...)

Intelligent Computing Social Modeling and Methodological Innovations in Political Science in the Era of Large Language Models

Wang, Zhenyu, Xu, Yi, Wang, Dequan, Zhou, Lingfeng, Zhou, Yiqi

arXiv.org Artificial Intelligence Oct-7-2024

The recent wave of artificial intelligence, epitomized by large language models (LLMs), has presented opportunities and challenges for methodological innovation in political science, sparking discussions on a potential paradigm shift in the social sciences. However, how can we understand the impact of LLMs on knowledge production and paradigm transformation in the social sciences from a comprehensive perspective that integrates technology and methodology? What are LLMs' specific applications and representative innovative methods in political science research? These questions, particularly from a practical methodological standpoint, remain underexplored. This paper proposes the "Intelligent Computing Social Modeling" (ICSM) method to address these issues by clarifying the critical mechanisms of LLMs. ICSM leverages the strengths of LLMs in idea synthesis and action simulation, advancing intellectual exploration in political science through "simulated social construction" and "simulation validation." By simulating the U.S. presidential election, this study empirically demonstrates the operational pathways and methodological advantages of ICSM. By integrating traditional social science paradigms, ICSM not only enhances the quantitative paradigm's capability to apply big data to assess the impact of factors but also provides qualitative paradigms with evidence for social mechanism discovery at the individual level, offering a powerful tool that balances interpretability and predictability in social science research. The findings suggest that LLMs will drive methodological innovation in political science through integration and improvement rather than direct substitution.

large language model, machine learning, simulation, (18 more...)

arXiv.org Artificial Intelligence

2410.16301

Country: North America > United States (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)
Research Report > Promising Solution (0.87)

Industry:

Government > Voting & Elections (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Direct Preference Knowledge Distillation for Large Language Models

Li, Yixing, Gu, Yuxian, Dong, Li, Wang, Dequan, Cheng, Yu, Wei, Furu

arXiv.org Artificial Intelligence Jun-28-2024

In the field of large language models (LLMs), Knowledge Distillation (KD) is a critical technique for transferring capabilities from teacher models to student models. However, existing KD methods face limitations and challenges in the distillation of LLMs, including efficiency and the insufficient measurement capabilities of traditional KL divergence. It is shown that LLMs can serve as an implicit reward function, which we define as a supplement to KL divergence. In this work, we propose Direct Preference Knowledge Distillation (DPKD) for LLMs. DPKD utilizes distribution divergence to represent the preference loss and implicit reward function. We re-formulate KD of LLMs into two stages: first optimizing an objective consisting of implicit reward and reverse KL divergence, and then improving the preference probability of teacher outputs over student outputs. We conducted experiments and analysis on various datasets with LLM parameters ranging from 120M to 13B and demonstrate the broad applicability and effectiveness of our DPKD approach. Meanwhile, we prove the value and effectiveness of the introduced implicit reward and output preference in KD through experiments and theoretical analysis. The DPKD method outperforms the baseline method in both output response precision and exact match percentage. Code and data are available at https://aka.ms/dpkd.
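
The difference between forward and reverse KL divergence mentioned in the abstract can be made concrete with a small sketch over discrete next-token distributions. The helper names and the toy probabilities are hypothetical, and the sketch covers only the divergence terms, not the implicit reward or the preference objective of DPKD.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete distributions over the same support."""
    # Terms with p_i == 0 contribute nothing; q_i is clamped to avoid log(0).
    return sum(
        pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0
    )


def forward_kl(teacher: Sequence[float], student: Sequence[float]) -> float:
    """KL(teacher || student): the expectation is taken under the teacher."""
    return kl_divergence(teacher, student)


def reverse_kl(teacher: Sequence[float], student: Sequence[float]) -> float:
    """KL(student || teacher): the expectation is taken under the student."""
    return kl_divergence(student, teacher)


# Toy next-token distributions over a three-token vocabulary (made up).
teacher_probs = [0.7, 0.2, 0.1]
student_probs = [0.4, 0.4, 0.2]
fwd = forward_kl(teacher_probs, student_probs)
rev = reverse_kl(teacher_probs, student_probs)
```

The two values differ because KL divergence is asymmetric: the forward form weights mismatches by teacher probability, while the reverse form weights them by student probability.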

artificial intelligence, large language model, natural language, (15 more...)

arXiv.org Artificial Intelligence

2406.19774

Country:

North America (0.46)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

mmPlace: Robust Place Recognition with Intermediate Frequency Signal of Low-cost Single-chip Millimeter Wave Radar

Meng, Chengzhen, Duan, Yifan, He, Chenming, Wang, Dequan, Fan, Xiaoran, Zhang, Yanyong

arXiv.org Artificial Intelligence Mar-7-2024

Place recognition is crucial for tasks like loop-closure detection and re-localization. Single-chip millimeter wave radar (single-chip radar for short) emerges as a low-cost sensor option for place recognition, with the advantage of insensitivity to degraded visual environments. However, it encounters two challenges. First, the sparse point cloud from single-chip radar leads to poor performance when using current place recognition methods, which assume much denser data. Second, its performance declines significantly in scenarios involving rotational and lateral variations, due to limited overlap in its field of view (FOV). We propose mmPlace, a robust place recognition system to address these challenges. Specifically, mmPlace transforms the intermediate frequency (IF) signal into a range-azimuth heatmap and employs a spatial encoder to extract features. Additionally, to improve the performance in scenarios involving rotational and lateral variations, mmPlace employs a rotating platform and concatenates heatmaps in a rotation cycle, effectively expanding the system's FOV. We evaluate mmPlace's performance on the milliSonic dataset, which is collected on the University of Science and Technology of China (USTC) campus, the city roads surrounding the campus, and an underground parking garage. The results demonstrate that mmPlace outperforms point cloud-based methods and achieves 87.37% recall@1 in scenarios involving rotational and lateral variations.
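
The recall@1 metric quoted above can be sketched as follows. The sketch assumes each query and database entry is a (descriptor, place label) pair and that a retrieval counts as correct when the nearest database descriptor carries the same label. The function name, the Euclidean distance, and the toy data are hypothetical; the paper's evaluation protocol may define a correct match differently, for example by a distance threshold on ground-truth positions.

```python
import math
from typing import List, Sequence, Tuple

Entry = Tuple[Sequence[float], str]  # (descriptor vector, place label)


def recall_at_1(queries: List[Entry], database: List[Entry]) -> float:
    """Fraction of queries whose nearest database descriptor has the same label."""
    hits = 0
    for q_desc, q_place in queries:
        # Nearest neighbour by Euclidean distance in descriptor space.
        nearest = min(database, key=lambda item: math.dist(q_desc, item[0]))
        if nearest[1] == q_place:
            hits += 1
    return hits / len(queries)


# Toy two-dimensional descriptors for two places (made up).
database = [([0.0, 0.0], "garage"), ([10.0, 10.0], "campus")]
queries = [
    ([1.0, 0.0], "garage"),  # nearest is "garage": hit
    ([9.0, 9.0], "campus"),  # nearest is "campus": hit
    ([1.0, 1.0], "campus"),  # nearest is "garage": miss
]
score = recall_at_1(queries, database)
```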

artificial intelligence, heatmap, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2403.04703

Country: Asia > China (0.34)

Genre: Research Report (0.84)

Industry:

Information Technology > Services (0.35)
Transportation > Ground > Road (0.35)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Robots (0.71)

Data-Centric Foundation Models in Computational Healthcare: A Survey

Zhang, Yunkun, Gao, Jin, Tan, Zheling, Zhou, Lingfeng, Ding, Kexin, Zhou, Mu, Zhang, Shaoting, Wang, Dequan

arXiv.org Artificial Intelligence Jan-4-2024

In computational healthcare [3, 72], FMs can handle a variety of clinical data with their appealing capabilities in logical reasoning and semantic understanding. Examples span fields in medical conversation [241, 316], patient health profiling [48], and treatment planning [192]. Moreover, given their strength in large-scale data processing, FMs offer a shifting paradigm to assess real-world clinical data in the healthcare workflow rapidly and effectively [208, 261]. FM research places a sharp focus on the data-centric perspective [318]. First, FMs demonstrate the power of scale, where the enlarged model and data size permit FMs to capture vast amounts of information, thus increasing the pressing need for training data quantity [272]. Second, FMs encourage homogenization [21], as evidenced by their extensive adaptability to downstream tasks. High-quality data for FM training thus becomes critical since it can impact the performance of both pre-trained FMs and downstream models. Therefore, addressing key data challenges is progressively recognized as a research priority.

bioinformatics, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2401.02458

Country:

Europe (1.00)
North America > United States > North Carolina (0.14)
Asia > Middle East > Republic of Türkiye (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
(6 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Information Management (1.00)
(8 more...)

BayesDiff: Estimating Pixel-wise Uncertainty in Diffusion via Bayesian Inference

Kou, Siqi, Gan, Lei, Wang, Dequan, Li, Chongxuan, Deng, Zhijie

arXiv.org Artificial Intelligence Oct-17-2023

Diffusion models have impressive image generation capability, but low-quality generations still exist, and their identification remains challenging due to the lack of a proper sample-wise metric. To address this, we propose BayesDiff, a pixel-wise uncertainty estimator for generations from diffusion models based on Bayesian inference. In particular, we derive a novel uncertainty iteration principle to characterize the uncertainty dynamics in diffusion, and leverage the last-layer Laplace approximation for efficient Bayesian inference. The estimated pixel-wise uncertainty can not only be aggregated into a sample-wise metric to filter out low-fidelity images but also aid in augmenting successful generations and rectifying artifacts in failed generations in text-to-image tasks. Extensive experiments demonstrate the efficacy of BayesDiff and its promise for practical applications.
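
Aggregating pixel-wise uncertainty into a sample-wise score and using it to filter generations can be sketched as below. The sketch assumes the per-pixel variance maps are already available and uses a plain sum as the aggregate. The function names, the sum aggregation, and the keep fraction are assumptions for illustration, not a statement of how BayesDiff computes or thresholds its metric.

```python
from typing import List

VarianceMap = List[List[float]]  # per-pixel variance for one generated image


def sample_uncertainty(variance_map: VarianceMap) -> float:
    """Aggregate a pixel-wise variance map into one score (assumed: plain sum)."""
    return sum(sum(row) for row in variance_map)


def filter_low_fidelity(variance_maps: List[VarianceMap], keep_fraction: float) -> List[int]:
    """Return indices of the images with the lowest aggregated uncertainty."""
    scores = [sample_uncertainty(m) for m in variance_maps]
    order = sorted(range(len(scores)), key=scores.__getitem__)  # most certain first
    n_keep = max(1, int(len(order) * keep_fraction))
    return sorted(order[:n_keep])


# Four toy 2x2 variance maps (made up); images 1 and 3 are the most uncertain.
maps = [
    [[0.1, 0.1], [0.1, 0.1]],
    [[0.9, 0.8], [0.7, 0.9]],
    [[0.2, 0.1], [0.1, 0.2]],
    [[0.5, 0.5], [0.5, 0.5]],
]
kept = filter_low_fidelity(maps, keep_fraction=0.5)
```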

artificial intelligence, bayesian inference, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2310.11142

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.34)
Health & Medicine > Therapeutic Area > Immunology (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Text-guided Foundation Model Adaptation for Pathological Image Classification

Zhang, Yunkun, Gao, Jin, Zhou, Mu, Wang, Xiaosong, Qiao, Yu, Zhang, Shaoting, Wang, Dequan

arXiv.org Artificial Intelligence Jul-27-2023

The recent surge of foundation models in computer vision and natural language processing opens up perspectives in utilizing multi-modal clinical data to train large models with strong generalizability. Yet pathological image datasets often lack biomedical text annotation and enrichment. Guiding data-efficient image diagnosis with biomedical text knowledge is therefore of substantial interest. In this paper, we propose to Connect Image and Text Embeddings (CITE) to enhance pathological image classification. CITE injects text insights gained from language models pre-trained with a broad range of biomedical texts, adapting foundation models toward pathological image understanding. Through extensive experiments on the PatchGastric stomach tumor pathological image dataset, we demonstrate that CITE achieves leading performance compared with various baselines, especially when training data is scarce. CITE offers insights into leveraging in-domain text knowledge to reinforce data-efficient pathological image classification. Code is available at https://github.com/Yunkun-Zhang/CITE.
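
Connecting image and text embeddings for classification can be illustrated with a minimal sketch: an image embedding is compared with one text embedding per class, and the class with the highest cosine similarity wins. The function names, class names, and vectors are hypothetical, and the sketch omits everything CITE actually trains (encoders, projections, prompts).

```python
import math
from typing import Dict, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def classify(image_embedding: Sequence[float], class_text_embeddings: Dict[str, Sequence[float]]) -> str:
    """Pick the class whose text embedding is most similar to the image embedding."""
    return max(
        class_text_embeddings,
        key=lambda name: cosine(image_embedding, class_text_embeddings[name]),
    )


# Toy embeddings in a shared 2-D space (made up for illustration).
text_embeddings = {"tumor": [1.0, 0.0], "normal": [0.0, 1.0]}
prediction = classify([0.9, 0.2], text_embeddings)
```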

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2307.14901

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Back to the Source: Diffusion-Driven Test-Time Adaptation

Gao, Jin, Zhang, Jialing, Liu, Xihui, Darrell, Trevor, Shelhamer, Evan, Wang, Dequan

arXiv.org Artificial Intelligence Jun-21-2023

Test-time adaptation harnesses test inputs to improve the accuracy of a model trained on source data when tested on shifted target data. Existing methods update the source model by (re-)training on each target domain. While effective, re-training is sensitive to the amount and order of the data and the hyperparameters for optimization. We instead update the target data, by projecting all test inputs toward the source domain with a generative diffusion model. Our diffusion-driven adaptation method, DDA, shares its models for classification and generation across all domains. Both models are trained on the source domain, then fixed during testing. We augment diffusion with image guidance and self-ensembling to automatically decide how much to adapt. Input adaptation by DDA is more robust than prior model adaptation approaches across a variety of corruptions, architectures, and data regimes on the ImageNet-C benchmark. With its input-wise updates, DDA succeeds where model adaptation degrades on too little data in small batches, dependent data in non-uniform order, or mixed data with multiple corruptions.
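
One simple way to picture self-ensembling is to average the classifier's class probabilities on the original test input and on its diffusion-projected version, then predict from the average. The sketch below assumes both probability vectors are already computed; the function name and the equal-weight average are assumptions, not a description of DDA's exact ensembling rule.

```python
from typing import List, Sequence, Tuple


def self_ensemble(
    probs_original: Sequence[float],
    probs_adapted: Sequence[float],
) -> Tuple[int, List[float]]:
    """Average two class-probability vectors and return (predicted class, average)."""
    averaged = [(a + b) / 2 for a, b in zip(probs_original, probs_adapted)]
    return averaged.index(max(averaged)), averaged


# Toy case: the classifier is unsure on the corrupted input but
# confident on the projected one, so the ensemble follows the latter.
label, averaged = self_ensemble([0.6, 0.4], [0.1, 0.9])
```

When the projection changes little, the two vectors agree and the average leaves the prediction unchanged, which is the sense in which an ensemble of this kind can limit how much adaptation affects the output.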

adaptation, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2207.03442

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Towards General Purpose Medical AI: Continual Learning Medical Foundation Model

Yi, Huahui, Qin, Ziyuan, Lao, Qicheng, Xu, Wei, Jiang, Zekun, Wang, Dequan, Zhang, Shaoting, Li, Kang

arXiv.org Artificial Intelligence Mar-12-2023

Inevitable domain and task discrepancies in real-world scenarios can impair the generalization performance of pre-trained deep models for medical data. Therefore, we audaciously propose that we should build a general-purpose medical AI system that can be seamlessly adapted to downstream domains/tasks. Since domain/task adaptation procedures usually involve additional labeling work for the target data, a data-efficient adaptation algorithm is desirable to reduce the cost of transferring the learned knowledge. Our recent work found that vision-language models (VLMs) are efficient learners with extraordinary cross-domain ability. Therefore, in this work, we further explore the possibility of leveraging pre-trained VLMs as medical foundation models for building general-purpose medical AI, where we thoroughly investigate three machine-learning paradigms, i.e., domain/task-specialized learning, joint learning, and continual learning, for training the VLMs and evaluate their generalization performance on cross-domain and cross-task test sets. To alleviate catastrophic forgetting during sequential training, we employ rehearsal learning and observe a sharp boost in generalization capability. In a nutshell, our empirical evidence suggests that continual learning may be a practical and efficient learning paradigm for the medical foundation model. We hope researchers can use our empirical evidence as a basis to further explore the path toward medical foundation models.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2303.0658

Country: Asia > China (0.47)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.94)
Health & Medicine > Therapeutic Area > Neurology (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

GACT: Activation Compressed Training for Generic Network Architectures

Liu, Xiaoxuan, Zheng, Lianmin, Wang, Dequan, Cen, Yukuo, Chen, Weize, Han, Xu, Chen, Jianfei, Liu, Zhiyuan, Tang, Jie, Gonzalez, Joey, Mahoney, Michael, Cheung, Alvin

arXiv.org Artificial Intelligence Sep-3-2022

Training large neural network (NN) models requires extensive memory resources, and Activation Compressed Training (ACT) is a promising approach to reduce training memory footprint. This paper presents GACT, an ACT framework to support a broad range of machine learning tasks for generic NN architectures with limited domain knowledge. By analyzing a linearized version of ACT's approximate gradient, we prove the convergence of GACT without prior knowledge of operator type or model architecture. To make training stable, we propose an algorithm that decides the compression ratio for each tensor by estimating its impact on the gradient at run time. We implement GACT as a PyTorch library that readily applies to any NN architecture. GACT reduces the activation memory for convolutional NNs, transformers, and graph NNs by up to 8.1x, enabling training with a 4.2x to 24.7x larger batch size, with negligible accuracy loss. The library is available at https://github.com/LiuXiaoxuanPKU/GACT-ICML.
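
The basic trade-off behind compressing saved activations can be sketched with a fixed-bit uniform quantizer: activations are stored as small integer codes during the forward pass and approximately restored when the backward pass needs them. The function names and the fixed bit width are assumptions for illustration. GACT itself chooses a compression ratio per tensor at run time, which this sketch does not model.

```python
from typing import List, Sequence, Tuple


def quantize(values: Sequence[float], bits: int) -> Tuple[List[int], float, float]:
    """Map floats to integer codes in [0, 2**bits - 1] using min-max scaling."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0  # guard against constant tensors
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize(codes: Sequence[int], lo: float, scale: float) -> List[float]:
    """Approximately restore the original values from their integer codes."""
    return [lo + c * scale for c in codes]


# Toy activations stored with 2 bits per value instead of a full float each.
activations = [0.0, 0.1, 0.5, 0.9, 1.0]
codes, lo, scale = quantize(activations, bits=2)
restored = dequantize(codes, lo, scale)
max_error = max(abs(a - r) for a, r in zip(activations, restored))
```

With this rounding scheme the reconstruction error of each value is at most half a quantization step, so fewer bits save more memory at the price of a coarser gradient approximation.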

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2206.11357

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
