Lin, Weixiong
Squeeze Out Tokens from Sample for Finer-Grained Data Governance
Lin, Weixiong, Ju, Chen, Wang, Haicheng, Hu, Shengchao, Xiao, Shuai, Chen, Mengting, Jiao, Yuheng, Yao, Mingshuai, Lan, Jinsong, Liu, Qingwen, Chen, Ying
Widely observed data scaling laws, in which error falls off as a power of the training set size, demonstrate the diminishing returns of unselective data expansion. Hence, data governance has been proposed to downsize datasets by pruning non-informative samples. Yet, isolating the impact of a specific sample on overall model performance is challenging, due to the vast computation required to try out all sample combinations. Current data governors circumvent this complexity by estimating sample contributions through heuristic-derived scalar scores and discarding low-value samples. Despite thorough sample sieving, the retained samples still intrinsically contain substantial undesired tokens, underscoring the potential for further compression and purification. In this work, we upgrade data governance from a 'sieving' approach to a 'juicing' one. Instead of scanning for the least-flawed samples, our dual-branch DataJuicer applies finer-grained intra-sample governance: it squeezes out informative tokens and boosts image-text alignment. Specifically, the vision branch retains salient image patches and extracts relevant object classes, while the text branch incorporates these classes to enhance captions. Consequently, DataJuicer yields more refined datasets through finer-grained governance. Extensive experiments across datasets demonstrate that DataJuicer significantly outperforms existing DataSieve methods in image-text retrieval, classification, and dense visual reasoning.
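To make the 'juicing' mechanism concrete, below is a minimal, hypothetical sketch of the dual-branch step: keeping the most salient image patches and appending detected object classes to the caption. The saliency scores, class labels, and keep ratio are placeholders, not DataJuicer's actual components.

```python
# Illustrative sketch of the dual-branch "juicing" idea described in the abstract.
# Patch-saliency scoring, class extraction, and caption enhancement are placeholders.
import torch

def squeeze_sample(patch_feats: torch.Tensor,    # (N, D) image patch features
                   patch_scores: torch.Tensor,   # (N,) saliency score per patch
                   caption: str,
                   detected_classes: list[str],
                   keep_ratio: float = 0.5):
    """Keep the most salient patches and enrich the caption with object classes."""
    k = max(1, int(keep_ratio * patch_feats.size(0)))
    keep_idx = patch_scores.topk(k).indices      # vision branch: retain salient patches
    kept_patches = patch_feats[keep_idx]
    enhanced_caption = caption + " Objects present: " + ", ".join(detected_classes) + "."  # text branch
    return kept_patches, enhanced_caption

# Toy usage with random features and hypothetical class labels.
patches = torch.randn(196, 768)
scores = torch.rand(196)
kept, new_caption = squeeze_sample(patches, scores, "A dog on the grass.", ["dog", "grass"])
print(kept.shape, new_caption)
```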
Prompt Tuning with Diffusion for Few-Shot Pre-trained Policy Generalization
Hu, Shengchao, Zhao, Wanru, Lin, Weixiong, Shen, Li, Zhang, Ya, Tao, Dacheng
Offline reinforcement learning (RL) methods harness previous experiences to derive an optimal policy, forming the foundation for pre-trained large-scale models (PLMs). When encountering tasks not seen before, PLMs often utilize several expert trajectories as prompts to expedite their adaptation to new requirements. Though a range of prompt-tuning methods have been proposed to enhance the quality of prompts, these methods often face optimization restrictions due to prompt initialization, which can significantly constrain the exploration domain and potentially lead to suboptimal solutions. To eliminate the reliance on the initial prompt, we shift our perspective towards the generative model, framing the prompt-tuning process as a form of conditional generative modeling, where prompts are generated from random noise. Our innovation, the Prompt Diffuser, leverages a conditional diffusion model to produce prompts of exceptional quality. Central to our framework is the approach to trajectory reconstruction and the meticulous integration of downstream task guidance during the training phase. Further experimental results underscore the potency of the Prompt Diffuser as a robust and effective tool for the prompt-tuning process, demonstrating strong performance in meta-RL tasks.
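To illustrate what "prompts generated from random noise" can mean in practice, here is a hedged sketch of a conditional reverse-diffusion sampling loop; the denoiser, noise schedule, and prompt shape are illustrative stand-ins rather than the Prompt Diffuser's actual design.

```python
# Hedged sketch of sampling a prompt from Gaussian noise with a conditional denoiser,
# in the spirit of the conditional generative framing described above.
import torch

def sample_prompt(denoiser, task_cond, steps=50, prompt_shape=(20, 17)):
    """Reverse-diffuse random noise into a prompt trajectory, conditioned on the task."""
    x = torch.randn(prompt_shape)                      # start from pure Gaussian noise
    betas = torch.linspace(1e-4, 2e-2, steps)          # illustrative linear noise schedule
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    for t in reversed(range(steps)):
        eps = denoiser(x, t, task_cond)                # predicted noise, conditioned on the task
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x                                           # generated prompt (e.g., a state-action sequence)

# A trivial denoiser stub so the sketch runs end to end.
denoiser = lambda x, t, cond: torch.zeros_like(x)
prompt = sample_prompt(denoiser, task_cond=torch.zeros(8))
print(prompt.shape)
```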
Towards Building Multilingual Language Model for Medicine
Qiu, Pengcheng, Wu, Chaoyi, Zhang, Xiaoman, Lin, Weixiong, Wang, Haicheng, Zhang, Ya, Wang, Yanfeng, Xie, Weidi
In the recent literature, large language models (LLMs) have demonstrated great promise in healthcare; for example, closed-source models such as GPT-4 [1] and MedPalm-2 [38] have shown remarkable performance and successfully passed the United States Medical Licensing Examination (USMLE). Concurrently, open-source models like Llama 2 have facilitated the development of specialized language models for medicine, such as MEDITRON, PMC-LLaMA, MedAlpaca, and ChatDoctors [48, 11, 25, 6], gradually bridging the performance gap with their closed-source peers. Despite these advancements, the primary focus of these sophisticated medical language models on English-language applications has constrained their potential reach, limiting the benefits to a wider, linguistically diverse audience. Among open-source multilingual Large Language Models (LLMs), exemplified by BLOOM [37] and the more recent InternLM 2 [42], a notable challenge persists despite their training on diverse multilingual corpora: they exhibit unsatisfactory performance on medical queries in non-English languages, a discrepancy primarily attributed to the under-representation of medical content in these general datasets. This paper endeavors to bridge this gap by developing an open-source, multilingual language model for healthcare. As shown in Figure 1, our contribution is threefold: firstly, we gather a multilingual medical corpus designed for auto-regressive training, which aims to lay a robust foundation that accurately reflects the linguistic diversity and complexity of the medical domain; secondly, to monitor progress, we introduce a new comprehensive multilingual medical question-answering (QA) benchmark, enabling evaluation of the multiple-choice QA and rationale abilities of different language models under both zero-shot and fine-tuning settings; lastly, we test a wide spectrum of existing language models, together with those that have undergone auto-regressive pre-training on our corpus. Through this comprehensive evaluation, we aim to provide valuable insights into the models' capabilities and foster a deeper understanding of the intricacies involved in multilingual medical query processing. For auto-regressive training, we have developed a large-scale Multilingual Medical Corpus (MMedC), amassing over 25.5 billion medical-related tokens across six primary languages: English, Chinese, Japanese, French, Russian, and Spanish. This diverse dataset was compiled from four distinct sources: (i) we devised an automatic pipeline to filter medical-related content from a broad multilingual corpus, ensuring a focused and relevant dataset; (ii) we curated an extensive collection of medical textbooks in various languages and converted them into text with carefully designed pre-processing, e.g., Optical Character Recognition (OCR), heuristic data filtering, etc.
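As a rough illustration of the automatic filtering pipeline mentioned in (i), the snippet below shows a simple keyword-hit heuristic for deciding whether a multilingual document is medical-related; the term lists and threshold are hypothetical and are not those used to build MMedC.

```python
# Hypothetical keyword-based filter for pulling medical-related documents
# out of a general multilingual corpus; terms and threshold are illustrative only.
import re

MEDICAL_TERMS = {
    "en": {"patient", "diagnosis", "therapy", "clinical", "symptom"},
    "es": {"paciente", "diagnóstico", "terapia", "clínico", "síntoma"},
    "fr": {"patient", "diagnostic", "thérapie", "clinique", "symptôme"},
}

def is_medical(doc: str, lang: str, min_hits: int = 3) -> bool:
    """Keep a document if it mentions enough medical keywords for its language."""
    tokens = re.findall(r"\w+", doc.lower(), flags=re.UNICODE)
    hits = sum(tok in MEDICAL_TERMS.get(lang, set()) for tok in tokens)
    return hits >= min_hits

corpus = [("en", "The patient received a clinical diagnosis and started therapy."),
          ("en", "The weather was sunny and the market reopened.")]
medical_subset = [doc for lang, doc in corpus if is_medical(doc, lang)]
print(len(medical_subset))  # -> 1
```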
Can GPT-4V(ision) Serve Medical Applications? Case Studies on GPT-4V for Multimodal Medical Diagnosis
Wu, Chaoyi, Lei, Jiayu, Zheng, Qiaoyu, Zhao, Weike, Lin, Weixiong, Zhang, Xiaoman, Zhou, Xiao, Zhao, Ziheng, Zhang, Ya, Wang, Yanfeng, Xie, Weidi
Driven by large foundation models, the development of artificial intelligence has witnessed tremendous progress lately, leading to a surge of general interest from the public. In this study, we aim to assess the performance of OpenAI's newest model, GPT-4V(ision), specifically in the realm of multimodal medical diagnosis. Our evaluation encompasses 17 human body systems, including Central Nervous System, Head and Neck, Cardiac, Chest, Hematology, Hepatobiliary, Gastrointestinal, Urogenital, Gynecology, Obstetrics, Breast, Musculoskeletal, Spine, Vascular, Oncology, Trauma, and Pediatrics, with images taken from 8 modalities used in daily clinical routine, e.g., X-ray, Computed Tomography (CT), Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET), Digital Subtraction Angiography (DSA), Mammography, Ultrasound, and Pathology. We probe GPT-4V's ability on multiple clinical tasks, with or without patient history provided, including imaging modality and anatomy recognition, disease diagnosis, report generation, and disease localisation. Our observation shows that, while GPT-4V demonstrates proficiency in distinguishing between medical image modalities and anatomy, it faces significant challenges in disease diagnosis and generating comprehensive reports. These findings underscore that, while large multimodal models have made significant advancements in computer vision and natural language processing, they remain far from being able to effectively support real-world medical applications and clinical decision-making.
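For readers who want to try this style of probing themselves, the following is a hedged example of querying GPT-4V on a single image with optional patient history via the OpenAI Python SDK; the model name, prompt wording, and image path are illustrative and not the exact evaluation protocol of this study.

```python
# Hedged sketch of a single GPT-4V(ision) query with the OpenAI Python SDK (>=1.0).
# Model name, prompt, and image file are placeholders, not this study's protocol.
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def ask_gpt4v(image_path: str, question: str, patient_history: str | None = None) -> str:
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")
    text = question if patient_history is None else f"Patient history: {patient_history}\n{question}"
    response = client.chat.completions.create(
        model="gpt-4-vision-preview",  # placeholder model name
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": text},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content

# Example: modality and anatomy recognition without patient history.
# print(ask_gpt4v("chest_xray.jpg", "What imaging modality is shown, and which anatomy?"))
```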
PMC-LLaMA: Towards Building Open-source Language Models for Medicine
Wu, Chaoyi, Lin, Weixiong, Zhang, Xiaoman, Zhang, Ya, Wang, Yanfeng, Xie, Weidi
Recently, Large Language Models (LLMs) have showcased remarkable capabilities in natural language understanding. While demonstrating proficiency in everyday conversations and question-answering situations, these models frequently struggle in domains that require precision, such as medical applications, due to their lack of domain-specific knowledge. In this paper, we describe the procedure for building a powerful, open-source language model specifically designed for medical applications, termed PMC-LLaMA. Our contributions are threefold: (i) we systematically investigate the process of adapting a general-purpose foundation language model towards the medical domain, which involves data-centric knowledge injection through the integration of 4.8M biomedical academic papers and 30K medical textbooks, as well as comprehensive fine-tuning for alignment with domain-specific instructions; (ii) we contribute a large-scale, comprehensive dataset for instruction tuning, encompassing medical question-answering (QA), rationales for reasoning, and conversational dialogues, comprising a total of 202M tokens; (iii) we conduct thorough ablation studies to demonstrate the effectiveness of each proposed component. When evaluated on various public medical question-answering benchmarks, our lightweight PMC-LLaMA, which consists of only 13 billion parameters, exhibits superior performance, even surpassing ChatGPT. All models, codes, and datasets can be found at https://github.com/chaoyi-wu/PMC-LLaMA.
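As a sketch of the data-centric knowledge-injection step (continued auto-regressive training on biomedical text), the snippet below uses the Hugging Face Trainer; the base checkpoint, corpus file, and hyperparameters are placeholders rather than PMC-LLaMA's actual recipe.

```python
# Minimal sketch of continued auto-regressive training of a LLaMA-style base model
# on biomedical text; model ID, data file, and hyperparameters are placeholders.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

base_model = "meta-llama/Llama-2-13b-hf"          # placeholder base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token          # LLaMA tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Plain-text corpus standing in for the papers + textbooks described above.
corpus = load_dataset("text", data_files={"train": "biomedical_corpus.txt"})
tokenized = corpus.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=2048),
                       batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="pmc-llama-pretrain", per_device_train_batch_size=1,
                           gradient_accumulation_steps=32, num_train_epochs=1, bf16=True),
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # causal LM objective
)
trainer.train()
```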
PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents
Lin, Weixiong, Zhao, Ziheng, Zhang, Xiaoman, Wu, Chaoyi, Zhang, Ya, Wang, Yanfeng, Xie, Weidi
Foundation models trained on large-scale datasets have recently gained a surge of interest in CV and NLP. In contrast, development in the biomedical domain lags far behind due to data scarcity. To address this issue, we build and release PMC-OA, a biomedical dataset with 1.6M image-caption pairs collected from PubMedCentral's OpenAccess subset, 8 times larger than previous datasets. PMC-OA covers diverse modalities and diseases, with the majority of the image-caption samples aligned at a finer-grained level, i.e., subfigure and subcaption. By pretraining a CLIP-style model on PMC-OA, our model, named PMC-CLIP, achieves state-of-the-art results on various downstream tasks, including image-text retrieval on ROCO, MedMNIST image classification, and Medical VQA, e.g., +8.1% R@10 on image-text retrieval and +3.9% accuracy on image classification.
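For reference, the symmetric contrastive objective typically used in such CLIP-style pretraining on image-caption pairs can be written as follows; the encoders, batch size, and embedding dimension here are stand-ins.

```python
# Small sketch of the symmetric contrastive (CLIP-style) objective over a batch of
# paired image and caption embeddings; encoders are assumed to exist upstream.
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb: torch.Tensor, text_emb: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE: matched image-caption pairs are positives, the rest negatives."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(image_emb.size(0))         # diagonal entries are the correct pairings
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2

# Toy batch of 8 paired embeddings.
img = torch.randn(8, 512)
txt = torch.randn(8, 512)
print(clip_contrastive_loss(img, txt).item())
```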