He, Junjun
A Fine-tuning Dataset and Benchmark for Large Language Models for Protein Understanding
Shen, Yiqing, Chen, Zan, Mamalakis, Michail, He, Luhan, Xia, Haiyang, Li, Tianbin, Su, Yanzhou, He, Junjun, Wang, Yu Guang
The parallels between protein sequences and natural language in their sequential structures have inspired the application of large language models (LLMs) to protein understanding. Despite the success of LLMs in NLP, their effectiveness in comprehending protein sequences remains an open question, largely due to the absence of datasets linking protein sequences to descriptive text. Researchers have therefore attempted to adapt LLMs for protein understanding by integrating a protein sequence encoder with a pre-trained LLM. However, this adaptation raises a fundamental question: "Can LLMs, originally designed for NLP, effectively comprehend protein sequences as a form of language?" Current datasets fall short of answering this question because they lack a direct correlation between protein sequences and corresponding text descriptions, limiting the ability to train and evaluate LLMs for protein understanding effectively. To bridge this gap, we introduce ProteinLMDataset, a dataset specifically designed for further self-supervised pretraining and supervised fine-tuning (SFT) of LLMs to enhance their capability for protein sequence comprehension. Specifically, ProteinLMDataset includes 17.46 billion tokens for pretraining and 893,000 instructions for SFT. Additionally, we present ProteinLMBench, the first benchmark dataset consisting of 944 manually verified multiple-choice questions for assessing the protein understanding capabilities of LLMs. ProteinLMBench incorporates protein-related details and sequences in multiple languages, establishing a new standard for evaluating LLMs' abilities in protein comprehension. The large language model InternLM2-7B, pretrained and fine-tuned on the ProteinLMDataset, outperforms GPT-4 on ProteinLMBench, achieving the highest accuracy score.
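As an illustration of how a multiple-choice benchmark of this kind can be scored, the following is a minimal sketch in Python; the record fields ("question", "options", "answer") and the letter-based answer format are assumptions for illustration, not the released ProteinLMBench schema.

```python
from typing import Callable

def multiple_choice_accuracy(items: list[dict], ask_model: Callable[[str], str]) -> float:
    """Score an LLM on multiple-choice protein questions (illustrative only)."""
    correct = 0
    for item in items:
        # Render options as "A. ...", "B. ...", ... and ask for a single letter back.
        choices = "\n".join(f"{chr(65 + i)}. {opt}" for i, opt in enumerate(item["options"]))
        prompt = f"{item['question']}\n{choices}\nAnswer with a single letter."
        reply = ask_model(prompt).strip().upper()
        correct += reply[:1] == item["answer"]  # gold answer assumed stored as a letter
    return correct / len(items)
```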
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Gao, Peng, Zhang, Renrui, Liu, Chris, Qiu, Longtian, Huang, Siyuan, Lin, Weifeng, Zhao, Shitian, Geng, Shijie, Lin, Ziyi, Jin, Peng, Zhang, Kaipeng, Shao, Wenqi, Xu, Chao, He, Conghui, He, Junjun, Shao, Hao, Lu, Pan, Li, Hongsheng, Qiao, Yu
We propose SPHINX-X, an extensive series of Multi-modal Large Language Models (MLLMs) developed upon SPHINX. To improve the architecture and training efficiency, we modify the SPHINX framework by removing redundant visual encoders, bypassing fully-padded sub-images with skip tokens, and simplifying multi-stage training into a one-stage all-in-one paradigm. To fully unleash the potential of MLLMs, we assemble a comprehensive multi-domain and multimodal dataset covering publicly available resources in language, vision, and vision-language tasks. We further enrich this collection with our curated OCR-intensive and Set-of-Mark datasets, extending its diversity and generality. By training over different base LLMs, including TinyLlama1.1B, InternLM2-7B, LLaMA2-13B, and Mixtral8x7B, we obtain a spectrum of MLLMs that vary in parameter size and multilingual capabilities. Comprehensive benchmarking reveals a strong correlation between multi-modal performance and the data and parameter scales. Code and models are released at https://github.com/Alpha-VLLM/LLaMA2-Accessory
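The "skip token" idea mentioned above can be pictured with a short sketch; the tensor shapes and the single-token replacement are assumptions made for illustration and are not taken from the released SPHINX-X code.

```python
import torch

def compress_padded_subimages(sub_tokens: torch.Tensor,
                              fully_padded: torch.Tensor,
                              skip_token: torch.Tensor) -> torch.Tensor:
    """Replace the visual tokens of fully-padded sub-images with one learnable skip token.

    sub_tokens:   (num_subimages, tokens_per_subimage, dim) tokens from the visual encoder
    fully_padded: (num_subimages,) bool flags marking sub-images that contain only padding
    skip_token:   (dim,) learnable embedding standing in for a skipped sub-image
    """
    kept = []
    for tokens, is_pad in zip(sub_tokens, fully_padded):
        kept.append(skip_token.unsqueeze(0) if is_pad else tokens)
    # The resulting sequence is shorter, so the LLM processes fewer visual tokens.
    return torch.cat(kept, dim=0)
```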
Towards the Unification of Generative and Discriminative Visual Foundation Model: A Survey
Liu, Xu, Zhou, Tong, Wang, Yuanxin, Wang, Yuping, Cao, Qinjingwen, Du, Weizhi, Yang, Yonghuan, He, Junjun, Qiao, Yu, Shen, Yiqing
The advent of foundation models, which are pre-trained on vast datasets, has ushered in a new era of computer vision, characterized by their robustness and remarkable zero-shot generalization capabilities. Mirroring the transformative impact of foundation models like large language models (LLMs) in natural language processing, visual foundation models (VFMs) have become a catalyst for groundbreaking developments in computer vision. This review paper delineates the pivotal trajectories of VFMs, emphasizing their scalability and proficiency in generative tasks such as text-to-image synthesis, as well as their adeptness in discriminative tasks including image segmentation. While generative and discriminative models have historically charted distinct paths, we undertake a comprehensive examination of the recent strides made by VFMs in both domains, elucidating their origins, seminal breakthroughs, and pivotal methodologies. Additionally, we collate and discuss the extensive resources that facilitate the development of VFMs and address the challenges that pave the way for future research endeavors. A crucial direction for forthcoming innovation is the amalgamation of generative and discriminative paradigms. The nascent application of generative models within discriminative contexts signifies the early stages of this confluence. This survey aspires to be a contemporary compendium for scholars and practitioners alike, charting the course of VFMs and illuminating their multifaceted landscape.
Enhancing Medical Task Performance in GPT-4V: A Comprehensive Study on Prompt Engineering Strategies
Chen, Pengcheng, Huang, Ziyan, Deng, Zhongying, Li, Tianbin, Su, Yanzhou, Wang, Haoyu, Ye, Jin, Qiao, Yu, He, Junjun
OpenAI's latest large vision-language model (LVLM), GPT-4V(ision), has piqued considerable interest for its potential in medical applications. Despite its promise, recent studies and internal reviews highlight its underperformance in specialized medical tasks. This paper explores the boundary of GPT-4V's capabilities in medicine, particularly in processing complex imaging data such as endoscopies, CT scans, and MRIs. Leveraging open-source datasets, we assessed its foundational competencies, identifying substantial areas for enhancement. Our research emphasizes prompt engineering, an often-underutilized strategy for improving AI responsiveness. Through iterative testing, we refined the model's prompts, significantly improving its interpretative accuracy and relevance in medical imaging. From our comprehensive evaluations, we distilled 10 effective prompt engineering techniques, each fortifying GPT-4V's medical acumen. These methodical enhancements facilitate more reliable, precise, and clinically valuable insights from GPT-4V, advancing its operability in critical healthcare environments. Our findings are pivotal for those employing AI in medicine, providing clear, actionable guidance on harnessing GPT-4V's full diagnostic potential.
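For readers who want to experiment with this kind of prompt refinement, here is a minimal sketch of querying a GPT-4V-style endpoint through the OpenAI Python SDK; the model name, file path, and prompt wording are placeholders, not the prompts distilled in the paper.

```python
# Minimal sketch: send one medical image plus a structured instruction to GPT-4V.
import base64
from openai import OpenAI

client = OpenAI()
with open("ct_slice.png", "rb") as f:          # illustrative file path
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4-vision-preview",              # illustrative model name
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": (
                "You are assisting a radiologist. Describe this abdominal CT slice, "
                "list notable findings, and state your uncertainty for each finding.")},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
    max_tokens=300,
)
print(response.choices[0].message.content)
```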
Self Pre-training with Masked Autoencoders for Medical Image Classification and Segmentation
Zhou, Lei, Liu, Huidong, Bae, Joseph, He, Junjun, Samaras, Dimitris, Prasanna, Prateek
Masked Autoencoder (MAE) has recently been shown to be effective in pre-training Vision Transformers (ViT) for natural image analysis. By reconstructing full images from partially masked inputs, a ViT encoder aggregates contextual information to infer masked image regions. We believe that this context aggregation ability is particularly essential in the medical image domain, where each anatomical structure is functionally and mechanically connected to other structures and regions. Because there is no ImageNet-scale medical image dataset for pre-training, we investigate a self pre-training paradigm with MAE for medical image analysis tasks. Our method pre-trains a ViT on the training set of the target data instead of another dataset. Thus, self pre-training can benefit scenarios where pre-training data is hard to acquire. Our experimental results show that MAE self pre-training markedly improves diverse medical image tasks including chest X-ray disease classification, abdominal CT multi-organ segmentation, and MRI brain tumor segmentation. Code is available at https://github.com/cvlab-stonybrook/SelfMedMAE
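The core of MAE-style self pre-training (mask most patches, encode only the visible ones, regress the masked content) can be sketched as follows; this is a simplified toy version in which positional embeddings and the paper's exact architecture are omitted, not the code released at the link above.

```python
import torch
import torch.nn as nn

class TinyMAE(nn.Module):
    """Toy MAE-style objective: encode visible patches, reconstruct masked ones."""
    def __init__(self, patch_dim=768, dim=256, mask_ratio=0.75):
        super().__init__()
        self.mask_ratio = mask_ratio
        self.embed = nn.Linear(patch_dim, dim)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True), num_layers=4)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.decoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True), num_layers=2)
        self.head = nn.Linear(dim, patch_dim)

    def forward(self, patches):  # patches: (B, N, patch_dim), e.g. flattened 16x16x3 patches
        B, N, D = patches.shape
        n_keep = int(N * (1 - self.mask_ratio))
        ids = torch.rand(B, N, device=patches.device).argsort(dim=1)  # random shuffle per image
        keep_ids, mask_ids = ids[:, :n_keep], ids[:, n_keep:]
        gather = lambda x, idx: torch.gather(x, 1, idx.unsqueeze(-1).expand(-1, -1, x.size(-1)))
        visible = gather(patches, keep_ids)
        encoded = self.encoder(self.embed(visible))                   # encoder sees visible patches only
        dec_in = torch.cat([encoded, self.mask_token.expand(B, N - n_keep, -1)], dim=1)
        pred = self.head(self.decoder(dec_in))[:, n_keep:]            # predictions at mask-token positions
        target = gather(patches, mask_ids)
        return nn.functional.mse_loss(pred, target)                   # reconstruction loss on masked patches

# loss = TinyMAE()(torch.randn(4, 196, 768)); loss.backward()
```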
OpenKBP-Opt: An international and reproducible evaluation of 76 knowledge-based planning pipelines
Babier, Aaron, Mahmood, Rafid, Zhang, Binghao, Alves, Victor G. L., Barragán-Montero, Ana Maria, Beaudry, Joel, Cardenas, Carlos E., Chang, Yankui, Chen, Zijie, Chun, Jaehee, Diaz, Kelly, Eraso, Harold David, Faustmann, Erik, Gaj, Sibaji, Gay, Skylar, Gronberg, Mary, Guo, Bingqi, He, Junjun, Heilemann, Gerd, Hira, Sanchit, Huang, Yuliang, Ji, Fuxin, Jiang, Dashan, Giraldo, Jean Carlo Jimenez, Lee, Hoyeon, Lian, Jun, Liu, Shuolin, Liu, Keng-Chi, Marrugo, José, Miki, Kentaro, Nakamura, Kunio, Netherton, Tucker, Nguyen, Dan, Nourzadeh, Hamidreza, Osman, Alexander F. I., Peng, Zhao, Muñoz, José Darío Quinto, Ramsl, Christian, Rhee, Dong Joo, Rodriguez, Juan David, Shan, Hongming, Siebers, Jeffrey V., Soomro, Mumtaz H., Sun, Kay, Hoyos, Andrés Usuga, Valderrama, Carlos, Verbeek, Rob, Wang, Enpei, Willems, Siri, Wu, Qi, Xu, Xuanang, Yang, Sen, Yuan, Lulin, Zhu, Simeng, Zimmermann, Lukas, Moore, Kevin L., Purdie, Thomas G., McNiven, Andrea L., Chan, Timothy C. Y.
We establish an open framework for developing plan optimization models for knowledge-based planning (KBP) in radiotherapy. Our framework includes reference plans for 100 patients with head-and-neck cancer and high-quality dose predictions from 19 KBP models that were developed by different research groups during the OpenKBP Grand Challenge. The dose predictions were input to four optimization models to form 76 unique KBP pipelines that generated 7600 plans. The predictions and plans were compared to the reference plans via three measures: the dose score, which is the mean absolute voxel-by-voxel difference in dose that a model achieved, averaged over patients; the deviation in dose-volume histogram (DVH) criteria; and the frequency with which clinical planning criteria were satisfied. We also performed a theoretical investigation to justify our dose mimicking models. The rank order correlation of the dose score between predictions and their KBP pipelines ranged from 0.50 to 0.62, which indicates that the quality of the predictions is generally positively correlated with the quality of the plans. Additionally, compared to the input predictions, the KBP-generated plans performed significantly better (P<0.05; one-sided Wilcoxon test) on 18 of 23 DVH criteria. Similarly, each optimization model generated plans that satisfied a higher percentage of criteria than the reference plans. Lastly, our theoretical investigation demonstrated that the dose mimicking models generated plans that are also optimal for a conventional planning model. This was the largest international effort to date for evaluating the combination of KBP prediction and optimization models. In the interest of reproducibility, our data and code are freely available at https://github.com/ababier/open-kbp-opt.
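The per-patient dose score described above reduces to a mean absolute voxel-wise dose difference; a minimal sketch is shown below (an illustrative reimplementation of the idea, with an assumed body mask, not the official evaluation code from the repository).

```python
import numpy as np

def dose_score(pred_dose: np.ndarray, ref_dose: np.ndarray, body_mask: np.ndarray) -> float:
    """Mean absolute voxel-by-voxel dose difference inside the evaluation mask (Gy)."""
    diff = np.abs(pred_dose - ref_dose)
    return float(diff[body_mask.astype(bool)].mean())

# The cohort-level dose score then averages the per-patient values over all cases:
# cohort_score = np.mean([dose_score(p, r, m) for p, r, m in cases])
```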
MIA-Prognosis: A Deep Learning Framework to Predict Therapy Response
Yang, Jiancheng, Chen, Jiajun, Kuang, Kaiming, Lin, Tiancheng, He, Junjun, Ni, Bingbing
Predicting clinical outcome is remarkably important but challenging. Research efforts have been devoted to seeking significant biomarkers associated with therapy response and/or patient survival. However, these biomarkers are generally costly and invasive, and possibly unsatisfactory for novel therapies. On the other hand, multi-modal, heterogeneous, unaligned temporal data is continuously generated in clinical practice. This paper aims at a unified deep learning approach to predict patient prognosis and therapy response with easily accessible data, e.g., radiographic, laboratory, and clinical information. Prior work focuses on modeling a single data modality or ignores temporal changes. Importantly, clinical time series are asynchronous in practice, i.e., recorded at irregular intervals. In this study, we formalize prognosis modeling as a multi-modal asynchronous time series classification task and propose the MIA-Prognosis framework, which uses Measurement, Intervention, and Assessment (MIA) information to predict therapy response; a Simple Temporal Attention (SimTA) module is developed to process the asynchronous time series. Experiments on a synthetic dataset validate the superiority of SimTA over standard RNN-based approaches. Furthermore, we evaluate the proposed method on an in-house, retrospective dataset of real-world non-small cell lung cancer patients under anti-PD-1 immunotherapy. The proposed method achieves promising performance in predicting immunotherapy response. Notably, our predictive model could further stratify low-risk and high-risk patients in terms of long-term survival.
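To make the asynchronous-attention idea concrete, here is a generic sketch of attention over irregularly timed clinical measurements in which older steps are down-weighted by a learned decay; it illustrates the general principle only and is not the paper's exact SimTA module.

```python
import torch
import torch.nn as nn

class TimeAwareAttention(nn.Module):
    """Generic attention over an asynchronous time series (illustration, not SimTA itself)."""
    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Parameter(torch.randn(dim))   # content query for relevance scoring
        self.decay = nn.Parameter(torch.tensor(0.1))  # learnable time-decay rate

    def forward(self, feats: torch.Tensor, times: torch.Tensor) -> torch.Tensor:
        # feats: (B, T, dim) per-step features; times: (B, T) timestamps, irregularly spaced
        relevance = feats @ self.query                            # (B, T) content scores
        elapsed = times.max(dim=1, keepdim=True).values - times   # time since each step
        scores = relevance - torch.relu(self.decay) * elapsed     # penalize older measurements
        weights = torch.softmax(scores, dim=1)
        return (weights.unsqueeze(-1) * feats).sum(dim=1)         # (B, dim) patient summary

# summary = TimeAwareAttention(64)(torch.randn(2, 5, 64),
#                                  torch.tensor([[0., 3., 10., 30., 45.]] * 2))
```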