Cai, Xiaochen
SciAssess: Benchmarking LLM Proficiency in Scientific Literature Analysis
Cai, Hengxing, Cai, Xiaochen, Chang, Junhan, Li, Sihang, Yao, Lin, Wang, Changxin, Gao, Zhifeng, Wang, Hongshuai, Li, Yongge, Lin, Mujie, Yang, Shuwen, Wang, Jiankun, Xu, Mingjun, Huang, Jin, Fang, Xi, Zhuang, Jiaxi, Yin, Yuqi, Li, Yaqi, Chen, Changhong, Cheng, Zheng, Zhao, Zifeng, Zhang, Linfeng, Ke, Guolin
Recent breakthroughs in Large Language Models (LLMs) have revolutionized natural language understanding and generation, sparking significant interest in applying them to scientific literature analysis. However, existing benchmarks fail to adequately evaluate the proficiency of LLMs in this domain, particularly in scenarios requiring higher-level abilities beyond mere memorization, as well as in the handling of multimodal data. In response to this gap, we introduce SciAssess, a benchmark specifically designed for the comprehensive evaluation of LLMs in scientific literature analysis. SciAssess aims to thoroughly assess the efficacy of LLMs by focusing on their capabilities in Memorization (L1), Comprehension (L2), and Analysis \& Reasoning (L3). It encompasses a variety of tasks drawn from diverse scientific fields, including fundamental science, alloy materials, biomedicine, drug discovery, and organic materials. To ensure the reliability of SciAssess, rigorous quality control measures have been implemented to guarantee accuracy, anonymization, and compliance with copyright standards. SciAssess evaluates 11 LLMs, including GPT, Claude, and Gemini, highlighting their strengths and areas for improvement. This evaluation supports the ongoing development of LLM applications in the analysis of scientific literature. SciAssess and its resources are available at \url{https://sci-assess.github.io/}.
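Because SciAssess organizes its tasks by ability level (L1 memorization, L2 comprehension, L3 analysis and reasoning), per-level scoring is the natural way to read its results. The sketch below shows one way such a per-level evaluation loop could look; the JSONL layout, field names, and query_model() helper are hypothetical assumptions for illustration and are not the actual SciAssess interface, which is distributed via the project site linked above.

```python
# Hypothetical sketch of scoring a benchmark grouped by ability level.
# The JSONL layout, field names, and query_model() are illustrative
# assumptions, not the actual SciAssess interface.
import json
from collections import defaultdict

def query_model(prompt: str) -> str:
    """Placeholder for a call to the LLM under evaluation."""
    raise NotImplementedError

def evaluate(task_file: str) -> dict:
    correct = defaultdict(int)
    total = defaultdict(int)
    with open(task_file) as f:
        for line in f:
            item = json.loads(line)   # assumed fields: "level", "prompt", "answer"
            level = item["level"]     # "L1", "L2", or "L3"
            prediction = query_model(item["prompt"])
            # Exact-match scoring for illustration; real tasks may need
            # task-specific metrics (e.g., table or molecule comparison).
            correct[level] += int(prediction.strip() == item["answer"].strip())
            total[level] += 1
    return {lvl: correct[lvl] / total[lvl] for lvl in total}
```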
Uni-SMART: Universal Science Multimodal Analysis and Research Transformer
Cai, Hengxing, Cai, Xiaochen, Yang, Shuwen, Wang, Jiankun, Yao, Lin, Gao, Zhifeng, Chang, Junhan, Li, Sihang, Xu, Mingjun, Wang, Changxin, Wang, Hongshuai, Li, Yongge, Lin, Mujie, Li, Yaqi, Yin, Yuqi, Zhang, Linfeng, Ke, Guolin
In scientific research and its applications, scientific literature analysis is crucial as it allows researchers to build on the work of others. However, the rapid growth of scientific knowledge has led to a massive increase in scholarly articles, making in-depth literature analysis increasingly challenging and time-consuming. The emergence of Large Language Models (LLMs) has offered a new way to address this challenge. Known for their strong abilities in summarizing texts, LLMs are seen as a potential tool to improve the analysis of scientific literature. However, existing LLMs have their own limitations. Scientific literature often includes a wide range of multimodal elements, such as tables, charts, and molecular structures, which are difficult for text-focused LLMs to understand and analyze. This issue points to the urgent need for new solutions that can fully understand and analyze multimodal content in scientific literature. To meet this demand, we present \textbf{Uni-SMART} (Universal Science Multimodal Analysis and Research Transformer), an innovative model designed for in-depth understanding of multimodal scientific literature. Through rigorous quantitative evaluation across several domains, Uni-SMART demonstrates superior performance over other text-focused LLMs. Furthermore, our exploration extends to practical applications, including patent infringement detection and nuanced analysis of charts. These applications highlight not only Uni-SMART's adaptability but also its potential to revolutionize how we interact with scientific literature.
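The multimodal elements mentioned in this abstract (running text, tables, charts, molecular structures) must first be extracted from source documents before any model can reason over them. The sketch below illustrates only that preprocessing step, using the third-party pdfplumber library to gather text and table content from a PDF; it is a generic illustration of the kind of mixed input involved, not Uni-SMART's actual pipeline.

```python
# Minimal sketch of gathering text and table content from a scientific PDF
# so a downstream model can reason over both. Generic illustration only;
# this is not Uni-SMART's pipeline.
import pdfplumber  # third-party: pip install pdfplumber

def extract_text_and_tables(pdf_path: str) -> dict:
    pages_text, tables = [], []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            pages_text.append(page.extract_text() or "")
            for table in page.extract_tables():
                # Each table is a list of rows; keep a plain-text rendering.
                tables.append("\n".join(
                    "\t".join(cell or "" for cell in row) for row in table
                ))
    return {"text": "\n".join(pages_text), "tables": tables}
```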
The Multi-modality Cell Segmentation Challenge: Towards Universal Solutions
Ma, Jun, Xie, Ronald, Ayyadhury, Shamini, Ge, Cheng, Gupta, Anubha, Gupta, Ritu, Gu, Song, Zhang, Yao, Lee, Gihun, Kim, Joonkee, Lou, Wei, Li, Haofeng, Upschulte, Eric, Dickscheid, Timo, de Almeida, José Guilherme, Wang, Yixin, Han, Lin, Yang, Xin, Labagnara, Marco, Rahi, Sahand Jamal, Kempster, Carly, Pollitt, Alice, Espinosa, Leon, Mignot, Tâm, Middeke, Jan Moritz, Eckardt, Jan-Niklas, Li, Wangkai, Li, Zhaoyang, Cai, Xiaochen, Bai, Bizhe, Greenwald, Noah F., Van Valen, David, Weisbart, Erin, Cimini, Beth A., Li, Zhuoshi, Zuo, Chao, Brück, Oscar, Bader, Gary D., Wang, Bo
Cell segmentation is a critical step for quantitative single-cell analysis in microscopy images. Existing cell segmentation methods are often tailored to specific modalities or require manual interventions to specify hyperparameters in different experimental settings. Here, we present a multi-modality cell segmentation benchmark, comprising over 1,500 labeled images derived from more than 50 diverse biological experiments. The top participants developed a Transformer-based deep learning algorithm that not only outperforms existing methods but can also be applied to diverse microscopy images across imaging platforms and tissue types without manual parameter adjustments. This benchmark and the improved algorithm offer promising avenues for more accurate and versatile cell analysis in microscopy imaging.

Cell segmentation is a fundamental task that is universally required for biological image analysis across a large number of different experimental settings and imaging modalities. For example, in multiplexed fluorescence image-based cancer microenvironment analysis, cell segmentation is the prerequisite for the identification of tumor sub-types, composition, and organization, which can lead to important biological insights [1]-[3]. However, the development of a universal and automatic cell segmentation technique continues to pose significant challenges due to the extensive diversity observed in microscopy images. This diversity arises from variations in cell origins, microscopy types, staining techniques, and cell morphologies. Recent advances [4], [5] have successfully demonstrated the feasibility of automatic and precise cellular segmentation for specific microscopy image types and cell types, such as fluorescence and mass spectrometry images [6], [7], differential interference contrast images of platelets [8], bacteria images [9], and yeast images [10], [11], but the selection of appropriate segmentation models remains a non-trivial task for non-expert users in conventional biology laboratories. Efforts have been made towards the development of generalized cell segmentation algorithms [9], [12], [13]. However, these algorithms were primarily trained using datasets consisting of gray-scale images and two-channel fluorescent images, lacking the necessary diversity to ensure robust generalization across a wide range of imaging modalities. For example, the segmentation models have struggled to perform effectively on RGB images, such as bone marrow aspirate slides stained with Jenner-Giemsa. Furthermore, these models often require manual selection of both the model type and the specific image channel to be segmented, posing challenges for biologists with limited computational expertise. Biomedical image data science competitions have emerged as an effective way to accelerate the development of cutting-edge algorithms [14], [15].
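Instance-level cell segmentation results of the kind benchmarked here are commonly scored by matching predicted cells to reference cells at an IoU threshold and reporting an F1 score. The sketch below implements that generic metric under the assumption that both masks are integer label images (0 = background, positive integers = cell instances); it is an illustration of the standard approach, not necessarily this challenge's exact scoring protocol.

```python
# Sketch of a common instance-segmentation metric: match predicted cells to
# ground-truth cells at an IoU threshold and report F1. Assumes masks are
# integer label images (0 = background); generic illustration, not
# necessarily the challenge's exact scoring protocol.
import numpy as np

def instance_f1(gt: np.ndarray, pred: np.ndarray, iou_thr: float = 0.5) -> float:
    gt_ids = [i for i in np.unique(gt) if i != 0]
    pred_ids = [i for i in np.unique(pred) if i != 0]
    matched_gt, matched_pred = set(), set()
    for g in gt_ids:
        g_mask = gt == g
        for p in pred_ids:
            if p in matched_pred:
                continue
            p_mask = pred == p
            inter = np.logical_and(g_mask, p_mask).sum()
            union = np.logical_or(g_mask, p_mask).sum()
            if union and inter / union >= iou_thr:
                matched_gt.add(g)
                matched_pred.add(p)
                break
    tp = len(matched_gt)
    fp = len(pred_ids) - len(matched_pred)
    fn = len(gt_ids) - len(matched_gt)
    return 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
```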