Cai, Xiaochen
SciAssess: Benchmarking LLM Proficiency in Scientific Literature Analysis
Cai, Hengxing, Cai, Xiaochen, Chang, Junhan, Li, Sihang, Yao, Lin, Wang, Changxin, Gao, Zhifeng, Wang, Hongshuai, Li, Yongge, Lin, Mujie, Yang, Shuwen, Wang, Jiankun, Xu, Mingjun, Huang, Jin, Fang, Xi, Zhuang, Jiaxi, Yin, Yuqi, Li, Yaqi, Chen, Changhong, Cheng, Zheng, Zhao, Zifeng, Zhang, Linfeng, Ke, Guolin
Recent breakthroughs in Large Language Models (LLMs) have revolutionized natural language understanding and generation, sparking significant interest in applying them to scientific literature analysis. However, existing benchmarks fail to adequately evaluate the proficiency of LLMs in this domain, particularly in scenarios requiring higher-level abilities beyond mere memorization, as well as in the handling of multimodal data. In response to this gap, we introduce SciAssess, a benchmark specifically designed for the comprehensive evaluation of LLMs in scientific literature analysis. SciAssess aims to thoroughly assess the efficacy of LLMs by focusing on their capabilities in Memorization (L1), Comprehension (L2), and Analysis \& Reasoning (L3). It encompasses a variety of tasks drawn from diverse scientific fields, including fundamental science, alloy materials, biomedicine, drug discovery, and organic materials. To ensure the reliability of SciAssess, rigorous quality control measures have been implemented to guarantee accuracy, anonymization, and compliance with copyright standards. SciAssess evaluates 11 LLMs, including GPT, Claude, and Gemini, highlighting their strengths and areas for improvement. This evaluation supports the ongoing development of LLM applications in the analysis of scientific literature. SciAssess and its resources are available at \url{https://sci-assess.github.io/}.
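Because SciAssess organizes its tasks by ability level (L1 memorization, L2 comprehension, L3 analysis and reasoning), per-level scoring is the natural way to read its results. The sketch below shows one way such a per-level evaluation loop could look; the JSONL layout, field names, and query_model() helper are hypothetical assumptions for illustration and are not the actual SciAssess interface, which is distributed via the project site linked above.

```python
# Hypothetical sketch of scoring a benchmark grouped by ability level.
# The JSONL layout, field names, and query_model() are illustrative
# assumptions, not the actual SciAssess interface.
import json
from collections import defaultdict

def query_model(prompt: str) -> str:
    """Placeholder for a call to the LLM under evaluation."""
    raise NotImplementedError

def evaluate(task_file: str) -> dict:
    correct = defaultdict(int)
    total = defaultdict(int)
    with open(task_file) as f:
        for line in f:
            item = json.loads(line)   # assumed fields: "level", "prompt", "answer"
            level = item["level"]     # "L1", "L2", or "L3"
            prediction = query_model(item["prompt"])
            # Exact-match scoring for illustration; real tasks may need
            # task-specific metrics (e.g., table or molecule comparison).
            correct[level] += int(prediction.strip() == item["answer"].strip())
            total[level] += 1
    return {lvl: correct[lvl] / total[lvl] for lvl in total}
```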
Uni-SMART: Universal Science Multimodal Analysis and Research Transformer
Cai, Hengxing, Cai, Xiaochen, Yang, Shuwen, Wang, Jiankun, Yao, Lin, Gao, Zhifeng, Chang, Junhan, Li, Sihang, Xu, Mingjun, Wang, Changxin, Wang, Hongshuai, Li, Yongge, Lin, Mujie, Li, Yaqi, Yin, Yuqi, Zhang, Linfeng, Ke, Guolin
In scientific research and its applications, scientific literature analysis is crucial as it allows researchers to build on the work of others. However, the rapid growth of scientific knowledge has led to a massive increase in scholarly articles, making in-depth literature analysis increasingly challenging and time-consuming. The emergence of Large Language Models (LLMs) has offered a new way to address this challenge. Known for their strong abilities in summarizing texts, LLMs are seen as a potential tool to improve the analysis of scientific literature. However, existing LLMs have their own limitations. Scientific literature often includes a wide range of multimodal elements, such as tables, charts, and molecular structures, which are difficult for text-focused LLMs to understand and analyze. This issue points to the urgent need for new solutions that can fully understand and analyze multimodal content in scientific literature. To meet this demand, we present \textbf{Uni-SMART} (Universal Science Multimodal Analysis and Research Transformer), an innovative model designed for in-depth understanding of multimodal scientific literature. Through rigorous quantitative evaluation across several domains, Uni-SMART demonstrates superior performance over other text-focused LLMs. Furthermore, our exploration extends to practical applications, including patent infringement detection and nuanced analysis of charts. These applications highlight not only Uni-SMART's adaptability but also its potential to revolutionize how we interact with scientific literature.
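The multimodal elements mentioned in this abstract (running text, tables, charts, molecular structures) must first be extracted from source documents before any model can reason over them. The sketch below illustrates only that preprocessing step, using the third-party pdfplumber library to gather text and table content from a PDF; it is a generic illustration of the kind of mixed input involved, not Uni-SMART's actual pipeline.

```python
# Minimal sketch of gathering text and table content from a scientific PDF
# so a downstream model can reason over both. Generic illustration only;
# this is not Uni-SMART's pipeline.
import pdfplumber  # third-party: pip install pdfplumber

def extract_text_and_tables(pdf_path: str) -> dict:
    pages_text, tables = [], []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            pages_text.append(page.extract_text() or "")
            for table in page.extract_tables():
                # Each table is a list of rows; keep a plain-text rendering.
                tables.append("\n".join(
                    "\t".join(cell or "" for cell in row) for row in table
                ))
    return {"text": "\n".join(pages_text), "tables": tables}
```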
The Multi-modality Cell Segmentation Challenge: Towards Universal Solutions
Ma, Jun, Xie, Ronald, Ayyadhury, Shamini, Ge, Cheng, Gupta, Anubha, Gupta, Ritu, Gu, Song, Zhang, Yao, Lee, Gihun, Kim, Joonkee, Lou, Wei, Li, Haofeng, Upschulte, Eric, Dickscheid, Timo, de Almeida, José Guilherme, Wang, Yixin, Han, Lin, Yang, Xin, Labagnara, Marco, Rahi, Sahand Jamal, Kempster, Carly, Pollitt, Alice, Espinosa, Leon, Mignot, Tâm, Middeke, Jan Moritz, Eckardt, Jan-Niklas, Li, Wangkai, Li, Zhaoyang, Cai, Xiaochen, Bai, Bizhe, Greenwald, Noah F., Van Valen, David, Weisbart, Erin, Cimini, Beth A., Li, Zhuoshi, Zuo, Chao, Brück, Oscar, Bader, Gary D., Wang, Bo
Cell segmentation is a critical step for quantitative single-cell analysis in microscopy images. Existing cell segmentation methods are often tailored to specific modalities or require manual interventions to specify hyperparameters in different experimental settings. Here, we present a multi-modality cell segmentation benchmark, comprising over 1,500 labeled images derived from more than 50 diverse biological experiments. The top participants developed a Transformer-based deep learning algorithm that not only outperforms existing methods but can also be applied to diverse microscopy images across imaging platforms and tissue types without manual parameter adjustments. This benchmark and the improved algorithm offer promising avenues for more accurate and versatile cell analysis in microscopy imaging.

Cell segmentation is a fundamental task that is universally required for biological image analysis across a large number of different experimental settings and imaging modalities. For example, in multiplexed fluorescence image-based cancer microenvironment analysis, cell segmentation is the prerequisite for the identification of tumor sub-types, composition, and organization, which can lead to important biological insights [1]-[3]. However, the development of a universal and automatic cell segmentation technique continues to pose significant challenges due to the extensive diversity observed in microscopy images. This diversity arises from variations in cell origins, microscopy types, staining techniques, and cell morphologies. Recent advances [4], [5] have successfully demonstrated the feasibility of automatic and precise cellular segmentation for specific microscopy image types and cell types, such as fluorescence and mass spectrometry images [6], [7], differential interference contrast images of platelets [8], bacteria images [9], and yeast images [10], [11], but the selection of appropriate segmentation models remains a non-trivial task for non-expert users in conventional biology laboratories. Efforts have been made towards the development of generalized cell segmentation algorithms [9], [12], [13]. However, these algorithms were primarily trained using datasets consisting of gray-scale images and two-channel fluorescent images, lacking the necessary diversity to ensure robust generalization across a wide range of imaging modalities. For example, the segmentation models have struggled to perform effectively on RGB images, such as bone marrow aspirate slides stained with Jenner-Giemsa. Furthermore, these models often require manual selection of both the model type and the specific image channel to be segmented, posing challenges for biologists with limited computational expertise. Biomedical image data science competitions have emerged as an effective way to accelerate the development of cutting-edge algorithms [14], [15].
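Instance-level cell segmentation results of the kind benchmarked here are commonly scored by matching predicted cells to reference cells at an IoU threshold and reporting an F1 score. The sketch below implements that generic metric under the assumption that both masks are integer label images (0 = background, positive integers = cell instances); it is an illustration of the standard approach, not necessarily this challenge's exact scoring protocol.

```python
# Sketch of a common instance-segmentation metric: match predicted cells to
# ground-truth cells at an IoU threshold and report F1. Assumes masks are
# integer label images (0 = background); generic illustration, not
# necessarily the challenge's exact scoring protocol.
import numpy as np

def instance_f1(gt: np.ndarray, pred: np.ndarray, iou_thr: float = 0.5) -> float:
    gt_ids = [i for i in np.unique(gt) if i != 0]
    pred_ids = [i for i in np.unique(pred) if i != 0]
    matched_gt, matched_pred = set(), set()
    for g in gt_ids:
        g_mask = gt == g
        for p in pred_ids:
            if p in matched_pred:
                continue
            p_mask = pred == p
            inter = np.logical_and(g_mask, p_mask).sum()
            union = np.logical_or(g_mask, p_mask).sum()
            if union and inter / union >= iou_thr:
                matched_gt.add(g)
                matched_pred.add(p)
                break
    tp = len(matched_gt)
    fp = len(pred_ids) - len(matched_pred)
    fn = len(gt_ids) - len(matched_gt)
    return 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
```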