AITopics | Wang, Shanshan

Wang, Shanshan

Identifying Ising and percolation phase transitions based on KAN method

Xu, Dian, Wang, Shanshan, Li, Wei, Deng, Weibing, Gao, Feng, Shen, Jianmin

arXiv.org Artificial Intelligence, Mar-5-2025

Modern machine learning, grounded in the Universal Approximation Theorem, has achieved significant success in the study of phase transitions in both equilibrium and non-equilibrium systems. However, identifying the critical points of percolation models using raw configurations remains a challenging and intriguing problem. This paper proposes the use of the Kolmogorov-Arnold Network, which is based on the Kolmogorov-Arnold Representation Theorem, to input raw configurations into a learning model. The results demonstrate that the KAN can indeed predict the critical points of percolation models. Further observation reveals that, apart from models associated with the density of occupied points, KAN is also capable of effectively achieving phase classification for models where the sole alteration pertains to the orientation of spins, resulting in an order parameter that manifests as an external magnetic flux, such as the Ising model.

artificial intelligence, machine learning, percolation model, (16 more...)

arXiv.org Artificial Intelligence

2503.17996

Country: Asia > China (0.15)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Learning states enhanced knowledge tracing: Simulating the diversity in real-world learning process

Wang, Shanshan, Zhang, Xueying, Wang, Keyang, Yang, Xun, Zhang, Xingyi

arXiv.org Artificial Intelligence, Dec-27-2024

The Knowledge Tracing (KT) task focuses on predicting a learner's future performance based on historical interactions. The knowledge state plays a key role in the learning process. However, because the knowledge state is influenced by various learning factors in the interaction process, such as exercise similarity, response reliability, and the learner's learning state, previous models still face two major limitations. First, due to the differences between exercises caused by various complex reasons and the unreliability of responses caused by guessing behavior, it is hard to locate the historical interaction that is most relevant to the exercise currently being answered. Second, the learning state is also a key factor influencing the knowledge state, yet it is consistently ignored by previous methods. To address these issues, we propose a new method named Learning State Enhanced Knowledge Tracing (LSKT). First, to simulate the potential differences in interactions, inspired by the Item Response Theory (IRT) paradigm, we design three different embedding methods ranging from coarse-grained to fine-grained views and conduct a comparative analysis of them. Second, we design a learning state extraction module to capture the changing learning state during the learner's learning process. In turn, with the help of the extracted learning state, a more detailed knowledge state can be captured. Experimental results on four real-world datasets show that our LSKT method outperforms the current state-of-the-art methods.

artificial intelligence, learner, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2412.19550

Country: Asia > China > Anhui Province (0.14)

Genre: Research Report > New Finding (0.67)

Industry: Education > Educational Technology > Educational Software > Computer Based Training (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Optimized Vessel Segmentation: A Structure-Agnostic Approach with Small Vessel Enhancement and Morphological Correction

Song, Dongning, Huang, Weijian, Liu, Jiarun, Islam, Md Jahidul, Yang, Hao, Wang, Shanshan

arXiv.org Artificial Intelligence, Nov-22-2024

Accurate segmentation of blood vessels is essential for various clinical assessments and postoperative analyses. However, the inherent challenges of vascular imaging, such as sparsity, fine granularity, low contrast, data distribution variability, and the critical need for preserving topological structure, make generalized vessel segmentation particularly complex. While specialized segmentation methods have been developed for specific anatomical regions, their over-reliance on tailored models hinders broader applicability and generalization. General-purpose segmentation models introduced in medical imaging often fail to address critical vascular characteristics, including the connectivity of segmentation results. To overcome these limitations, we propose an optimized vessel segmentation framework: a structure-agnostic approach incorporating small vessel enhancement and morphological correction for multi-modality vessel segmentation. To train and validate this framework, we compiled a comprehensive multi-modality dataset spanning 17 datasets and benchmarked our model against six SAM-based methods and 17 expert models. The results demonstrate that our approach achieves superior segmentation accuracy, generalization, and a 34.6% improvement in connectivity, underscoring its clinical potential. An ablation study further validates the effectiveness of the proposed improvements. We will release the code and dataset on GitHub following the publication of this work.

artificial intelligence, machine learning, segmentation, (16 more...)

arXiv.org Artificial Intelligence

2411.15251

Country:

Asia > China (0.14)
Europe > Spain (0.14)
North America > Canada (0.14)
(2 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Exploring structure diversity in atomic resolution microscopy with graph neural networks

Luo, Zheng, Feng, Ming, Gao, Zijian, Yu, Jinyang, Hu, Liang, Wang, Tao, Xue, Shenao, Zhou, Shen, Ouyang, Fangping, Feng, Dawei, Xu, Kele, Wang, Shanshan

arXiv.org Artificial Intelligence, Oct-23-2024

The emergence of deep learning (DL) has provided great opportunities for the high-throughput analysis of atomic-resolution micrographs. However, DL models trained on fixed-size image patches generally lack efficiency and flexibility when processing micrographs containing diverse atomic configurations. Herein, inspired by the similarity between atomic structures and graphs, we describe a few-shot learning framework based on an equivariant graph neural network (EGNN) to analyze a library of atomic structures (e.g., vacancies, phases, grain boundaries, and doping). Compared to image-driven DL models, it shows significantly improved robustness and a three-orders-of-magnitude reduction in computing parameters, which is especially evident for aggregated vacancy lines with flexible lattice distortion. In addition, the intuitiveness of graphs enables quantitative and straightforward batch extraction of atomic-scale structural features, thus statistically unveiling the self-assembly dynamics of vacancy lines under electron beam irradiation. A versatile model toolkit is established by integrating EGNN sub-models for single-structure recognition to process images involving varied configurations in the form of a task chain, leading to the discovery of novel doping configurations with superior electrocatalytic properties for hydrogen evolution reactions. This work provides a powerful tool to explore structure diversity in a fast, accurate, and intelligent manner.

artificial intelligence, configuration, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.17631

Country: Asia > China (0.47)

Genre:

Research Report (0.82)
Workflow (0.68)

Industry:

Materials > Chemicals (1.00)
Semiconductors & Electronics (0.93)
Energy > Energy Storage (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

H2OVL-Mississippi Vision Language Models Technical Report

Galib, Shaikat, Wang, Shanshan, Xu, Guanshuo, Pfeiffer, Pascal, Chesler, Ryan, Landry, Mark, Ambati, Sri Satish

arXiv.org Artificial Intelligence, Oct-17-2024

Smaller vision-language models (VLMs) are becoming increasingly important for privacy-focused, on-device applications due to their ability to run efficiently on consumer hardware for processing enterprise commercial documents and images. These models require strong language understanding and visual capabilities to enhance human-machine interaction. To address this need, we present H2OVL-Mississippi, a pair of small VLMs trained on 37 million image-text pairs using 240 hours of compute on 8 x H100 GPUs. H2OVL-Mississippi-0.8B is a tiny model with 0.8 billion parameters that specializes in text recognition, achieving state-of-the-art performance on the Text Recognition portion of OCRBench and surpassing much larger models in this area. Additionally, we are releasing H2OVL-Mississippi-2B, a 2-billion-parameter model for general use cases, exhibiting highly competitive metrics across various academic benchmarks. Both models build upon our prior work with H2O-Danube language models, extending their capabilities into the visual domain. We release them under the Apache 2.0 license, making VLMs accessible to everyone and democratizing document AI and visual LLMs.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2410.13611

Country: North America > United States > Mississippi (1.00)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

MedBench: A Comprehensive, Standardized, and Reliable Benchmarking System for Evaluating Chinese Medical Large Language Models

Liu, Mianxin, Ding, Jinru, Xu, Jie, Hu, Weiguo, Li, Xiaoyang, Zhu, Lifeng, Bai, Zhian, Shi, Xiaoming, Wang, Benyou, Song, Haitao, Liu, Pengfei, Zhang, Xiaofan, Wang, Shanshan, Li, Kang, Wang, Haofen, Ruan, Tong, Huang, Xuanjing, Sun, Xin, Zhang, Shaoting

arXiv.org Artificial Intelligence, Jun-23-2024

Ensuring that medical large language models (LLMs) are generally effective and beneficial to human beings before real-world deployment is crucial. However, a widely accepted and accessible evaluation process for medical LLMs, especially in the Chinese context, remains to be established. In this work, we introduce "MedBench", a comprehensive, standardized, and reliable benchmarking system for Chinese medical LLMs. First, MedBench assembles the currently largest evaluation dataset (300,901 questions) to cover 43 clinical specialties and performs multi-facet evaluation of medical LLMs. Second, MedBench provides a standardized and fully automatic cloud-based evaluation infrastructure, with physical separation of questions and ground truth. Third, MedBench implements dynamic evaluation mechanisms to prevent shortcut learning and answer memorization. Applying MedBench to popular general and medical LLMs, we observe unbiased, reproducible evaluation results largely aligning with medical professionals' perspectives. This study establishes a significant foundation for preparing the practical applications of Chinese medical LLMs. MedBench is publicly accessible at https://medbench.opencompass.org.cn.

large language model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2407.10990

Country: Asia > China (1.00)

Genre: Research Report (1.00)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

What is the Best Way for ChatGPT to Translate Poetry?

Wang, Shanshan, Wong, Derek F., Yao, Jingming, Chao, Lidia S.

arXiv.org Artificial Intelligence, Jun-5-2024

Machine translation (MT) has historically faced significant challenges when applied to literary works, particularly in the domain of poetry translation. The advent of Large Language Models such as ChatGPT holds potential for innovation in this field. This study examines ChatGPT's capabilities in English-Chinese poetry translation tasks, utilizing targeted prompts and small sample scenarios to ascertain optimal performance. Despite promising outcomes, our analysis reveals persistent issues in the translations generated by ChatGPT that warrant attention. To address these shortcomings, we propose an Explanation-Assisted Poetry Machine Translation (EAPMT) method, which leverages monolingual poetry explanation as guiding information for the translation process. Furthermore, we refine existing evaluation criteria to better suit the nuances of modern poetry translation. We engaged a panel of professional poets for assessment and complemented their evaluations with GPT-4. The results from both human and machine evaluations demonstrate that our EAPMT method outperforms the traditional translation methods of ChatGPT and existing online systems. This paper validates the efficacy of our method and contributes a novel perspective to machine-assisted literary translation.

large language model, machine learning, translation, (20 more...)

arXiv.org Artificial Intelligence

2406.03450

Country: Asia > China (0.46)

Genre: Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Dual-State Personalized Knowledge Tracing with Emotional Incorporation

Wang, Shanshan, Yuan, Fangzheng, Wang, Keyang, Yang, Xun, Zhang, Xingyi, Wang, Meng

arXiv.org Artificial Intelligence, May-26-2024

Knowledge tracing (KT) has been widely used in online learning systems to guide students' future learning. However, most existing KT models primarily focus on extracting abundant information from the question sets and exploring the relationships between them, but ignore the personalized student behavioral information in the learning process. This limits the model's ability to accurately capture the personalized knowledge states of students and reasonably predict their performance. To alleviate this limitation, we explicitly model the personalized learning process by incorporating emotions, a representative personalized behavior in the learning process, into the KT framework. Specifically, we present a novel Dual-State Personalized Knowledge Tracing with Emotional Incorporation model (DEKT) to achieve this goal. First, we incorporate emotional information into the modeling process of the knowledge state, resulting in the Knowledge State Boosting Module. Second, we design an Emotional State Tracing Module to monitor students' personalized emotional states, and propose an emotion prediction method based on personalized emotional states. Finally, we apply the predicted emotions to enhance students' response prediction. Furthermore, to extend the generalization capability of our model across different datasets, we design a transferred version of DEKT, named Transfer Learning-based Self-loop model (T-DEKT). Extensive experiments show that our method achieves state-of-the-art performance.

artificial intelligence, machine learning, student, (18 more...)

arXiv.org Artificial Intelligence

2405.16799

Country: Asia > China > Anhui Province (0.14)

Genre:

Instructional Material (0.92)
Research Report > New Finding (0.46)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.87)

Personalized Forgetting Mechanism with Concept-Driven Knowledge Tracing

Wang, Shanshan, Hu, Ying, Yang, Xun, Zhang, Zhongzhou, Wang, Keyang, Zhang, Xingyi

arXiv.org Artificial Intelligence, Apr-25-2024

Knowledge Tracing (KT) aims to trace changes in students' knowledge states throughout their entire learning process by analyzing their historical learning data and predicting their future learning performance. Existing knowledge tracing models based on forgetting curve theory only consider the general forgetting caused by time intervals, ignoring the individualization of students and the causal relationship of the forgetting process. To address these problems, we propose a Concept-driven Personalized Forgetting knowledge tracing model (CPF), which integrates hierarchical relationships between knowledge concepts and incorporates students' personalized cognitive abilities. First, we integrate the students' personalized capabilities into both the learning and forgetting processes to explicitly distinguish students' individual learning gains and forgetting rates according to their cognitive abilities. Second, we take into account the hierarchical relationships between knowledge points and design a precursor-successor knowledge concept matrix to simulate the causal relationship in the forgetting process, while also integrating the potential impact of forgetting prior knowledge points on subsequent ones. The proposed personalized forgetting mechanism can be applied not only to the learning of specific knowledge concepts but also to the life-long learning process. Extensive experimental results on three public datasets show that our CPF outperforms current methods based on forgetting curve theory in predicting student performance, demonstrating that CPF can better simulate changes in students' knowledge status through the personalized forgetting mechanism.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2404.12127

Country:

North America > United States > Hawaii (0.14)
Asia > China > Anhui Province (0.14)

Genre:

Research Report (0.82)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Education > Educational Setting (1.00)
Health & Medicine > Therapeutic Area (0.95)
Education > Educational Technology > Educational Software > Computer Based Training (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Enhancing Out-of-Distribution Detection with Multitesting-based Layer-wise Feature Fusion

Li, Jiawei, Li, Sitong, Wang, Shanshan, Zeng, Yicheng, Tan, Falong, Xie, Chuanlong

arXiv.org Artificial Intelligence, Mar-16-2024

Deploying machine learning in open environments presents the challenge of encountering diverse test inputs that differ significantly from the training data. These out-of-distribution samples may exhibit shifts in local or global features compared to the training distribution. The machine learning (ML) community has responded with a number of methods aimed at distinguishing anomalous inputs from original training data. However, the majority of previous studies have primarily focused on the output layer or penultimate layer of pre-trained deep neural networks. In this paper, we propose a novel framework, Multitesting-based Layer-wise Out-of-Distribution (OOD) Detection (MLOD), to identify distributional shifts in test samples at different levels of features through a rigorous multiple testing procedure. Our approach distinguishes itself from existing methods in that it requires neither modifying the structure of the pre-trained classifier nor fine-tuning it. Through extensive experiments, we demonstrate that our proposed framework can seamlessly integrate with any existing distance-based inspection method while efficiently utilizing feature extractors of varying depths. Our scheme effectively enhances the performance of out-of-distribution detection when compared to baseline methods. In particular, MLOD-Fisher achieves superior performance in general. When trained using KNN on CIFAR10, MLOD-Fisher significantly lowers the false positive rate (FPR) from 24.09% to 7.47% on average compared to merely utilizing the features of the last layer.

artificial intelligence, detection, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2403.10803

Country:

Asia > China (0.28)
Asia > Middle East > Israel (0.14)

Genre: Research Report (0.84)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)