south china university
MM-UNet: Morph Mamba U-shaped Convolutional Networks for Retinal Vessel Segmentation
Liu, Jiawen, Zeng, Yuanbo, Liang, Jiaming, Yang, Yizhen, Zhang, Yiheng, Cai, Enhui, Sheng, Xiaoqi, Cai, Hongmin
Accurate detection of retinal vessels plays a critical role in reflecting a wide range of health status indicators in the clinical diagnosis of ocular diseases. Recently, advances in deep learning have led to a surge in retinal vessel segmentation methods, which have significantly contributed to the quantitative analysis of vascular morphology. However, retinal vasculature differs significantly from conventional segmentation targets in that it consists of extremely thin and branching structures, whose global morphology varies greatly across images. These characteristics continue to pose challenges to segmentation precision and robustness. To address these issues, we propose MM-UNet, a novel architecture tailored for efficient retinal vessel segmentation. The model incorporates Morph Mamba Convolution layers, which replace pointwise convolutions to enhance branching topological perception through morph, state-aware feature sampling. Additionally, Reverse Selective State Guidance modules integrate reverse guidance theory with state-space modeling to improve geometric boundary awareness and decoding efficiency. Extensive experiments conducted on two public retinal vessel segmentation datasets demonstrate the superior segmentation accuracy of the proposed method. Compared with existing approaches, MM-UNet achieves F1-score gains of 1.64% on DRIVE and 1.25% on STARE, demonstrating its effectiveness and advancement. The project code is publicly available at https://github.com/liujiawen-jpg/MM-UNet.
ExplicitLM: Decoupling Knowledge from Parameters via Explicit Memory Banks
Yu, Chengzhang, Lu, Zening, Zheng, Chenyang, Wang, Chiyue, Zhang, Yiming, Jin, Zhanpeng
Large language models (LLMs) universally suffer from knowledge staleness and lack of interpretability due to their implicit knowledge storage paradigm, where information is distributed across network parameters in an entangled, non-addressable manner. This fundamental limitation prevents targeted knowledge updates, verification of stored information, and understanding of model reasoning processes. We propose ExplicitLM, a novel architecture that fundamentally reimagines knowledge storage in language models through an explicit, interpretable memory bank system. Our key innovation introduces a million-scale external memory bank where each entry stores human-readable knowledge as token sequences, enabling direct inspection and modification of the model's knowledge base. To efficiently access this massive repository, we design a differentiable two-stage retrieval mechanism that enables end-to-end training while maintaining discrete knowledge selection, combining efficient coarse-grained filtering with product key decomposition (reducing computational complexity from O(N · |I|) to O(√N · |I|)) and fine-grained similarity matching through Gumbel-Softmax. Drawing inspiration from dual-system cognitive theory, we partition knowledge into frozen explicit facts (20%) and learnable implicit patterns (80%), maintained through an Exponential Moving Average update strategy that ensures training stability.
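The two-stage retrieval described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the key shapes, the `top_k` value, the temperature `tau`, and all function names are assumptions chosen for illustration. The coarse stage scores two small sub-key tables (each of size √N) rather than all N full keys, and the fine stage draws a relaxed one-hot selection over the surviving candidates with Gumbel-Softmax.

```python
import numpy as np

def product_key_candidates(query, row_keys, col_keys, top_k):
    """Coarse stage: score n_rows + n_cols sub-keys instead of all
    n_rows * n_cols full keys, then combine the best of each half."""
    half = query.shape[0] // 2
    q_row, q_col = query[:half], query[half:]
    row_scores = row_keys @ q_row                 # shape (n_rows,)
    col_scores = col_keys @ q_col                 # shape (n_cols,)
    top_rows = np.argsort(row_scores)[-top_k:]
    top_cols = np.argsort(col_scores)[-top_k:]
    # A full key is a (row, col) pair; its score is the sum of both halves.
    pair_scores = row_scores[top_rows][:, None] + col_scores[top_cols][None, :]
    flat_ids = top_rows[:, None] * col_keys.shape[0] + top_cols[None, :]
    return flat_ids.ravel(), pair_scores.ravel()

def gumbel_softmax(logits, tau, rng):
    """Fine stage: relaxed one-hot sample over the candidate scores."""
    u = rng.uniform(low=1e-9, high=1.0, size=logits.shape)
    gumbel_noise = -np.log(-np.log(u))
    z = (logits + gumbel_noise) / tau
    z = z - z.max()                               # numerical stability
    weights = np.exp(z)
    return weights / weights.sum()

rng = np.random.default_rng(0)
dim, n_rows, n_cols = 16, 32, 32                  # 32 * 32 = 1024 memory entries
row_keys = rng.normal(size=(n_rows, dim // 2))
col_keys = rng.normal(size=(n_cols, dim // 2))
query = rng.normal(size=dim)

cand_ids, cand_scores = product_key_candidates(query, row_keys, col_keys, top_k=4)
weights = gumbel_softmax(cand_scores, tau=0.5, rng=rng)
selected_entry = int(cand_ids[np.argmax(weights)])
```

Because a pair score is the sum of two independent halves, the best full key is always among the top-k by top-k candidates, which is why the coarse stage can skip the remaining entries.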
A Confidence-Diversity Framework for Calibrating AI Judgement in Accessible Qualitative Coding Tasks
LLMs enable qualitative coding at large scale, but assessing reliability remains challenging where human experts seldom agree. We investigate confidence-diversity calibration as a quality assessment framework for accessible coding tasks where LLMs already demonstrate strong performance but exhibit overconfidence. Analysing 5,680 coding decisions from eight state-of-the-art LLMs across ten categories, we find that mean self-confidence tracks inter-model agreement closely (Pearson r=0.82). Adding model diversity quantified as normalised Shannon entropy produces a dual signal explaining agreement almost completely (R-squared=0.979), though this high predictive power likely reflects task simplicity for current LLMs. The framework enables a three-tier workflow auto-accepting 35 percent of segments with less than 5 percent error, cutting manual effort by 65 percent. Cross-domain validation confirms transferability (kappa improvements of 0.20 to 0.78). While establishing a methodological foundation for AI judgement calibration, the true potential likely lies in more challenging scenarios where LLMs may demonstrate comparative advantages over human cognitive limitations.
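As a rough illustration of the dual signal described above, the sketch below computes mean self-confidence and normalised Shannon entropy for one coded segment and routes it into one of three tiers. The thresholds, the tier names, and the choice to normalise by the logarithm of the number of categories are assumptions made for this example; the abstract does not specify them.

```python
import math
from collections import Counter

def normalised_entropy(labels, n_categories):
    """Shannon entropy of the label distribution across models,
    divided by log(n_categories) so the result lies in [0, 1]."""
    total = len(labels)
    entropy = 0.0
    for count in Counter(labels).values():
        p = count / total
        entropy -= p * math.log(p)
    return entropy / math.log(n_categories)

def route_segment(confidences, labels, n_categories,
                  conf_hi=0.9, ent_lo=0.1, ent_hi=0.5):
    """Three-tier routing with hypothetical thresholds."""
    mean_conf = sum(confidences) / len(confidences)
    diversity = normalised_entropy(labels, n_categories)
    if mean_conf >= conf_hi and diversity <= ent_lo:
        return "auto-accept"
    if diversity >= ent_hi:
        return "expert review"
    return "light review"

# Eight models agree and are confident: the segment can be auto-accepted.
print(route_segment([0.95] * 8, ["barrier"] * 8, n_categories=10))
```

High confidence alone is not sufficient in this scheme: a segment is auto-accepted only when the models are both confident and in agreement.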
Accelerate Neural Subspace-Based Reduced-Order Solver of Deformable Simulation by Lipschitz Optimization
Lyu, Aoran, Zhao, Shixian, Xian, Chuhua, Cen, Zhihao, Cai, Hongmin, Fang, Guoxin
Reduced-order simulation is an emerging method for accelerating physical simulations with high DOFs, and recently developed neural-network-based methods with nonlinear subspaces have been proven effective in diverse applications as more concise subspaces can be detected. However, the complexity and landscape of simulation objectives within the subspace have not been optimized, which leaves room for enhancement of the convergence speed. This work focuses on this point by proposing a general method for finding optimized subspace mappings, enabling further acceleration of neural reduced-order simulations while capturing comprehensive representations of the configuration manifolds. We achieve this by optimizing the Lipschitz energy of the elasticity term in the simulation objective, and incorporating the cubature approximation into the training process to manage the high memory and time demands associated with optimizing the newly introduced energy. Our method is versatile and applicable to both supervised and unsupervised settings for optimizing the parameterizations of the configuration manifolds. We demonstrate the effectiveness of our approach through general cases in both quasi-static and dynamics simulations. Our method achieves acceleration factors of up to 6.83 while consistently preserving comparable simulation accuracy in various cases, including large twisting, bending, and rotational deformations with collision handling. This novel approach offers significant potential for accelerating physical simulations, and can be a good add-on to existing neural-network-based solutions in modeling complex deformable objects.
Zero-shot Explainable Mental Health Analysis on Social Media by Incorporating Mental Scales
Li, Wenyu, Zhu, Yinuo, Lin, Xin, Li, Ming, Jiang, Ziyue, Zeng, Ziqian
Traditional discriminative approaches in mental health analysis are known for their strong capacity but lack interpretability and demand large-scale annotated data. Generative approaches, such as those based on large language models (LLMs), have the potential to remove the need for heavy annotation and to provide explanations, but their capabilities still fall short of discriminative approaches, and their explanations may be unreliable because explanation generation is a black-box process. Inspired by the psychological assessment practice of using scales to evaluate mental states, our method, called Mental Analysis by Incorporating Mental Scales (MAIMS), incorporates two procedures via LLMs. First, the patient completes mental scales; second, the psychologist interprets the collected information from the mental scales and makes informed decisions. Experimental results show that MAIMS outperforms other zero-shot methods. MAIMS can generate more rigorous explanations based on the outputs of mental scales.
Multi-Mobile Robot Localization and Navigation based on Visible Light Positioning
Chen, Yanyi, Zhong, Zhiqing, Wen, Shangsheng, Guan, Weipeng
Abstract: We demonstrate multi-mobile robot navigation based on Visible Light Positioning (VLP) localization. In our experiment, VLP accurately locates the robots' positions during navigation.
1. Overview
With the rapid development of robot technology, robotic control is promising in both research and commerce. Mobile robots are also widely used in a variety of environments. Especially in automated warehouses, robots have the advantage of being cheaper and more efficient than humans for simple and repetitive tasks. Robotic positioning, however, is a fundamental capability that a multi-mobile robot system requires. Outdoors, the global positioning system (GPS) sometimes gives attenuated location signals affected by reflections and penetration [1].
A Cooperative Positioning Framework for Robot and Smart Phone Based on Visible Light Communication
Chen, Junye, Li, Fangdi, An, Futong, Yang, Chen, Song, Hongzhan, Wen, Shangsheng, Guan, Weipeng
Abstract: A cooperative positioning framework for humans and robots based on visible light communication (VLC) is proposed. Using an experimental system, we demonstrate that it is feasible and achieves high accuracy and real-time performance. Because walls block signal transmission, the performance of traditional positioning technologies such as the Global Positioning System (GPS) degrades significantly indoors, with positioning errors so large that the result may be unusable. Visible light positioning (VLP) systems use a Microcontroller Unit (MCU) to compile the identifier (ID)-position information into a modulated signal and adopt on-off keying (OOK) to modulate the LEDs, transmitting information by changing the on-off state of the LEDs. At the receiving terminal, the rolling shutter effect of the CMOS sensor captures the on-off state of the LED in the form of stripes, so that the data rate of optical camera communication (OCC) can be boosted to a level much higher than the video frame rate.
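The OOK and rolling-shutter mechanism described in this abstract can be sketched in an idealised form. The example below is an illustration only, not the authors' system: it assumes perfect synchronisation, no noise, a fixed number of sensor rows per bit, and hypothetical function names. It shows why one camera frame can carry many bits: each sensor row samples the LED at a slightly different time, so consecutive bits appear as bright and dark stripes within a single image.

```python
def ook_encode(led_id, n_bits):
    """Map an LED identifier to on/off states, most significant bit first."""
    return [(led_id >> shift) & 1 for shift in range(n_bits - 1, -1, -1)]

def rolling_shutter_rows(bits, rows_per_bit):
    """Simulate one image column: each bit period spans several rows,
    so an 'on' bit yields a bright stripe and an 'off' bit a dark one."""
    return [255 if bit else 0 for bit in bits for _ in range(rows_per_bit)]

def decode_rows(rows, rows_per_bit, threshold=128):
    """Recover the identifier by averaging each stripe and thresholding."""
    value = 0
    for start in range(0, len(rows), rows_per_bit):
        stripe = rows[start:start + rows_per_bit]
        bit = 1 if sum(stripe) / len(stripe) > threshold else 0
        value = (value << 1) | bit
    return value

led_id = 0b10110010                                  # hypothetical 8-bit LED ID
rows = rolling_shutter_rows(ook_encode(led_id, 8), rows_per_bit=6)
recovered_id = decode_rows(rows, rows_per_bit=6)     # equals led_id
```

A practical receiver would also need a frame header for synchronisation and some tolerance to stripe-width variation, which this sketch leaves out.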
Research on Metro Service Quality Improvement Schemes Considering Feasibility
Chen, Weiya, Li, Jiajia, Kang, Zixuan
It is an important management task of metro agencies to formulate reasonable improvement schemes based on the result of service quality surveys. Considering scores, weights, and improvement feasibility of service quality attributes in a certain period, this paper integrates Decision Tree (DT) into Importance-Performance Analysis (IPA) to build a DT-IPA model, which is used to determine the improvement priority of attributes, and to quantify the improvement degree. If-then rules extracted from the optimal decision tree and the improvement feasibility computed by analytic hierarchy process are two main items derived from the DT-IPA model. They are used to optimize the initial improvement priority of attributes determined by IPA and to quantify the degree of improvement of the adjusted attributes. Then, the overall service quality can reach a high score, realizing the operation goal. The effectiveness of the DT-IPA model was verified through an empirical study conducted on Changsha Metro, China. The proposed method can be a decision-making tool for metro agency managers to improve the quality of metro service.
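For readers unfamiliar with the starting point of this model, the sketch below shows classic Importance-Performance Analysis only, i.e. the initial priority that the abstract says is later adjusted by decision-tree rules and feasibility scores. It does not implement DT-IPA. The attribute names and numbers are invented for illustration, and using the mean of each axis as the quadrant cut-off is one common convention assumed here.

```python
def ipa_quadrant(importance, performance, imp_cut, perf_cut):
    """Classic IPA grid: compare each attribute with the two cut-offs."""
    if importance >= imp_cut and performance < perf_cut:
        return "Concentrate here"
    if importance >= imp_cut:
        return "Keep up the good work"
    if performance < perf_cut:
        return "Low priority"
    return "Possible overkill"

# Hypothetical survey results: attribute -> (importance weight, satisfaction score)
attributes = {
    "punctuality": (0.30, 3.2),
    "cleanliness": (0.15, 4.5),
    "ticketing": (0.10, 3.0),
    "safety": (0.45, 4.6),
}

imp_cut = sum(imp for imp, _ in attributes.values()) / len(attributes)
perf_cut = sum(perf for _, perf in attributes.values()) / len(attributes)

quadrants = {
    name: ipa_quadrant(imp, perf, imp_cut, perf_cut)
    for name, (imp, perf) in attributes.items()
}
```

Attributes that fall into "Concentrate here" are important but score poorly, so they receive the highest initial improvement priority.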
Priority prediction of Asian Hornet sighting report using machine learning methods
Liu, Yixin, Guo, Jiaxin, Dong, Jieyang, Jiang, Luoqian, Ouyang, Haoyuan
As an infamous invader of the North American ecosystem, the Asian giant hornet (Vespa mandarinia) is devastating not only to native bee colonies but also to local apiculture. One of the most effective ways to combat this harmful species is to locate and destroy its nests. By mobilizing the public to actively report possible sightings of the Asian giant hornet, the government could promptly send inspectors to confirm and possibly destroy the nests. However, such confirmation requires lab expertise, and manually checking the reports one by one consumes extensive human resources. Furthermore, given the public's limited knowledge of the Asian giant hornet and the randomness of report submission, only a few of the numerous reports prove positive, i.e., correspond to existing nests. How to classify or prioritize the reports efficiently and automatically, so as to determine the dispatch of personnel, is of great significance to the control of the Asian giant hornet. In this paper, we propose a method to predict the priority of sighting reports based on machine learning. We model the problem of optimal prioritization of sighting reports as a problem of classification and prediction. We extract a variety of rich features from each report: location, time, image(s), and textual description. Based on these features, we propose a classification model based on logistic regression to predict the credibility of a given report. Furthermore, our model quantifies the impact between reports to obtain the priority ranking of the reports. Extensive experiments on the public dataset from the WSDA (the Washington State Department of Agriculture) demonstrate the effectiveness of our method.
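The core idea of scoring reports with logistic regression and ranking them by predicted credibility can be sketched as follows. This is a minimal illustration, not the authors' model: the two features, the toy data, and the plain gradient-descent training loop are assumptions, and the inter-report impact term mentioned in the abstract is omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_intercept(X):
    return np.hstack([np.ones((X.shape[0], 1)), X])

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Batch gradient descent on the mean logistic loss."""
    Xb = add_intercept(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ w)
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def credibility(X, w):
    """Predicted probability that a report is a true sighting."""
    return sigmoid(add_intercept(X) @ w)

# Hypothetical features per report:
# [scaled distance to nearest confirmed sighting, 1 if an image is attached]
X = np.array([[0.10, 1], [0.20, 1], [0.90, 0],
              [0.80, 1], [0.15, 0], [0.95, 0]], dtype=float)
y = np.array([1, 1, 0, 0, 1, 0], dtype=float)   # 1 = confirmed positive

w = fit_logistic(X, y)
scores = credibility(X, w)
priority = np.argsort(-scores)                  # report indices, most credible first
```

Inspectors would then work through `priority` from the front, so that the reports most likely to be genuine are checked first.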
A nomogram based on CT deep learning signature
Correspondence: Xianyue Quan, Department of Radiology, Zhujiang Hospital, Southern Medical University, Guangzhou, 510282, People's Republic of China. Tel/Fax 86-2061643114. Email quanxianyue2014@163.com
Purpose: To develop and further validate a deep learning signature-based nomogram from computed tomography (CT) images for prediction of the overall survival (OS) in resected non-small cell lung cancer (NSCLC) patients. Patients and Methods: A total of 1792 deep learning features were extracted from non-enhanced and venous-phase CT images for each NSCLC patient in the training cohort (n = 231). Then, a deep learning signature was built with the least absolute shrinkage and selection operator (LASSO) Cox regression model for OS estimation. Finally, a nomogram was constructed with the signature and other independent clinical risk factors. The performance of the nomogram was assessed by discrimination, calibration, and clinical usefulness.