BrainRotViT: Transformer-ResNet Hybrid for Explainable Modeling of Brain Aging from 3D sMRI

Jalal, Wasif, Rahman, Md Nafiu, Rahman, Atif Hasan, Rahman, M. Sohel

arXiv.org Artificial Intelligence

The human brain undergoes continuous transformations across the lifespan, representing a natural component of aging that does not inherently signal pathological conditions [1]. Neurodegenerative disorders such as dementia can compromise brain structure and accelerate aging processes. Understanding and characterizing healthy brain aging patterns therefore becomes essential for distinguishing normal aging from pathological neurodegeneration, potentially enabling earlier detection of neurodegenerative diseases. The Brain Age-Gap (BAG), i.e. the discrepancy between predicted brain age and chronological age, has emerged as a robust biomarker that captures pathological brain processes and offers insights into the rate at which an individual's brain ages in comparison to others in the population [2, 3]. It is not only associated with various neurological disorders, such as Alzheimer's disease, cognitive impairment, and Autism Spectrum Disorder, but also serves as an indicator of all-cause mortality [4, 5, 6, 7, 8]. Brain age estimation has been approached through both conventional and machine learning techniques, analyzing either the whole brain, specific regions, or localized patches [9, 10, 11]. One particular study presented a method using T1-weighted MRI to predict age through region-level and voxel-level metrics [12]. Regression-based machine learning has shown promise for brain age prediction, with kernel regression applied to whole-brain MRI across diverse age ranges [13]. Various algorithms including Support Vector Regression and Binary Decision Trees have been compared for their brain age prediction capabilities [14]. Additional regression techniques such as Relevance Vector Regression, Twin Support Vector Regression, and Gaussian Process Regression have been explored across different imaging modalities for age estimation and mortality prediction [11, 15, 16, 17].
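The Brain Age-Gap defined above is a simple difference, which the following minimal sketch makes concrete. The function names and the example ages are hypothetical and chosen only for illustration; nothing here reflects the BrainRotViT model itself.

```python
def brain_age_gap(predicted_age, chronological_age):
    """Brain Age-Gap (BAG): predicted brain age minus chronological age.

    A positive value means the brain looks older than the person is.
    """
    return predicted_age - chronological_age


def mean_absolute_error(predicted_ages, chronological_ages):
    """Average absolute gap over a cohort, a common way to score an age predictor."""
    gaps = [abs(brain_age_gap(p, c)) for p, c in zip(predicted_ages, chronological_ages)]
    return sum(gaps) / len(gaps)


# Hypothetical subject: predicted brain age 72.5 years, chronological age 68 years.
example_gap = brain_age_gap(72.5, 68.0)
```

The sign convention (predicted minus chronological) follows the wording of the abstract; some studies report the gap with the opposite sign.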


TESSERA: Temporal Embeddings of Surface Spectra for Earth Representation and Analysis

Feng, Zhengpeng, Atzberger, Clement, Jaffer, Sadiq, Knezevic, Jovana, Sormunen, Silja, Young, Robin, Lisaius, Madeline C., Immitzer, Markus, Jackson, Toby, Ball, James, Coomes, David A., Madhavapeddy, Anil, Blake, Andrew, Keshav, Srinivasan

arXiv.org Artificial Intelligence

Satellite Earth-observation (EO) time series in the optical and microwave ranges of the electromagnetic spectrum are often irregular due to orbital patterns and cloud obstruction. Compositing addresses these issues but loses information with respect to vegetation phenology, which is critical for many downstream tasks. Instead, we present TESSERA, a pixel-wise foundation model for multi-modal (Sentinel-1/2) EO time series that learns robust, label-efficient embeddings. During model training, TESSERA uses Barlow Twins and sparse random temporal sampling to enforce invariance to the selection of valid observations. We employ two key regularizers: global shuffling to decorrelate spatial neighborhoods and mix-based regulation to improve invariance under extreme sparsity. We find that for diverse classification, segmentation, and regression tasks, TESSERA embeddings deliver state-of-the-art accuracy with high label efficiency, often requiring only a small task head and minimal computation. To democratize access, adhere to FAIR principles, and simplify use, we release global, annual, 10m, pixel-wise int8 embeddings together with open weights/code and lightweight adaptation heads, thus providing practical tooling for large-scale retrieval and inference at planetary scale. The model training/inference code, downstream task code, and pre-generated embeddings can be accessed at https://github.com/ucam-eo.
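For readers unfamiliar with Barlow Twins, the sketch below computes the standard Barlow Twins objective on two batches of embeddings in plain Python. It is an illustration under assumptions, not TESSERA's training code: the two "views" are assumed to come from two different random temporal subsamples of the same pixels, and the function names and the `off_diag_weight` value of 0.005 are illustrative choices.

```python
import math


def standardize_columns(Z):
    """Centre and scale each embedding dimension across the batch."""
    n = len(Z)
    out_cols = []
    for col in zip(*Z):
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        std = std if std > 0 else 1.0  # guard against constant dimensions
        out_cols.append([(v - mean) / std for v in col])
    return [list(row) for row in zip(*out_cols)]


def barlow_twins_loss(Z1, Z2, off_diag_weight=0.005):
    """Standard Barlow Twins loss on two views of the same batch.

    The cross-correlation matrix C between the two views is pushed towards
    the identity: diagonal terms towards 1 (invariance), off-diagonal terms
    towards 0 (redundancy reduction).
    """
    n = len(Z1)
    A, B = standardize_columns(Z1), standardize_columns(Z2)
    d = len(A[0])
    loss = 0.0
    for i in range(d):
        for j in range(d):
            c_ij = sum(A[k][i] * B[k][j] for k in range(n)) / n
            loss += (1.0 - c_ij) ** 2 if i == j else off_diag_weight * c_ij ** 2
    return loss


# Toy batch of four 2-D embeddings whose dimensions are uncorrelated.
view = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
loss_identical_views = barlow_twins_loss(view, view)
```

When both views yield the same decorrelated embeddings, as in the toy batch, the loss vanishes; this is the sense in which the objective rewards invariance to which observations were sampled.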


Funabot-Upper: McKibben Actuated Haptic Suit Inducing Kinesthetic Perceptions in Trunk, Shoulder, Elbow, and Wrist

Fukatsu, Haru, Yasuda, Ryoji, Funabora, Yuki, Doki, Shinji

arXiv.org Artificial Intelligence

This paper presents Funabot-Upper, a wearable haptic suit that enables users to perceive 14 upper-body motions, including those of the trunk, shoulder, elbow, and wrist. Inducing kinesthetic perception through wearable haptic devices has attracted attention, and various devices have been developed in the past. However, these have been limited to verifications on single body parts, and few have applied the same method to multiple body parts. In our previous study, we developed a technology that uses the contraction of artificial muscles to deform clothing in three dimensions. Using this technology, we developed a haptic suit that induces kinesthetic perception of 7 motions in multiple upper-body parts. However, perceptual mixing caused by stimulating multiple human muscles occurred between the shoulder and the elbow. In this paper, we established a new, simplified design policy and developed a novel haptic suit that induces kinesthetic perceptions in the trunk, shoulder, elbow, and wrist by stimulating joints and muscles independently. We experimentally demonstrated the induced kinesthetic perception and examined the relationship between stimulation and the resulting kinesthetic perception under the new design policy. Experiments confirmed that Funabot-Upper successfully induces kinesthetic perception across multiple joints while reducing the perceptual mixing observed in previous designs. The new suit improved recognition accuracy from 68.8% to 94.6% compared to the previous Funabot-Suit, demonstrating its superiority and potential for future haptic applications.


Benchmarking Multi-Step Legal Reasoning and Analyzing Chain-of-Thought Effects in Large Language Models

Yu, Wenhan, Lin, Xinbo, Ni, Lanxin, Cheng, Jinhua, Sha, Lei

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated strong reasoning abilities across specialized domains, motivating research into their application to legal reasoning. However, existing legal benchmarks often conflate factual recall with genuine inference, fragment the reasoning process, and overlook the quality of reasoning. To address these limitations, we introduce MSLR, the first Chinese multi-step legal reasoning dataset grounded in real-world judicial decision making. MSLR adopts the IRAC framework (Issue, Rule, Application, Conclusion) to model structured expert reasoning from official legal documents. In addition, we design a scalable Human-LLM collaborative annotation pipeline that efficiently produces fine-grained step-level reasoning annotations and provides a reusable methodological framework for multi-step reasoning datasets. Evaluation of multiple LLMs on MSLR shows only moderate performance, highlighting the challenges of adapting to complex legal reasoning. Further experiments demonstrate that Self-Initiated Chain-of-Thought prompts generated by models autonomously improve reasoning coherence and quality, outperforming human-designed prompts. MSLR contributes to advancing LLM reasoning and Chain-of-Thought strategies and offers open resources for future research. The dataset and code are available at https://github.com/yuwenhan07/MSLR-Bench and https://law.sjtu.edu.cn/flszyjzx/index.html.


Evolution Strategies at the Hyperscale

Sarkar, Bidipta, Fellows, Mattie, Duque, Juan Agustin, Letcher, Alistair, Villares, Antonio León, Sims, Anya, Cope, Dylan, Liesen, Jarek, Seier, Lukas, Wolf, Theo, Berdica, Uljad, Goldie, Alexander David, Courville, Aaron, Sevegnani, Karin, Whiteson, Shimon, Foerster, Jakob Nicolaus

arXiv.org Artificial Intelligence

We introduce Evolution Guided General Optimization via Low-rank Learning (EGGROLL), an evolution strategies (ES) algorithm designed to scale backprop-free optimization to large population sizes for modern large neural network architectures with billions of parameters. ES is a set of powerful blackbox optimisation methods that can handle non-differentiable or noisy objectives with excellent scaling potential through parallelisation. Naïve ES becomes prohibitively expensive at scale due to the computational and memory costs associated with generating matrix perturbations $E\in\mathbb{R}^{m\times n}$ and the batched matrix multiplications needed to compute per-member forward passes. EGGROLL overcomes these bottlenecks by generating random matrices $A\in \mathbb{R}^{m\times r},\ B\in \mathbb{R}^{n\times r}$ with $r\ll \min(m,n)$ to form a low-rank matrix perturbation $A B^\top$ that is used in place of the full-rank perturbation $E$. As the overall update is an average across a population of $N$ workers, this still results in a high-rank update but with significant memory and computation savings, reducing the auxiliary storage from $mn$ to $r(m+n)$ per layer and the cost of a forward pass from $\mathcal{O}(mn)$ to $\mathcal{O}(r(m+n))$ when compared to full-rank ES. A theoretical analysis reveals our low-rank update converges to the full-rank update at a fast $\mathcal{O}\left(\frac{1}{r}\right)$ rate. Our experiments show that (1) EGGROLL does not compromise the performance of ES in tabula-rasa RL settings, despite being faster, (2) it is competitive with GRPO as a technique for improving LLM reasoning, and (3) EGGROLL enables stable pre-training of nonlinear recurrent language models that operate purely in integer datatypes.
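The storage and compute saving from the low-rank perturbation $A B^\top$ can be seen in a small plain-Python sketch. It is only an illustration of the identity $(W + \sigma A B^\top)x = Wx + \sigma A(B^\top x)$ under assumed shapes, not the authors' implementation; the function names, the scale `sigma`, and the toy dimensions are hypothetical.

```python
import random


def matvec(M, v):
    """Return M @ v for a matrix stored as a list of rows."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def gaussian_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def forward_low_rank(W, A, B, x, sigma):
    """Apply (W + sigma * A B^T) to x without ever forming A B^T."""
    base = matvec(W, x)              # W x: does not depend on the perturbation
    proj = matvec(transpose(B), x)   # B^T x: r numbers, O(r n) work
    lift = matvec(A, proj)           # A (B^T x): m numbers, O(r m) work
    return [b + sigma * l for b, l in zip(base, lift)]


def forward_full_rank(W, A, B, x, sigma):
    """Reference path: materialise E = A B^T (m x n), then apply W + sigma * E."""
    E = [[sum(a_ik * b_jk for a_ik, b_jk in zip(a_row, b_row)) for b_row in B]
         for a_row in A]
    perturbed = [[w + sigma * e for w, e in zip(w_row, e_row)]
                 for w_row, e_row in zip(W, E)]
    return matvec(perturbed, x)


rng = random.Random(0)
m, n, r, sigma = 8, 6, 2, 0.1
W = gaussian_matrix(m, n, rng)
A = gaussian_matrix(m, r, rng)   # A in R^{m x r}
B = gaussian_matrix(n, r, rng)   # B in R^{n x r}
x = [rng.gauss(0.0, 1.0) for _ in range(n)]

y_low = forward_low_rank(W, A, B, x, sigma)
y_full = forward_full_rank(W, A, B, x, sigma)
low_rank_storage = r * (m + n)   # numbers stored for A and B
full_rank_storage = m * n        # numbers stored for E
```

Both paths give the same output up to floating-point rounding, while the low-rank path stores $r(m+n)$ perturbation entries instead of $mn$.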


SurvAgent: Hierarchical CoT-Enhanced Case Banking and Dichotomy-Based Multi-Agent System for Multimodal Survival Prediction

Huang, Guolin, Chen, Wenting, Yang, Jiaqi, Lyu, Xinheng, Luo, Xiaoling, Yang, Sen, Xing, Xiaohan, Shen, Linlin

arXiv.org Artificial Intelligence

Survival analysis is critical for cancer prognosis and treatment planning, yet existing methods lack the transparency essential for clinical adoption. While recent pathology agents have demonstrated explainability in diagnostic tasks, they face three limitations for survival prediction: inability to integrate multimodal data, ineffective region-of-interest exploration, and failure to leverage experiential learning from historical cases. We introduce SurvAgent, the first hierarchical chain-of-thought (CoT)-enhanced multi-agent system for multimodal survival prediction. SurvAgent consists of two stages: (1) WSI-Gene CoT-Enhanced Case Bank Construction employs hierarchical analysis through Low-Magnification Screening, Cross-Modal Similarity-Aware Patch Mining, and Confidence-Aware Patch Mining for pathology images, while Gene-Stratified analysis processes six functional gene categories. Both generate structured reports with CoT reasoning, storing complete analytical processes for experiential learning. (2) Dichotomy-Based Multi-Expert Agent Inference retrieves similar cases via RAG and integrates multimodal reports with expert predictions through progressive interval refinement. Extensive experiments on five TCGA cohorts demonstrate SurvAgent's superiority over conventional methods, proprietary MLLMs, and medical agents, establishing a new paradigm for explainable AI-driven survival prediction in precision oncology.


MedBayes-Lite: Bayesian Uncertainty Quantification for Safe Clinical Decision Support

Hossain, Elias, Nipu, Md Mehedi Hasan, Sheikh, Maleeha, Rana, Rajib, Neupane, Subash, Yousefi, Niloofar

arXiv.org Artificial Intelligence

We propose MedBayes-Lite, a lightweight Bayesian enhancement for transformer-based clinical language models designed to produce reliable, uncertainty-aware predictions. Although transformers show strong potential for clinical decision support, they remain prone to overconfidence, especially in ambiguous medical cases where calibrated uncertainty is critical. MedBayes-Lite embeds uncertainty quantification directly into existing transformer pipelines without any retraining or architectural rewiring, adding no new trainable layers and keeping parameter overhead under 3 percent. The framework integrates three components: (i) Bayesian Embedding Calibration using Monte Carlo dropout for epistemic uncertainty, (ii) Uncertainty-Weighted Attention that marginalizes over token reliability, and (iii) Confidence-Guided Decision Shaping inspired by clinical risk minimization. Across biomedical QA and clinical prediction benchmarks (MedQA, PubMedQA, MIMIC-III), MedBayes-Lite consistently improves calibration and trustworthiness, reducing overconfidence by 32 to 48 percent. In simulated clinical settings, it can prevent up to 41 percent of diagnostic errors by flagging uncertain predictions for human review. These results demonstrate its effectiveness in enabling reliable uncertainty propagation and improving interpretability in medical AI systems.
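Monte Carlo dropout, named above as the source of epistemic uncertainty, can be illustrated with a toy model in plain Python. This is a minimal sketch under assumptions, not MedBayes-Lite itself: the logistic scorer, the weights, the dropout rate, the number of passes, and the review threshold are all hypothetical.

```python
import math
import random


def stochastic_forward(weights, features, drop_prob, rng):
    """One forward pass of a toy logistic model with dropout left on."""
    keep = 1.0 - drop_prob
    z = 0.0
    for w, f in zip(weights, features):
        if rng.random() < keep:
            z += w * f / keep  # inverted dropout keeps the expected input scale
    return 1.0 / (1.0 + math.exp(-z))


def mc_dropout_predict(weights, features, drop_prob=0.2, passes=200, seed=0):
    """Mean and variance of the prediction over repeated stochastic passes."""
    rng = random.Random(seed)
    samples = [stochastic_forward(weights, features, drop_prob, rng)
               for _ in range(passes)]
    mean = sum(samples) / passes
    variance = sum((s - mean) ** 2 for s in samples) / passes
    return mean, variance


def needs_human_review(variance, threshold=0.01):
    """Flag a prediction whose spread across passes exceeds a chosen threshold."""
    return variance > threshold


weights = [1.5, -2.0, 0.7]    # hypothetical model weights
features = [0.8, 0.3, 1.2]    # hypothetical input features
mean_prob, var_prob = mc_dropout_predict(weights, features)
flagged = needs_human_review(var_prob)
```

The variance across passes serves as the uncertainty signal; with dropout disabled every pass is identical and the variance collapses to zero, which is why a deterministic forward pass alone cannot expose this kind of uncertainty.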


LLMDistill4Ads: Using Cross-Encoders to Distill from LLM Signals for Advertiser Keyphrase Recommendations

Dey, Soumik, Braun, Benjamin, Ravipati, Naveen, Wu, Hansi, Li, Binbin

arXiv.org Artificial Intelligence

E-commerce sellers are advised to bid on keyphrases to boost their advertising campaigns. These keyphrases must be relevant to prevent irrelevant items from cluttering search systems and to maintain positive seller perception. It is vital that keyphrase suggestions align with seller, search, and buyer judgments. Given the challenges in collecting negative feedback in these systems, LLMs have been used as a scalable proxy for human judgments. This paper presents an empirical study, conducted on a major e-commerce platform, of a distillation framework involving an LLM teacher, a cross-encoder assistant, and a bi-encoder Embedding Based Retrieval (EBR) student model, aimed at mitigating click-induced biases in keyphrase recommendations.


Finetuning LLMs for Automatic Form Interaction on Web-Browser in Selenium Testing Framework

Le, Nguyen-Khang, Nguyen, Hiep, Nguyen, Ngoc-Minh, Luu, Son T., Vo, Trung, Bui, Quan Minh, Nomura, Shoshin, Nguyen, Le-Minh

arXiv.org Artificial Intelligence

Automated web application testing is a critical component of modern software development, with frameworks like Selenium widely adopted for validating functionality through browser automation. Among the essential aspects of such testing is the ability to interact with and validate web forms, a task that requires syntactically correct, executable scripts with high coverage of input fields. Despite its importance, this task remains underexplored in the context of large language models (LLMs), and no public benchmark or dataset exists to evaluate LLMs on form interaction generation systematically. This paper introduces a novel method for training LLMs to generate high-quality test cases in Selenium, specifically targeting form interaction testing. We curate both synthetic and human-annotated datasets for training and evaluation, covering diverse real-world forms and testing scenarios. We define clear metrics for syntax correctness, script executability, and input field coverage. Our empirical study demonstrates that our approach significantly outperforms strong baselines, including GPT-4o and other popular LLMs, across all evaluation metrics. Our work lays the groundwork for future research on LLM-based web testing and provides resources to support ongoing progress in this area.


Do Not Merge My Model! Safeguarding Open-Source LLMs Against Unauthorized Model Merging

Li, Qinfeng, Pan, Miao, Chen, Jintao, Teng, Fu, Shen, Zhiqiang, Su, Ge, Peng, Hao, Zhang, Xuhong

arXiv.org Artificial Intelligence

Model merging has emerged as an efficient technique for expanding large language models (LLMs) by integrating specialized expert models. However, it also introduces a new threat: model merging stealing, where free-riders exploit models through unauthorized model merging. Unfortunately, existing defense mechanisms fail to provide effective protection. Specifically, we identify three critical protection properties that existing methods fail to simultaneously satisfy: (1) proactively preventing unauthorized merging; (2) ensuring compatibility with general open-source settings; (3) achieving high security with negligible performance loss. To address the above issues, we propose MergeBarrier, a plug-and-play defense that proactively prevents unauthorized merging. The core design of MergeBarrier is to disrupt the Linear Mode Connectivity (LMC) between the protected model and its homologous counterparts, thereby eliminating the low-loss path required for effective model merging. Extensive experiments show that MergeBarrier effectively prevents model merging stealing with negligible accuracy loss.
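Linear Mode Connectivity is usually checked by evaluating the loss along the straight line between two parameter vectors. The plain-Python sketch below measures the resulting loss barrier on a toy double-well loss. It illustrates the concept only and is not MergeBarrier's procedure; the barrier definition used here (maximum excess of the path loss over the interpolated endpoint losses), the toy loss, and all names are assumptions.

```python
def interpolate(theta_a, theta_b, alpha):
    """Point on the straight line between two parameter vectors."""
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(theta_a, theta_b)]


def loss_barrier(loss_fn, theta_a, theta_b, steps=10):
    """Largest excess of the path loss over the interpolated endpoint losses.

    A barrier near zero means the two models are linearly mode connected,
    which is the condition under which weight averaging tends to work.
    """
    loss_a, loss_b = loss_fn(theta_a), loss_fn(theta_b)
    barrier = 0.0
    for i in range(steps + 1):
        alpha = i / steps
        path_loss = loss_fn(interpolate(theta_a, theta_b, alpha))
        baseline = (1.0 - alpha) * loss_a + alpha * loss_b
        barrier = max(barrier, path_loss - baseline)
    return barrier


def double_well_loss(theta):
    """Toy loss with minima at +1 and -1 in every coordinate."""
    return sum((t * t - 1.0) ** 2 for t in theta)


# Two minima in different basins: averaging their weights lands on a loss peak.
barrier_between_basins = loss_barrier(double_well_loss, [-1.0], [1.0])
# Two identical models: the path never leaves the minimum.
barrier_same_basin = loss_barrier(double_well_loss, [1.0], [1.0])
```

In this toy setting the midpoint of the two minima is the weight average, and a high barrier there means the merged parameters perform poorly, which is the effect a defense that disrupts LMC aims to produce.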