Goto

Collaborating Authors

 Performance Analysis


Oxytrees: Model Trees for Bipartite Learning

arXiv.org Artificial Intelligence

Bipartite learning is a machine learning task that aims to predict interactions between pairs of instances. It has been applied to various domains, including drug-target interactions, RNA-disease associations, and regulatory network inference. Despite being widely investigated, current methods still present drawbacks, as they are often designed for a specific application and thus do not generalize to other problems or present scalability issues. To address these challenges, we propose Oxytrees: proxy-based biclustering model trees. Oxytrees compress the interaction matrix into row- and column-wise proxy matrices, significantly reducing training time without compromising predictive performance. We also propose a new leaf-assignment algorithm that significantly reduces the time taken for prediction. Finally, Oxytrees employ linear models using the Kronecker product kernel in their leaves, resulting in shallower trees and thus even faster training. Using 15 datasets, we compared the predictive performance of ensembles of Oxytrees with that of the current state-of-the-art. We achieved up to 30-fold improvement in training times compared to state-of-the-art biclustering forests, while demonstrating competitive or superior performance in most evaluation settings, particularly in the inductive setting. Finally, we provide an intuitive Python API to access all datasets, methods and evaluation measures used in this work, thus enabling reproducible research in this field.


Evolve the Method, Not the Prompts: Evolutionary Synthesis of Jailbreak Attacks on LLMs

arXiv.org Artificial Intelligence

Automated red teaming frameworks for Large Language Models (LLMs) have become increasingly sophisticated, yet they share a fundamental limitation: their jailbreak logic is confined to selecting, combining, or refining pre-existing attack strategies. This binds their creativity and leaves them unable to autonomously invent entirely new attack mechanisms. To overcome this gap, we introduce \textbf{EvoSynth}, an autonomous framework that shifts the paradigm from attack planning to the evolutionary synthesis of jailbreak methods. Instead of refining prompts, EvoSynth employs a multi-agent system to autonomously engineer, evolve, and execute novel, code-based attack algorithms. Crucially, it features a code-level self-correction loop, allowing it to iteratively rewrite its own attack logic in response to failure. Through extensive experiments, we demonstrate that EvoSynth not only establishes a new state-of-the-art by achieving an 85.5\% Attack Success Rate (ASR) against highly robust models like Claude-Sonnet-4.5, but also generates attacks that are significantly more diverse than those from existing methods. We release our framework to facilitate future research in this new direction of evolutionary synthesis of jailbreak methods. Code is available at: https://github.com/dongdongunique/EvoSynth.


R$^{2}$Seg: Training-Free OOD Medical Tumor Segmentation via Anatomical Reasoning and Statistical Rejection

arXiv.org Artificial Intelligence

Foundation models for medical image segmentation struggle under out-of-distribution (OOD) shifts, often producing fragmented false positives on OOD tumors. We introduce R$^{2}$Seg, a training-free framework for robust OOD tumor segmentation that operates via a two-stage Reason-and-Reject process. First, the Reason step employs an LLM-guided anatomical reasoning planner to localize organ anchors and generate multi-scale ROIs. Second, the Reject step applies two-sample statistical testing to candidates generated by a frozen foundation model (BiomedParse) within these ROIs. This statistical rejection filter retains only candidates significantly different from normal tissue, effectively suppressing false positives. Our framework requires no parameter updates, making it compatible with zero-update test-time augmentation and avoiding catastrophic forgetting. On multi-center and multi-modal tumor segmentation benchmarks, R$^{2}$Seg substantially improves Dice, specificity, and sensitivity over strong baselines and the original foundation models. Code are available at https://github.com/Eurekashen/R2Seg.


Adaptive Dual-Layer Web Application Firewall (ADL-WAF) Leveraging Machine Learning for Enhanced Anomaly and Threat Detection

arXiv.org Artificial Intelligence

Web Application Firewalls are crucial for protecting web applications against a wide range of cyber threats. Traditional Web Application Firewalls often struggle to effectively distinguish between malicious and legitimate traffic, leading to limited efficacy in threat detection. To overcome these limitations, this paper proposes an Adaptive Dual-Layer WAF employing a two-layered Machine Learning model designed to enhance the accuracy of anomaly and threat detection. The first layer employs a Decision Tree (DT) algorithm to detect anomalies by identifying traffic deviations from established normal patterns. The second layer employs Support Vector Machine to classify these anomalies as either threat anomalies or benign anomalies. Our Adaptive Dual Layer WAF incorporates comprehensive data pre-processing and feature engineering techniques and has been thoroughly evaluated using five large benchmark datasets. Evaluation using these datasets shows that ADL WAF achieves a detection accuracy of 99.88% and a precision of 100%, significantly enhancing anomaly detection and reducing false positives. These findings suggest that integrating machine learning techniques into WAFs can substantially improve web application security by providing more accurate and efficient threat detection.


LLM4SCREENLIT: Recommendations on Assessing the Performance of Large Language Models for Screening Literature in Systematic Reviews

arXiv.org Artificial Intelligence

Context: Large language models (LLMs) are released faster than users' ability to evaluate them rigorously. When LLMs underpin research, such as identifying relevant literature for systematic reviews (SRs), robust empirical assessment is essential. Objective: We identify and discuss key challenges in assessing LLM performance for selecting relevant literature, identify good (evaluation) practices, and propose recommendations. Method: Using a recent large-scale study as an example, we identify problems with the use of traditional metrics for assessing the performance of Gen-AI tools for identifying relevant literature in SRs. We analyzed 27 additional papers investigating this issue, extracted the performance metrics, and found both good practices and widespread problems, especially with the use and reporting of performance metrics for SR screening. Results: Major weaknesses included: i) a failure to use metrics that are robust to imbalanced data and do not directly indicate whether results are better than chance, e.g., the use of Accuracy, ii) a failure to consider the impact of lost evidence when making claims concerning workload savings, and iii) pervasive failure to report the full confusion matrix (or performance metrics from which it can be reconstructed) which is essential for future meta-analyses. On the positive side, we extract good (evaluation) practices on which our recommendations for researchers and practitioners, as well as policymakers, are built. Conclusions: SR screening evaluations should prioritize lost evidence/recall alongside chance-anchored and cost-sensitive Weighted MCC (WMCC) metric, report complete confusion matrices, treat unclassifiable outputs as referred-back positives for assessment, adopt leakage-aware designs with non-LLM baselines and open artifacts, and ground conclusions in cost-benefit analysis where FNs carry higher penalties than FPs.


FedTopo: Topology-Informed Representation Alignment in Federated Learning under Non-I.I.D. Conditions

arXiv.org Artificial Intelligence

Current federated-learning models deteriorate under heterogeneous (non-I.I.D.) client data, as their feature representations diverge and pixel-or patch-level objectives fail to capture the global topology which is essential for high-dimensional visual tasks. We propose FedT opo, a framework that integrates T opological-Guided Block Screening (TGBS) and T opological Embedding (TE) to leverage topological information, yielding coherently aligned cross-client representations by T opological Alignment Loss (T AL). First, Topology-Guided Block Screening (TGBS) automatically selects the most topology-informative block, i.e., the one with maximal topological separability, whose persistence-based signatures best distinguish within-versus between-class pairs, ensuring that subsequent analysis focuses on topology-rich features. Next, this block yields a compact Topological Embedding, which quantifies the topological information for each client. Finally, a Topological Alignment Loss (T AL) guides clients to maintain topological consistency with the global model during optimization, reducing representation drift across rounds. Experiments on Fashion-MNIST, CIFAR-10, and CIFAR-100 under four non-I.I.D. partitions show that FedTopo accelerates convergence and improves accuracy over strong baselines. Code is available in Supplementary Materials.


A Multicollinearity-Aware Signal-Processing Framework for Cross-$β$ Identification via X-ray Scattering of Alzheimer's Tissue

arXiv.org Artificial Intelligence

X-ray scattering measurements of in situ human brain tissue encode structural signatures of pathological cross-$β$ inclusions, yet systematic exploitation of these data for automated detection remains challenging due to substrate contamination, strong inter-feature correlations, and limited sample sizes. This work develops a three-stage classification framework for identifying cross-$β$ structural inclusions-a hallmark of Alzheimer's disease-in X-ray scattering profiles of post-mortem human brain. Stage 1 employs a Bayes-optimal classifier to separate mica substrate from tissue regions on the basis of their distinct scattering signatures. Stage 2 introduces a multicollinearityaware, class-conditional correlation pruning scheme with formal guarantees on the induced Bayes risk and approximation error, thereby reducing redundancy while retaining class-discriminative information. Stage 3 trains a compact neural network on the pruned feature set to detect the presence or absence of cross-$β$ fibrillar ordering. The top-performing model, optimized with a composite loss combining Focal and Dice objectives, attains a test F1-score of 84.30% using 11 of 211 candidate features and 174 trainable parameters. The overall framework yields an interpretable, theory-grounded strategy for data-limited classification problems involving correlated, high-dimensional experimental measurements, exemplified here by X-ray scattering profiles of neurodegenerative tissue.


CEDL: Centre-Enhanced Discriminative Learning for Anomaly Detection

arXiv.org Artificial Intelligence

Supervised anomaly detection methods perform well in identifying known anomalies that are well represented in the training set. However, they often struggle to generalise beyond the training distribution due to decision boundaries that lack a clear definition of normality. Existing approaches typically address this by regularising the representation space during training, leading to separate optimisation in latent and label spaces. The learned normality is therefore not directly utilised at inference, and their anomaly scores often fall within arbitrary ranges that require explicit mapping or calibration for probabilistic interpretation. To achieve unified learning of geometric normality and label discrimination, we propose Centre-Enhanced Discriminative Learning (CEDL), a novel supervised anomaly detection framework that embeds geometric normality directly into the discriminative objective. CEDL reparameterises the conventional sigmoid-derived prediction logit through a centre-based radial distance function, unifying geometric and discriminative learning in a single end-to-end formulation. This design enables interpretable, geometry-aware anomaly scoring without post-hoc thresholding or reference calibration. Extensive experiments on tabular, time-series, and image data demonstrate that CEDL achieves competitive and balanced performance across diverse real-world anomaly detection tasks, validating its effectiveness and broad applicability.


Evaluation of Multi- and Single-objective Learning Algorithms for Imbalanced Data

arXiv.org Artificial Intelligence

Many machine learning tasks aim to find models that work well not for a single, but for a group of criteria, often opposing ones. One such example is imbalanced data classification, where, on the one hand, we want to achieve the best possible classification quality for data from the minority class without degrading the classification quality of the majority class. One solution is to propose an aggregate learning criterion and reduce the multi-objective learning task to a single-criteria optimization problem. Unfortunately, such an approach is characterized by ambiguity of interpretation since the value of the aggregated criterion does not indicate the value of the component criteria. Hence, there are more and more proposals for algorithms based on multi-objective optimization (MOO), which can simultaneously optimize multiple criteria. However, such an approach results in a set of multiple non-dominated solutions (Pareto front). The selection of a single solution from the Pareto front is a challenge itself, and much attention is paid to the issue of how to select it considering user preferences, as well as how to compare solutions returned by different MOO algorithms among themselves. Thus, a significant gap has been identified in the classifier evaluation methodology, i.e., how to reliably compare methods returning single solutions with algorithms returning solutions in the form of Pareto fronts. To fill the aforementioned gap, this article proposes a new, reliable way of evaluating algorithms based on multi-objective algorithms with methods that return single solutions while pointing out solutions from a Pareto front tailored to the user's preferences. This work focuses only on algorithm comparison, not their learning. The algorithms selected for this study are illustrative to help understand the proposed approach.


AI-Enhanced IoT Systems for Predictive Maintenance and Affordability Optimization in Smart Microgrids: A Digital Twin Approach

arXiv.org Artificial Intelligence

This study presents an AI enhanced IoT framework for predictive maintenance and affordability optimization in smart microgrids using a Digital Twin modeling approach. The proposed system integrates real time sensor data, machine learning based fault prediction, and cost aware operational analytics to improve reliability and energy efficiency in distributed microgrid environments. By synchronizing physical microgrid components with a virtual Digital Twin, the framework enables early detection of component degradation, dynamic load management, and optimized maintenance scheduling. Experimental evaluations demonstrate improved predictive accuracy, reduced operational downtime, and measurable cost savings compared to baseline microgrid management methods. The findings highlight the potential of Digital Twin driven IoT architectures as a scalable solution for next generation intelligent and affordable energy systems.