Goto

Collaborating Authors

 slr


Sparse PCA from Sparse Linear Regression

Neural Information Processing Systems

Sparse Principal Component Analysis (SPCA) and Sparse Linear Regression (SLR) have a wide range of applications and have attracted a tremendous amount of attention in the last two decades as canonical examples of statistical problems in high dimension. A variety of algorithms have been proposed for both SPCA and SLR, but an explicit connection between the two had not been made. We show how to efficiently transform a black-box solver for SLR into an algorithm for SPCA: assuming the SLR solver satisfies prediction error guarantees achieved by existing efficient algorithms such as those based on the Lasso, the SPCA algorithm derived from it achieves near state of the art guarantees for testing and for support recovery for the single spiked covariance model as obtained by the current best polynomial-time algorithms. Our reduction not only highlights the inherent similarity between the two problems, but also, from a practical standpoint, allows one to obtain a collection of algorithms for SPCA directly from known algorithms for SLR. We provide experimental results on simulated data comparing our proposed framework to other algorithms for SPCA.





CSMeD: Bridging the Dataset Gap in Automated Citation Screening for Systematic Literature Reviews

Neural Information Processing Systems

Systematic literature reviews (SLRs) play an essential role in summarising, synthesising and validating scientific evidence. In recent years, there has been a growing interest in using machine learning techniques to automate the identification of relevant studies for SLRs. However, the lack of standardised evaluation datasets makes comparing the performance of such automated literature screening systems difficult. In this paper, we analyse the citation screening evaluation datasets, revealing that many of the available datasets are either too small, suffer from data leakage or have limited applicability to systems treating automated literature screening as a classification task, as opposed to, for example, a retrieval or question-answering task. To address these challenges, we introduce CSMED, a meta-dataset consolidating nine publicly released collections, providing unified access to 325 SLRs from the fields of medicine and computer science. CSMED serves as a comprehensive resource for training and evaluating the performance of automated citation screening models. Additionally, we introduce CSMED-FT, a new dataset designed explicitly for evaluating the full text publication screening task. To demonstrate the utility of CSMED, we conduct experiments and establish baselines on new datasets.


Sparse PCA from Sparse Linear Regression

Neural Information Processing Systems

Sparse Principal Component Analysis (SPCA) and Sparse Linear Regression (SLR) have a wide range of applications and have attracted a tremendous amount of attention in the last two decades as canonical examples of statistical problems in high dimension. A variety of algorithms have been proposed for both SPCA and SLR, but an explicit connection between the two had not been made. We show how to efficiently transform a black-box solver for SLR into an algorithm for SPCA: assuming the SLR solver satisfies prediction error guarantees achieved by existing efficient algorithms such as those based on the Lasso, the SPCA algorithm derived from it achieves near state of the art guarantees for testing and for support recovery for the single spiked covariance model as obtained by the current best polynomial-time algorithms. Our reduction not only highlights the inherent similarity between the two problems, but also, from a practical standpoint, allows one to obtain a collection of algorithms for SPCA directly from known algorithms for SLR. We provide experimental results on simulated data comparing our proposed framework to other algorithms for SPCA.


Runtime Composition in Dynamic System of Systems: A Systematic Review of Challenges, Solutions, Tools, and Evaluation Methods

arXiv.org Artificial Intelligence

Context: Modern Systems of Systems (SoSs) increasingly operate in dynamic environments (e.g., smart cities, autonomous vehicles) where runtime composition -- the on-the-fly discovery, integration, and coordination of constituent systems (CSs)--is crucial for adaptability. Despite growing interest, the literature lacks a cohesive synthesis of runtime composition in dynamic SoSs. Objective: This study synthesizes research on runtime composition in dynamic SoSs and identifies core challenges, solution strategies, supporting tools, and evaluation methods. Methods: We conducted a Systematic Literature Review (SLR), screening 1,774 studies published between 2019 and 2024 and selecting 80 primary studies for thematic analysis (TA). Results: Challenges fall into four categories: modeling and analysis, resilient operations, system orchestration, and heterogeneity of CSs. Solutions span seven areas: co-simulation and digital twins, semantic ontologies, integration frameworks, adaptive architectures, middleware, formal methods, and AI-driven resilience. Service-oriented frameworks for composition and integration dominate tooling, while simulation platforms support evaluation. Interoperability across tools, limited cross-toolchain workflows, and the absence of standardized benchmarks remain key gaps. Evaluation approaches include simulation-based, implementation-driven, and human-centered studies, which have been applied in domains such as smart cities, healthcare, defense, and industrial automation. Conclusions: The synthesis reveals tensions, including autonomy versus coordination, the modeling-reality gap, and socio-technical integration. It calls for standardized evaluation metrics, scalable decentralized architectures, and cross-domain frameworks. The analysis aims to guide researchers and practitioners in developing and implementing dynamically composable SoSs.



Can Agents Judge Systematic Reviews Like Humans? Evaluating SLRs with LLM-based Multi-Agent System

arXiv.org Artificial Intelligence

Systematic Literature Reviews (SLRs) are foundational to evidence-based research but remain labor-intensive and prone to inconsistency across disciplines. We present an LLM-based SLR evaluation copilot built on a Multi-Agent System (MAS) architecture to assist researchers in assessing the overall quality of the systematic literature reviews. The system automates protocol validation, methodological assessment, and topic relevance checks using a scholarly database. Unlike conventional single-agent methods, our design integrates a specialized agentic approach aligned with PRISMA guidelines to support more structured and interpretable evaluations. We conducted an initial study on five published SLRs from diverse domains, comparing system outputs to expert-annotated PRISMA scores, and observed 84% agreement. While early results are promising, this work represents a first step toward scalable and accurate NLP-driven systems for interdisciplinary workflows and reveals their capacity for rigorous, domain-agnostic knowledge aggregation to streamline the review process.


Facts Fade Fast: Evaluating Memorization of Outdated Medical Knowledge in Large Language Models

arXiv.org Artificial Intelligence

The growing capabilities of Large Language Models (LLMs) show significant potential to enhance healthcare by assisting medical researchers and physicians. However, their reliance on static training data is a major risk when medical recommendations evolve with new research and developments. When LLMs memorize outdated medical knowledge, they can provide harmful advice or fail at clinical reasoning tasks. To investigate this problem, we introduce two novel question-answering (QA) datasets derived from systematic reviews: MedRevQA (16,501 QA pairs covering general biomedical knowledge) and MedChangeQA (a subset of 512 QA pairs where medical consensus has changed over time). Our evaluation of eight prominent LLMs on the datasets reveals consistent reliance on outdated knowledge across all models. We additionally analyze the influence of obsolete pre-training data and training strategies to explain this phenomenon and propose future directions for mitigation, laying the groundwork for developing more current and reliable medical AI systems.