SurveyG: A Multi-Agent LLM Framework with Hierarchical Citation Graph for Automated Survey Generation

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly adopted for automating survey paper generation \cite{wang2406autosurvey, liang2025surveyx, yan2025surveyforge, su2025benchmarking, wen2025interactivesurvey}. Existing approaches typically extract content from a large collection of related papers and prompt LLMs to summarize them directly. However, such methods often overlook the structural relationships among papers, resulting in generated surveys that lack a coherent taxonomy and a deeper contextual understanding of research progress. To address these shortcomings, we propose SurveyG, an LLM-based agent framework that integrates a hierarchical citation graph, in which nodes denote research papers and edges capture both citation dependencies and semantic relatedness between their contents, thereby embedding structural and contextual knowledge into the survey generation process. The graph is organized into three layers, Foundation, Development, and Frontier, to capture the evolution of research from seminal works to incremental advances and emerging directions. By combining horizontal search within layers and vertical depth traversal across layers, the agent produces multi-level summaries, which are consolidated into a structured survey outline. A multi-agent validation stage then ensures consistency, coverage, and factual accuracy in generating the final survey. Experiments, including evaluations by human experts and LLM-as-a-judge, demonstrate that SurveyG outperforms state-of-the-art frameworks, producing surveys that are more comprehensive and better aligned with the underlying knowledge taxonomy of a field.
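
The layered graph and the two traversal modes described above can be illustrated with a minimal sketch. This is not the SurveyG implementation: the class `CitationGraph`, its method names, and the paper ids are hypothetical, layers are assigned by hand, and only citation edges are modeled (semantic-relatedness edges are left out).

```python
from collections import defaultdict

# Layer names come from the abstract; the numeric ranking is an assumption
# used here only to decide which direction counts as "down" the hierarchy.
LAYERS = ("Foundation", "Development", "Frontier")
RANK = {name: i for i, name in enumerate(LAYERS)}


class CitationGraph:
    """Hypothetical three-layer citation graph (illustration only)."""

    def __init__(self):
        self.layer = {}                 # paper id -> layer name
        self.cites = defaultdict(set)   # paper id -> ids of papers it cites

    def add_paper(self, paper_id, layer):
        if layer not in RANK:
            raise ValueError(f"unknown layer: {layer}")
        self.layer[paper_id] = layer

    def add_citation(self, citing, cited):
        self.cites[citing].add(cited)

    def horizontal(self, layer):
        """Horizontal search: all papers that sit in one layer."""
        return sorted(p for p, name in self.layer.items() if name == layer)

    def vertical(self, start):
        """Vertical traversal: depth-first walk along citation edges,
        following only edges that lead to an earlier layer."""
        order, seen = [], set()

        def visit(paper_id):
            seen.add(paper_id)
            order.append(paper_id)
            for cited in sorted(self.cites[paper_id]):
                older = RANK[self.layer[cited]] < RANK[self.layer[paper_id]]
                if older and cited not in seen:
                    visit(cited)

        visit(start)
        return order


g = CitationGraph()
for pid, layer in [("seminal", "Foundation"), ("extension_a", "Development"),
                   ("extension_b", "Development"), ("recent", "Frontier")]:
    g.add_paper(pid, layer)
for citing, cited in [("extension_a", "seminal"), ("extension_b", "seminal"),
                      ("recent", "extension_a"), ("recent", "extension_b")]:
    g.add_citation(citing, cited)

print(g.horizontal("Development"))
print(g.vertical("recent"))
```

In a fuller version, the lists returned by `horizontal` and `vertical` would be the inputs an agent summarizes at each level before the summaries are merged into an outline.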


DeHate: A Stable Diffusion-based Multimodal Approach to Mitigate Hate Speech in Images

arXiv.org Artificial Intelligence

The rise in harmful online content not only distorts public discourse but also poses significant challenges to maintaining a healthy digital environment. In response, we introduce a multimodal dataset crafted for identifying hate in digital content. Central to our methodology is the application of watermarked, stability-enhanced Stable Diffusion techniques combined with the Digital Attention Analysis Module (DAAM). This combination pinpoints the hateful elements within images and generates detailed hate attention maps, which are then used to blur those regions and so remove the hateful sections of the image. We release this dataset as part of the dehate shared task. This paper also describes the details of the shared task. Furthermore, we present DeHater, a vision-language model designed for multimodal dehatification tasks. Our approach sets a new standard in AI-driven image hate detection given textual prompts, contributing to the development of more ethical AI applications in social media.
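
The final step described above, blurring only the regions that an attention map flags, can be sketched in a few lines. This is an illustration under stated assumptions, not the DeHate pipeline: the image is a grayscale grid of numbers, the attention map is given rather than computed, and the function name `blur_masked`, the 0.5 threshold, and the 3x3 box blur are all hypothetical choices.

```python
def blur_masked(image, attention, threshold=0.5):
    """Replace each pixel whose attention score reaches `threshold`
    with the mean of its 3x3 neighbourhood; leave other pixels as-is.

    `image` and `attention` are equally sized lists of rows.
    """
    height, width = len(image), len(image[0])
    blurred = [row[:] for row in image]  # copy so the input stays untouched
    for y in range(height):
        for x in range(width):
            if attention[y][x] < threshold:
                continue  # pixel not flagged by the attention map
            neighbours = [
                image[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            blurred[y][x] = sum(neighbours) / len(neighbours)
    return blurred


# Toy 3x3 image with one bright pixel that the attention map flags.
image = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]
attention = [[0.0, 0.0, 0.0],
             [0.0, 1.0, 0.0],
             [0.0, 0.0, 0.0]]
result = blur_masked(image, attention)
```

Reading neighbours from the original `image` rather than from `blurred` keeps the result independent of the order in which pixels are visited.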


Aegis: Automated Error Generation and Attribution for Multi-Agent Systems

arXiv.org Artificial Intelligence

Large language model based multi-agent systems (MAS) have unlocked significant advancements in tackling complex problems, but their increasing capability introduces a structural fragility that makes them difficult to debug. A key obstacle to improving their reliability is the severe scarcity of large-scale, diverse datasets for error attribution, as existing resources rely on costly and unscalable manual annotation. To address this bottleneck, we introduce Aegis, a novel framework for automated error generation and attribution for multi-agent systems. Aegis constructs a large dataset of 9,533 trajectories with annotated faulty agents and error modes, covering diverse MAS architectures and task domains. This is achieved using an LLM-based manipulator that can adaptively inject context-aware errors into successful execution trajectories. Leveraging fine-grained labels and the structured arrangement of positive-negative sample pairs, Aegis supports three different learning paradigms: Supervised Fine-Tuning, Reinforcement Learning, and Contrastive Learning. We develop learning methods for each paradigm. Comprehensive experiments show that trained models consistently achieve substantial improvements in error attribution. Notably, several of our fine-tuned LLMs demonstrate performance competitive with or superior to proprietary models an order of magnitude larger, validating our automated data generation framework as a crucial resource for developing more robust and interpretable multi-agent systems. Our project website is available at https://kfq20.github.io/Aegis-Website/.
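
The idea of injecting an error into a successful trajectory to obtain a labeled positive-negative pair can be sketched as follows. This is not the Aegis code: the `Step` record, the `inject_error` function, the label fields, and the error-mode string are hypothetical, and a fixed string rewrite stands in for the LLM-based manipulator.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Step:
    agent: str    # which agent produced this step
    output: str   # what it said or did


def inject_error(trajectory, index, error_mode, corrupt):
    """Corrupt one step of a successful trajectory.

    Returns the untouched trajectory (positive), the corrupted copy
    (negative), and a label naming the faulty agent and error mode.
    `corrupt` is any callable str -> str; here it replaces the LLM
    manipulator described in the abstract.
    """
    faulty = list(trajectory)
    step = faulty[index]
    faulty[index] = replace(step, output=corrupt(step.output))
    label = {"faulty_agent": step.agent, "error_mode": error_mode, "step": index}
    return {"positive": list(trajectory), "negative": faulty, "label": label}


successful = [
    Step("planner", "compute 2 + 2"),
    Step("solver", "2 + 2 = 4"),
    Step("verifier", "answer 4 accepted"),
]
pair = inject_error(successful, 1, "wrong_calculation",
                    lambda text: text.replace("= 4", "= 5"))
```

Because the label is produced at injection time, no manual annotation is needed, and the same pair can serve as a supervised example (the label) or a contrastive example (positive versus negative).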


Structured Output Regularization: a framework for few-shot transfer learning

arXiv.org Machine Learning

Transfer learning is often used in deep learning when data is limited, such as in medical imaging applications (Kim et al., 2022). Foundation models, that is, large, publicly available, pretrained models, are often fine-tuned for such tasks where little data is available (Wang et al., 2023; Zhang and Metaxas, 2024; Khan et al., 2025). Beyond freezing part of a model to reduce overfitting, various techniques can increase the amount of training data, such as data augmentation and self-supervised learning. These methods can reduce overfitting (Chollet, 2021; Wang et al., 2023; Ewen and Khan, 2021), but still struggle when there is little data available (Wang et al., 2023). We propose a new approach, Structured Output Regularization (SOR), a simple framework that adapts and prunes pretrained networks using very little labeled data. Instead of unfreezing internal weights, SOR keeps internal structures frozen, e.g., convolutional filters or higher-level blocks, and regularizes their outputs. Specifically, we freeze the weights of the internal structures, add new weights between each frozen structure, penalize these new weights with a lasso penalty to encourage sparsity, and train the network. Structures whose new weights are driven to zero can be removed, yielding a smaller, task-tailored model without training the full parameter set. To regularize the final layer structures, SOR applies group lasso.
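
The mechanism described above (frozen structures, new lasso-penalized weights on their outputs, pruning of structures whose weight reaches zero) can be illustrated on a toy problem. This is a sketch under assumptions, not the SOR implementation: each frozen structure is a plain Python function, there is one scalar weight per structure, the loss is mean squared error, and proximal gradient descent with soft-thresholding is our choice of optimizer. The names `fit_gates`, `prune`, and all hyperparameter values are hypothetical.

```python
def soft_threshold(value, t):
    """Proximal operator of t * |value|: shrink toward zero, clip to exactly 0."""
    if value > t:
        return value - t
    if value < -t:
        return value + t
    return 0.0


def fit_gates(frozen, xs, ys, lam=0.1, lr=0.3, steps=100):
    """Learn one scalar weight per frozen structure; the structures
    themselves are never updated."""
    gates = [1.0] * len(frozen)                    # start as pass-through
    feats = [[f(x) for f in frozen] for x in xs]   # frozen outputs, computed once
    n = len(xs)
    for _ in range(steps):
        resid = [sum(g * v for g, v in zip(gates, row)) - y
                 for row, y in zip(feats, ys)]
        grads = [2.0 / n * sum(r * row[k] for r, row in zip(resid, feats))
                 for k in range(len(gates))]
        # Gradient step on the squared error, then the lasso shrinkage step.
        gates = [soft_threshold(g - lr * gr, lr * lam)
                 for g, gr in zip(gates, grads)]
    return gates


def prune(frozen, gates):
    """Keep only the structures whose weight is non-zero."""
    return [(f, g) for f, g in zip(frozen, gates) if g != 0.0]


# Two "frozen structures"; only the first one is useful for y = 2x.
frozen = [lambda x: x, lambda x: x * x]
xs = [-1.0, 0.0, 1.0]
ys = [2.0 * x for x in xs]

gates = fit_gates(frozen, xs, ys)
kept = prune(frozen, gates)
```

On this toy data the weight of the unused structure is driven to exactly zero, so `prune` drops it, while the weight of the useful structure settles slightly below 2 because the lasso penalty shrinks it.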


A survey and benchmark of high-dimensional Bayesian optimization of discrete sequences

Miguel González-Duque

Neural Information Processing Systems

Optimizing discrete black-box functions is key in several domains, e.g. protein engineering and drug design. Due to the lack of gradient information and the need for sample efficiency, Bayesian optimization is an ideal candidate for these tasks. Several methods for high-dimensional continuous and categorical Bayesian optimization have been proposed recently. However, our survey of the field reveals highly heterogeneous experimental set-ups across methods and technical barriers to the replicability and application of published algorithms to real-world tasks. To address these issues, we develop a unified framework to test a vast array of high-dimensional Bayesian optimization methods and a collection of standardized black-box functions representing real-world application domains in chemistry and biology.
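
What a unified test harness for discrete sequence optimizers might look like can be sketched as below. This is not the framework from the paper: the function names, the toy objective, and the alphabet are hypothetical, and a random single-site mutation proposer stands in for a Bayesian optimization method, which would plug into the same `propose` slot.

```python
import random

ALPHABET = "ACDE"  # hypothetical token set for the toy sequences


def count_motif(sequence):
    """Toy black box: number of non-overlapping 'AC' occurrences."""
    return sequence.count("AC")


def mutate_one(sequence, rng):
    """Baseline proposer: resample a single random position."""
    i = rng.randrange(len(sequence))
    return sequence[:i] + rng.choice(ALPHABET) + sequence[i + 1:]


def run_benchmark(black_box, propose, start, budget, seed=0):
    """Run one solver on one black box under a fixed evaluation budget.

    Every solver sees the same interface: it receives the incumbent
    sequence and a seeded random generator, and returns a candidate.
    """
    rng = random.Random(seed)
    best_seq, best_val = start, black_box(start)
    history = [best_val]  # best value seen after each evaluation
    for _ in range(budget):
        candidate = propose(best_seq, rng)
        value = black_box(candidate)
        if value > best_val:
            best_seq, best_val = candidate, value
        history.append(best_val)
    return best_seq, history


best, history = run_benchmark(count_motif, mutate_one, "EEEEEE", budget=50)
```

Fixing the budget, the seed, and the black-box interface is what makes runs of different solvers comparable, which is the kind of standardization the abstract argues is missing across published set-ups.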