Overview
Tab2Visual: Overcoming Limited Data in Tabular Data Classification Using Deep Learning with Visual Representations
Mamdouh, Ahmed, El-Melegy, Moumen, Ali, Samia, Kikinis, Ron
This research addresses the challenge of limited data in tabular data classification, particularly prevalent in domains with constraints like healthcare. We propose Tab2Visual, a novel approach that transforms heterogeneous tabular data into visual representations, enabling the application of powerful deep learning models. Tab2Visual effectively addresses data scarcity by incorporating novel image augmentation techniques and facilitating transfer learning. We extensively evaluate the proposed approach on diverse tabular datasets, comparing its performance against a wide range of machine learning algorithms, including classical methods, tree-based ensembles, and state-of-the-art deep learning models specifically designed for tabular data. We also perform an in-depth analysis of factors influencing Tab2Visual's performance. Our experimental results demonstrate that Tab2Visual outperforms other methods in classification problems with limited tabular data.
Leveraging GPT-4o Efficiency for Detecting Rework Anomaly in Business Processes
Derakhshan, Mohammad, Ceravolo, Paolo, Mohammadi, Fatemeh
This paper investigates the effectiveness of GPT-4o-2024-08-06, one of the Large Language Models (LLM) from OpenAI, in detecting business process anomalies, with a focus on rework anomalies. In our study, we developed a GPT-4o-based tool capable of transforming event logs into a structured format and identifying reworked activities within business event logs. The analysis was performed on a synthetic dataset designed to contain rework anomalies but free of loops. To evaluate the anomaly detection capabilities of GPT 4o-2024-08-06, we used three prompting techniques: zero-shot, one-shot, and few-shot. These techniques were tested on different anomaly distributions, namely normal, uniform, and exponential, to identify the most effective approach for each case. The results demonstrate the strong performance of GPT-4o-2024-08-06. On our dataset, the model achieved 96.14% accuracy with one-shot prompting for the normal distribution, 97.94% accuracy with few-shot prompting for the uniform distribution, and 74.21% accuracy with few-shot prompting for the exponential distribution. These results highlight the model's potential as a reliable tool for detecting rework anomalies in event logs and how anomaly distribution and prompting strategy influence the model's performance.
evclust: Python library for evidential clustering
Soubeiga, Armel, Antoine, Violaine
A recent developing trend in clustering is the advancement of algorithms that not only identify clusters within data, but also express and capture the uncertainty of cluster membership. Evidential clustering addresses this by using the Dempster-Shafer theory of belief functions, a framework designed to manage and represent uncertainty. This approach results in a credal partition, a structured set of mass functions that quantify the uncertain assignment of each object to potential groups. The Python framework evclust, presented in this paper, offers a suite of efficient evidence clustering algorithms as well as tools for visualizing, evaluating and analyzing credal partitions.
Preference Alignment on Diffusion Model: A Comprehensive Survey for Image Generation and Editing
Wu, Sihao, Si, Xiaonan, Xing, Chi, Wang, Jianhong, Jin, Gaojie, Cheng, Guangliang, Zhang, Lijun, Huang, Xiaowei
The integration of preference alignment with diffusion models (DMs) has emerged as a transformative approach to enhance image generation and editing capabilities. Although integrating diffusion models with preference alignment strategies poses significant challenges for novices at this intersection, comprehensive and systematic reviews of this subject are still notably lacking. To bridge this gap, this paper extensively surveys preference alignment with diffusion models in image generation and editing. First, we systematically review cutting-edge optimization techniques such as reinforcement learning with human feedback (RLHF), direct preference optimization (DPO), and others, highlighting their pivotal role in aligning preferences with DMs. Then, we thoroughly explore the applications of aligning preferences with DMs in autonomous driving, medical imaging, robotics, and more. Finally, we comprehensively discuss the challenges of preference alignment with DMs. To our knowledge, this is the first survey centered on preference alignment with DMs, providing insights to drive future innovation in this dynamic area.
Federated Continual Learning: Concepts, Challenges, and Solutions
Hamedi, Parisa, Razavi-Far, Roozbeh, Hallaji, Ehsan
Federated Continual Learning (FCL) has emerged as a robust solution for collaborative model training in dynamic environments, where data samples are continuously generated and distributed across multiple devices. This survey provides a comprehensive review of FCL, focusing on key challenges such as heterogeneity, model stability, communication overhead, and privacy preservation. We explore various forms of heterogeneity and their impact on model performance. Solutions to non-IID data, resource-constrained platforms, and personalized learning are reviewed in an effort to show the complexities of handling heterogeneous data distributions. Next, we review techniques for ensuring model stability and avoiding catastrophic forgetting, which are critical in non-stationary environments. Privacy-preserving techniques are another aspect of FCL that have been reviewed in this work. This survey has integrated insights from federated learning and continual learning to present strategies for improving the efficacy and scalability of FCL systems, making it applicable to a wide range of real-world scenarios.
A Survey on Mamba Architecture for Vision Applications
Ibrahim, Fady, Liu, Guangjun, Wang, Guanghui
Transformers have become foundational for visual tasks such as object detection, semantic segmentation, and video understanding, but their quadratic complexity in attention mechanisms presents scalability challenges. To address these limitations, the Mamba architecture utilizes state-space models (SSMs) for linear scalability, efficient processing, and improved contextual awareness. This paper investigates Mamba architecture for visual domain applications and its recent advancements, including Vision Mamba (ViM) and VideoMamba, which introduce bidirectional scanning, selective scanning mechanisms, and spatiotemporal processing to enhance image and video understanding. Architectural innovations like position embeddings, cross-scan modules, and hierarchical designs further optimize the Mamba framework for global and local feature extraction. These advancements position Mamba as a promising architecture in computer vision research and applications.
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates
Yang, Ling, Yu, Zhaochen, Cui, Bin, Wang, Mengdi
We present that hierarchical LLM reasoning via scaling thought templates can effectively optimize the reasoning search space and outperform the mathematical reasoning capabilities of powerful LLMs like OpenAI o1-preview and DeepSeek V3. We train our ReasonFlux-32B model with only 8 GPUs and introduces three innovations: (i) a structured and generic thought template library, containing around 500 high-level thought templates capable of generalizing to similar or relevant reasoning problems; (ii) performing hierarchical reinforcement learning on a sequence of thought templates instead of original long CoT data, optimizing a base LLM to plan out an optimal template trajectory for gradually handling complex problems; (iii) a brand new inference scaling system that enables hierarchical LLM reasoning by adaptively scaling thought templates at inference time. With a template trajectory containing sequential thought templates, our ReasonFlux-32B significantly advances math reasoning capabilities to state-of-the-art levels. Notably, on the MATH benchmark, it achieves an accuracy of 91.2% and surpasses o1-preview by 6.7%. On the USA Math Olympiad (AIME) benchmark, ReasonFlux-32B solves an average of 56.7% of problems, surpassing o1-preview and DeepSeek-V3 by 27% and 45%, respectively.
Latent Convergence Modulation in Large Language Models: A Novel Approach to Iterative Contextual Realignment
Porretta, Patricia, Pakenham, Sylvester, Ainsworth, Huxley, Chatten, Gregory, Allerton, Godfrey, Hollingsworth, Simon, Periwinkle, Vance
Token prediction stability remains a challenge in autoregressive generative models, where minor variations in early inference steps often lead to significant semantic drift over extended sequences. A structured modulation mechanism was introduced to regulate hidden state transitions, ensuring that latent representation trajectories remain aligned with prior contextual dependencies while preserving generative flexibility. The modulation framework was designed to function within transformer-based architectures, dynamically constraining representation evolution without imposing external memory dependencies or extensive architectural modifications. Empirical evaluations demonstrated that structured latent adjustments contributed to reductions in perplexity fluctuations, entropy variance, and lexical instability, improving coherence in long-form text generation. Gradient propagation stability was further analyzed, revealing that the modulation process led to smoother optimization pathways, mitigating erratic fluctuations in weight updates across successive inference steps. The computational efficiency of the modulation process was assessed, showing that its integration within transformer-based architectures introduced only marginal overhead while maintaining compatibility with existing optimization frameworks. The structured modulation constraints also influenced syntactic variation, preventing excessive repetition while maintaining balanced sentence length distributions. Comparative evaluations against baseline models reinforced the role of controlled latent state evolution in improving pronoun resolution, logical consistency, and contextual alignment across autoregressive text generation tasks.
Fairness in Multi-Agent AI: A Unified Framework for Ethical and Equitable Autonomous Systems
Ranjan, Rajesh, Gupta, Shailja, Singh, Surya Narayan
Rajesh Ranjan* (Carnegie Mellon University, USA) Shailja Gupta* (Carnegie Mellon University, USA) Surya Narayan Singh* (BIT Sindri, India) Abstract: Ensuring fairness in decentralized multi-agent systems presents significant challenges due to emergent biases, systemic inefficiencies, and conflicting agent incentives. This paper provides a comprehensive survey of fairness in multi-agent AI, introducing a novel framework where fairness is treated as a dynamic, emergent property of agent interactions. The framework integrates fairness constraints, bias mitigation strategies, and incentive mechanisms to align autonomous agent behaviors with societal values while balancing efficiency and robustness. Through empirical validation, we demonstrate that incorporating fairness constraints results in more equitable decision-making. Introduction As artificial intelligence (AI) systems evolve, Agentic AI --autonomous systems capable of independent decision-making and goal-setting--has emerged as a ...
Foundation Models for Anomaly Detection: Vision and Challenges
Ren, Jing, Tang, Tao, Jia, Hong, Fayek, Haytham, Li, Xiaodong, Ma, Suyu, Xu, Xiwei, Xia, Feng
Foundation Models for Anomaly Detection: Vision and Challenges Jing Ren 1, T ao T ang 2, Hong Jia 3, Haytham Fayek 1, Xiaodong Li 1, Suyu Ma 4, Xiwei Xu 4, and Feng Xia 1 1 RMIT University, Australia 2 University of South Australia, Australia 3 University of Melbourne, Australia 4 CSIRO's Data61, Australia {jing.ren, tao.tang }@ieee.org, Abstract As data continues to grow in volume and complexity across domains such as finance, manufacturing, and healthcare, effective anomaly detection is essential for identifying irregular patterns that may signal critical issues. Recently, foundation models (FMs) have emerged as a powerful tool for advancing anomaly detection. They have demonstrated unprecedented capabilities in enhancing anomaly identification, generating detailed data descriptions, and providing visual explanations. This survey presents the first comprehensive review of recent advancements in FM-based anomaly detection. We propose a novel taxonomy that classifies FMs into three categories based on their roles in anomaly detection tasks, i.e., as encoders, detectors, or interpreters. We provide a systematic analysis of state-of-the-art methods and discuss key challenges in leveraging FMs for improved anomaly detection. We also outline future research directions in this rapidly evolving field. 1 Introduction Anomaly detection, also known as outlier detection, is the process of identifying patterns or events in data that significantly deviate from expected behavior [ Chandola et al., 2009] .