Goto

Collaborating Authors

 Overview


Rebalancing the Scales: A Systematic Mapping Study of Generative Adversarial Networks (GANs) in Addressing Data Imbalance

arXiv.org Artificial Intelligence

Machine learning algorithms are used in diverse domains, many of which face significant challenges due to data imbalance. Studies have explored various approaches to address the issue, like data preprocessing, cost-sensitive learning, and ensemble methods. Generative Adversarial Networks (GANs) showed immense potential as a data preprocessing technique that generates good quality synthetic data. This study employs a systematic mapping methodology to analyze 3041 papers on GAN-based sampling techniques for imbalanced data sourced from four digital libraries. A filtering process identified 100 key studies spanning domains such as healthcare, finance, and cybersecurity. Through comprehensive quantitative analysis, this research introduces three categorization mappings as application domains, GAN techniques, and GAN variants used to handle the imbalanced nature of the data. GAN-based over-sampling emerges as an effective preprocessing method. Advanced architectures and tailored frameworks helped GANs to improve further in the case of data imbalance. GAN variants like vanilla GAN, CTGAN, and CGAN show great adaptability in structured imbalanced data cases. Interest in GANs for imbalanced data has grown tremendously, touching a peak in recent years, with journals and conferences playing crucial roles in transmitting foundational theories and practical applications. While with these advances, none of the reviewed studies explicitly explore hybridized GAN frameworks with diffusion models or reinforcement learning techniques. This gap leads to a future research idea develop innovative approaches for effectively handling data imbalance.


Deep learning approaches to surgical video segmentation and object detection: A Scoping Review

arXiv.org Artificial Intelligence

Introduction: Computer vision (CV) has had a transformative impact in biomedical fields such as radiology, dermatology, and pathology. Its real-world adoption in surgical applications, however, remains limited. We review the current state-of-the-art performance of deep learning (DL)-based CV models for segmentation and object detection of anatomical structures in videos obtained during surgical procedures. Methods: We conducted a scoping review of studies on semantic segmentation and object detection of anatomical structures published between 2014 and 2024 from 3 major databases - PubMed, Embase, and IEEE Xplore. The primary objective was to evaluate the state-of-the-art performance of semantic segmentation in surgical videos. Secondary objectives included examining DL models, progress toward clinical applications, and the specific challenges with segmentation of organs/tissues in surgical videos. Results: We identified 58 relevant published studies. These focused predominantly on procedures from general surgery [20(34.4%)], colorectal surgery [9(15.5%)], and neurosurgery [8(13.8%)]. Cholecystectomy [14(24.1%)] and low anterior rectal resection [5(8.6%)] were the most common procedures addressed. Semantic segmentation [47(81%)] was the primary CV task. U-Net [14(24.1%)] and DeepLab [13(22.4%)] were the most widely used models. Larger organs such as the liver (Dice score: 0.88) had higher accuracy compared to smaller structures such as nerves (Dice score: 0.49). Models demonstrated real-time inference potential ranging from 5-298 frames-per-second (fps). Conclusion: This review highlights the significant progress made in DL-based semantic segmentation for surgical videos with real-time applicability, particularly for larger organs. Addressing challenges with smaller structures, data availability, and generalizability remains crucial for future advancements.


A Review of Causal Decision Making

arXiv.org Machine Learning

To make effective decisions, it is important to have a thorough understanding of the causal relationships among actions, environments, and outcomes. This review aims to surface three crucial aspects of decision-making through a causal lens: 1) the discovery of causal relationships through causal structure learning, 2) understanding the impacts of these relationships through causal effect learning, and 3) applying the knowledge gained from the first two aspects to support decision making via causal policy learning. Moreover, we identify challenges that hinder the broader utilization of causal decision-making and discuss recent advances in overcoming these challenges. Finally, we provide future research directions to address these challenges and to further enhance the implementation of causal decision-making in practice, with real-world applications illustrated based on the proposed causal decision-making. We aim to offer a comprehensive methodology and practical implementation framework by consolidating various methods in this area into a Python-based collection. URL: https://causaldm.github.io/Causal-Decision-Making.


A Review of Artificial Intelligence Impacting Statistical Process Monitoring and Future Directions

arXiv.org Artificial Intelligence

It has been 100 years since statistical process control (SPC) or statistical process monitoring (SPM) was first introduced for production processes and later applied to service, healthcare, and other industries. The techniques applied to SPM applications are mostly statistically oriented. Recent advances in Artificial Intelligence (AI) have reinvigorated the imagination of adopting AI for SPM applications. This manuscript begins with a concise review of the historical development of the statistically based SPM methods. Next, this manuscript explores AI and Machine Learning (ML) algorithms and methods applied in various SPM applications, addressing quality characteristics of univariate, multivariate, profile, and image. These AI methods can be classified into the following categories: classification, pattern recognition, time series applications, and generative AI. Specifically, different kinds of neural networks, such as artificial neural networks (ANN), convolutional neural networks (CNN), recurrent neural networks (RNN), and generative adversarial networks (GAN), are among the most implemented AI methods impacting SPM. Finally, this manuscript outlines a couple of future directions that harness the potential of the Large Multimodal Model (LMM) for advancing SPM research and applications in complex systems. The ultimate objective is to transform statistical process monitoring (SPM) into smart process control (SMPC), where corrective actions are autonomously implemented to either prevent quality issues or restore process performance.


A Comprehensive Survey of Machine Unlearning Techniques for Large Language Models

arXiv.org Artificial Intelligence

This study investigates the machine unlearning techniques within the context of large language models (LLMs), referred to as \textit{LLM unlearning}. LLM unlearning offers a principled approach to removing the influence of undesirable data (e.g., sensitive or illegal information) from LLMs, while preserving their overall utility without requiring full retraining. Despite growing research interest, there is no comprehensive survey that systematically organizes existing work and distills key insights; here, we aim to bridge this gap. We begin by introducing the definition and the paradigms of LLM unlearning, followed by a comprehensive taxonomy of existing unlearning studies. Next, we categorize current unlearning approaches, summarizing their strengths and limitations. Additionally, we review evaluation metrics and benchmarks, providing a structured overview of current assessment methodologies. Finally, we outline promising directions for future research, highlighting key challenges and opportunities in the field.


A Systematic Review of Open Datasets Used in Text-to-Image (T2I) Gen AI Model Safety

arXiv.org Artificial Intelligence

This work is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). For the definitive version, see 10.1109/ACCESS.2025.3539933. Disclaimer: This research involves topics that may include disturbing results. Any explicit content has been redacted, and potentially disturbing results have been presented in a neutral and anonymized manner to minimize emotional distress to the readers. Abstract --Novel research aimed at text-to-image (T2I) generative AI safety often relies on publicly available datasets for training and evaluation, making the quality and composition of these datasets crucial. This paper presents a comprehensive review of the key datasets used in the T2I research, detailing their collection methods, compositions, semantic and syntactic diversity of prompts and the quality, coverage, and distribution of harm types in the datasets. By highlighting the strengths and limitations of the datasets, this study enables researchers to find the most ...


Robustness and Cybersecurity in the EU Artificial Intelligence Act

arXiv.org Artificial Intelligence

The EU Artificial Intelligence Act (AIA) establishes different legal principles for different types of AI systems. While prior work has sought to clarify some of these principles, little attention has been paid to robustness and cybersecurity. This paper aims to fill this gap. We identify legal challenges and shortcomings in provisions related to robustness and cybersecurity for high-risk AI systems (Art. 15 AIA) and general-purpose AI models (Art. 55 AIA). We show that robustness and cybersecurity demand resilience against performance disruptions. Furthermore, we assess potential challenges in implementing these provisions in light of recent advancements in the machine learning (ML) literature. Our analysis informs efforts to develop harmonized standards, guidelines by the European Commission, as well as benchmarks and measurement methodologies under Art. 15(2) AIA. With this, we seek to bridge the gap between legal terminology and ML research, fostering a better alignment between research and implementation efforts.


A Survey on Mechanistic Interpretability for Multi-Modal Foundation Models

arXiv.org Artificial Intelligence

The rise of foundation models has transformed machine learning research, prompting efforts to uncover their inner workings and develop more efficient and reliable applications for better control. While significant progress has been made in interpreting Large Language Models (LLMs), multimodal foundation models (MMFMs) - such as contrastive vision-language models, generative vision-language models, and text-to-image models - pose unique interpretability challenges beyond unimodal frameworks. Despite initial studies, a substantial gap remains between the interpretability of LLMs and MMFMs. This survey explores two key aspects: (1) the adaptation of LLM interpretability methods to multimodal models and (2) understanding the mechanistic differences between unimodal language models and crossmodal systems. By systematically reviewing current MMFM analysis techniques, we propose a structured taxonomy of interpretability methods, compare insights across unimodal and multimodal architectures, and highlight critical research gaps.


Active Learning Classification from a Signal Separation Perspective

arXiv.org Artificial Intelligence

In machine learning, classification is usually seen as a function approximation problem, where the goal is to learn a function that maps input features to class labels. In this paper, we propose a novel clustering and classification framework inspired by the principles of signal separation. This approach enables efficient identification of class supports, even in the presence of overlapping distributions. We validate our method on real-world hyperspectral datasets Salinas and Indian Pines. The experimental results demonstrate that our method is competitive with the state of the art active learning algorithms by using a very small subset of data set as training points.


Personhood Credentials: Human-Centered Design Recommendation Balancing Security, Usability, and Trust

arXiv.org Artificial Intelligence

Building on related concepts, like, decentralized identifiers (DIDs), proof of personhood, anonymous credentials, personhood credentials (PHCs) emerged as an alternative approach, enabling individuals to verify to digital service providers that they are a person without disclosing additional information. However, new technologies might introduce some friction due to users misunderstandings and mismatched expectations. Despite their growing importance, limited research has been done on users perceptions and preferences regarding PHCs. To address this gap, we conducted competitive analysis, and semi-structured online user interviews with 23 participants from US and EU to provide concrete design recommendations for PHCs that incorporate user needs, adoption rules, and preferences. Our study -- (a)surfaces how people reason about unknown privacy and security guarantees of PHCs compared to current verification methods -- (b) presents the impact of several factors on how people would like to onboard and manage PHCs, including, trusted issuers (e.g. gov), ground truth data to issue PHC (e.g biometrics, physical id), and issuance system (e.g. centralized vs decentralized). In a think-aloud conceptual design session, participants recommended -- conceptualized design, such as periodic biometrics verification, time-bound credentials, visually interactive human-check, and supervision of government for issuance system. We propose actionable designs reflecting users preferences.