AITopics

2507.21158

Country:

Europe (0.46)
Oceania > Australia (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.94)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
(3 more...)

arXiv.org Artificial IntelligenceJul-30-2025

Zero-Shot Machine Unlearning with Proxy Adversarial Data Generation

Chen, Huiqiang, Zhu, Tianqing, Yu, Xin, Zhou, Wanlei

Machine unlearning aims to remove the influence of specific samples from a trained model. A key challenge in this process is over-unlearning, where the model's performance on the remaining data significantly drops due to the change in the model's parameters. Existing unlearning algorithms depend on the remaining data to prevent this issue. As such, these methods are inapplicable in a more practical scenario, where only the unlearning samples are available (i.e., zero-shot unlearning). This paper presents a novel framework, ZS-P AG, to fill this gap. Our approach offers three key innovations: (1) we approximate the inaccessible remaining data by generating adversarial samples; (2) leveraging the generated samples, we pinpoint a specific subspace to perform the unlearning process, therefore preventing over-unlearning in the challenging zero-shot scenario; and (3) we consider the influence of the unlearning process on the remaining samples and design an influence-based pseudo-labeling strategy. As a result, our method further improves the model's performance after unlearning. The proposed method holds a theoretical guarantee, and experiments on various benchmarks validate the effectiveness and superiority of our proposed method over several baselines.

adversarial sample, large language model, machine learning, (19 more...)

2507.21738

Country:

Oceania > Australia (0.28)
Asia (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Transfer or Self-Supervised? Bridging the Performance Gap in Medical Imaging

Zhao, Zehui, Alzubaidi, Laith, Zhang, Jinglan, Duan, Ye, Naseem, Usman, Gu, Yuantong

Transfer Learning Using a light-weight model trained with target dataset directly can outperform the pre-trained TL model using natural images.[30] Self-Supervised Learning Pre-trained SSL model using natural images does not perform well with target COVID-19 samples and need further guidance from user.Problem Summary: domain discrepancy during pre-training willdegrade pre-trained model's performance[31] Transfer Learning Utilising pre-trained TL model does not bring significant improvement tothe target medical dataset with an imbalanced sample distribution.[32] Self-Supervised Learning The imbalanced source and target datasets lead to poor model performance even after self-supervised pre-training.Problem Summary: neither TL or SSL methods show improvedperformance towards imbalanced datasets[33] Transfer Learning The complexity of model and pre-training process makes it hard to understand the results and reduce the reliability of predictions.[34] Self-Supervised Learning The pre-training process of SSL model is fully unsupervised, which raised the concern for whether the model have fully understand the target dataset or is making predictions based on random factors.Problem Summary: the complexity of knowledge transferringprocess raised concerns of model reliabilityTable 1: Four main issues that constrained the application of pre-train methods in the medical field are summarised here: 1. the performance gap between TL and SSL in different data modalities, 2. the domain mismatch gap between source and target domain, 3. the challenge of data imbalance scenarios, 4. the difficulty in model explainability and analysis.

artificial intelligence, inductive learning, machine learning, (18 more...)

2407.05592

Country:

Oceania > Australia > Queensland (0.04)
Oceania > Australia > New South Wales (0.04)
North America > United States > South Carolina (0.04)
Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Gastroenterology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Teo, Rachel S. Y., Abdullaev, Laziz U., Nguyen, Tan M.

The Blessing and Curse of Dimensionality in Safety Alignment

arXiv.org Machine LearningJul-29-2025

The focus on safety alignment in large language models (LLMs) has increased significantly due to their widespread adoption across different domains. The scale of LLMs play a contributing role in their success, and the growth in parameter count follows larger hidden dimensions. In this paper, we hypothesize that while the increase in dimensions has been a key advantage, it may lead to emergent problems as well. These problems emerge as the linear structures in the activation space can be exploited, in the form of activation engineering, to circumvent its safety alignment. Through detailed visualizations of linear subspaces associated with different concepts, such as safety, across various model scales, we show that the curse of high-dimensional representations uniquely impacts LLMs. Further substantiating our claim, we demonstrate that projecting the representations of the model onto a lower dimensional subspace can preserve sufficient information for alignment while avoiding those linear structures. Empirical results confirm that such dimensional reduction significantly reduces susceptibility to jailbreaking through representation engineering. Building on our empirical validations, we provide theoretical insights into these linear jailbreaking methods relative to a model's hidden dimensions. Broadly speaking, our work posits that the high dimensions of a model's internal representations can be both a blessing and a curse in safety alignment.

large language model, machine learning, natural language, (16 more...)

arXiv.org Machine Learning

2507.20333

Country:

Asia > Singapore (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
North America > Canada (0.04)
(8 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine LearningJul-29-2025

Meta Fusion: A Unified Framework For Multimodality Fusion with Mutual Learning

Liang, Ziyi, Qu, Annie, Shahbaba, Babak

Developing effective multimodal data fusion strategies has become increasingly essential for improving the predictive power of statistical machine learning methods across a wide range of applications, from autonomous driving to medical diagnosis. Traditional fusion methods, including early, intermediate, and late fusion, integrate data at different stages, each offering distinct advantages and limitations. In this paper, we introduce Meta Fusion, a flexible and principled framework that unifies these existing strategies as special cases. Motivated by deep mutual learning and ensemble learning, Meta Fusion constructs a cohort of models based on various combinations of latent representations across modalities, and further boosts predictive performance through soft information sharing within the cohort. Our approach is model-agnostic in learning the latent representations, allowing it to flexibly adapt to the unique characteristics of each modality. Theoretically, our soft information sharing mechanism reduces the generalization error. Empirically, Meta Fusion consistently outperforms conventional fusion strategies in extensive simulation studies. We further validate our approach on real-world applications, including Alzheimer's disease detection and neural decoding.

artificial intelligence, data mining, machine learning, (15 more...)

arXiv.org Machine Learning

2507.20089

Country:

North America > United States > California > Santa Barbara County > Santa Barbara (0.14)
North America > United States > California > Orange County > Irvine (0.14)
Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.04)
(5 more...)

Genre: Research Report > New Finding (0.92)

Industry:

Information Technology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.93)
(2 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Machine LearningJul-29-2025

IGNIS: A Robust Neural Network Framework for Constrained Parameter Estimation in Archimedean Copulas

Aich, Agnideep

Classical estimators, the cornerstones of statistical inference, face insurmountable challenges when applied to important emerging classes of Archimedean copulas. These models exhibit pathological properties, including numerically unstable densities, non-monotonic parameter-to-dependence mappings, and vanishingly small likelihood gradients, rendering methods like Maximum Likelihood (MLE) and Method of Moments (MoM) inconsistent or computationally infeasible. We introduce IGNIS, a unified neural estimation framework that sidesteps these barriers by learning a direct, robust mapping from data-driven dependency measures to the underlying copula parameter theta. IGNIS utilizes a multi-input architecture and a theory-guided output layer (softplus(z) + 1) to automatically enforce the domain constraint theta_hat >= 1. Trained and validated on four families (Gumbel, Joe, and the numerically challenging A1/A2), IGNIS delivers accurate and stable estimates for real-world financial and health datasets, demonstrating its necessity for reliable inference in modern, complex dependence models where traditional methods fail.

artificial intelligence, copula, machine learning, (14 more...)

arXiv.org Machine Learning

2505.22518

Country:

Oceania > New Zealand (0.04)
North America > United States > Louisiana > Lafayette Parish > Lafayette (0.04)

Genre: Research Report (0.82)

Industry:

Banking & Finance (1.00)
Health & Medicine > Public Health (0.46)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Gupta, Mohit, Bhowmick, Debjit, Beck, Ben

BikeVAE-GNN: A Variational Autoencoder-Augmented Hybrid Graph Neural Network for Sparse Bicycle Volume Estimation

Accurate link-level bicycle volume estimation is essential for informed urban and transport planning but it is challenged by extremely sparse count data in urban bicycling networks worldwide. We propose BikeVAE-GNN, a novel dual-task framework augmenting a Hybrid Graph Neural Network (GNN) with Variational Autoencoder (VAE) to estimate Average Daily Bicycle (ADB) counts, addressing sparse bicycle networks. The Hybrid-GNN combines Graph Convolutional Networks (GCN), Graph Attention Networks (GAT), and GraphSAGE to effectively model intricate spatial relationships in sparse networks while VAE generates synthetic nodes and edges to enrich the graph structure and enhance the estimation performance. BikeVAE-GNN simultaneously performs - regression for bicycling volume estimation and classification for bicycling traffic level categorization. We demonstrate the effectiveness of BikeVAE-GNN using OpenStreetMap data and publicly available bicycle count data within the City of Melbourne - where only 141 of 15,933 road segments have labeled counts (resulting in 99% count data sparsity). Our experiments show that BikeVAE-GNN outperforms machine learning and baseline GNN models, achieving a mean absolute error (MAE) of 30.82 bicycles per day, accuracy of 99% and F1-score of 0.99. Ablation studies further validate the effective role of Hybrid-GNN and VAE components. Our research advances bicycling volume estimation in sparse networks using novel and state-of-the-art approaches, providing insights for sustainable bicycling infrastructures.

artificial intelligence, bikevae-gnn, machine learning, (17 more...)

2507.19517

Country: Oceania > Australia (0.29)

Genre: Research Report (1.00)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

DrugPilot: LLM-based Parameterized Reasoning Agent for Drug Discovery

Li, Kun, Wu, Zhennan, Wang, Shoupeng, Wu, Jia, Pan, Shirui, Hu, Wenbin

Large language models (LLMs) integrated with autonomous agents hold significant potential for advancing scientific discovery through automated reasoning and task execution. However, applying LLM agents to drug discovery is still constrained by challenges such as large-scale multimodal data processing, limited task automation, and poor support for domain-specific tools. To overcome these limitations, we introduce DrugPilot, a LLM-based agent system with a parameterized reasoning architecture designed for end-to-end scientific workflows in drug discovery. DrugPilot enables multi-stage research processes by integrating structured tool use with a novel parameterized memory pool. The memory pool converts heterogeneous data from both public sources and user-defined inputs into standardized representations. This design supports efficient multi-turn dialogue, reduces information loss during data exchange, and enhances complex scientific decision-making. To support training and benchmarking, we construct a drug instruction dataset covering eight core drug discovery tasks. Under the Berkeley function-calling benchmark, DrugPilot significantly outperforms state-of-the-art agents such as ReAct and LoT, achieving task completion rates of 98.0%, 93.5%, and 64.0% for simple, multi-tool, and multi-turn scenarios, respectively. These results highlight DrugPilot's potential as a versatile agent framework for computational science domains requiring automated, interactive, and data-integrated reasoning.

large language model, machine learning, natural language, (19 more...)

2505.1394

Country: Oceania > Australia (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Dinh, Duc-Tai, Dinh, Duc Anh Khoa

ZSE-Cap: A Zero-Shot Ensemble for Image Retrieval and Prompt-Guided Captioning

We present ZSE-Cap (Zero-Shot Ensemble for Captioning), our 4th place system in Event-Enriched Image Analysis (EVENTA) shared task on article-grounded image retrieval and captioning. Our zero-shot approach requires no finetuning on the competition's data. For retrieval, we ensemble similarity scores from CLIP, SigLIP, and DINOv2. For captioning, we leverage a carefully engineered prompt to guide the Gemma 3 model, enabling it to link high-level events from the article to the visual content in the image. Our system achieved a final score of 0.42002, securing a top-4 position on the private test set, demonstrating the effectiveness of combining foundation models through ensembling and prompting. Our code is available at https://github.com/ductai05/ZSE-Cap.

artificial intelligence, large language model, natural language, (14 more...)

2507.20564

Country:

Oceania > Australia (0.28)
Europe > Switzerland (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

High-Performance Parallel Optimization of the Fish School Behaviour on the Setonix Platform Using OpenMP

Wang, Haitian, Qin, Long

This paper presents an in-depth investigation into the high-performance parallel optimization of the Fish School Behaviour (FSB) algorithm on the Setonix supercomputing platform using the OpenMP framework. Given the increasing demand for enhanced computational capabilities for complex, large-scale calculations across diverse domains, there's an imperative need for optimized parallel algorithms and computing structures. The FSB algorithm, inspired by nature's social behavior patterns, provides an ideal platform for parallelization due to its iterative and computationally intensive nature. This study leverages the capabilities of the Setonix platform and the OpenMP framework to analyze various aspects of multi-threading, such as thread counts, scheduling strategies, and OpenMP constructs, aiming to discern patterns and strategies that can elevate program performance. Experiments were designed to rigorously test different configurations, and our results not only offer insights for parallel optimization of FSB on Setonix but also provide valuable references for other parallel computational research using OpenMP. Looking forward, other factors, such as cache behavior and thread scheduling strategies at micro and macro levels, hold potential for further exploration and optimization.

artificial intelligence, reduction construct, scheduling strategy, (14 more...)

2507.20173

Country: Oceania > Australia (0.14)

Genre: Research Report > New Finding (0.35)

Industry: Education > Social Development & Welfare > Conduct & Behavior (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Architecture > Distributed Systems (0.49)