A Proof of Theorems
We still need to demonstrate that the properties in the PAC-Bayes analysis hold for both the margin operator and the robust margin operator. Then we complete the proof of Lemma 6.1. The proofs of Lemmas 7.1 and 7.2 are similar. We provide the proof of Lemma 7.2 below; Lemma 7.1 follows the proof of Lemma 7.2 by replacing the robust margin operator with the margin operator. Since the above bound holds for any x in the domain X, the following holds almost surely. The second inequality is the tail bound above.
Spectral Bias Outside the Training Set for Deep Networks in the Kernel Regime
We provide quantitative bounds measuring the $L^2$ difference in function space between the trajectory of a finite-width network trained on finitely many samples and the idealized kernel dynamics of infinite width and infinite data. An implication of the bounds is that the network is biased to learn the top eigenfunctions of the Neural Tangent Kernel not just on the training set but over the entire input space. This bias depends only on the model architecture and the input distribution, and thus does not depend on the target function, which does not need to be in the RKHS of the kernel. The result is valid for deep architectures with fully connected, convolutional, and residual layers. Furthermore, the width does not need to grow polynomially with the number of samples in order to obtain high-probability bounds up to a stopping time. The proof exploits the low-effective-rank property of the Fisher Information Matrix at initialization, which implies a low effective dimension of the model (far smaller than the number of parameters). We conclude that local capacity control from the low effective rank of the Fisher Information Matrix is still underexplored theoretically.
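The spectral bias described above can be sketched with the standard kernel-regime decomposition (a textbook idealization, not the paper's finite-width bound): under gradient flow on squared loss with a fixed NTK, the residual decays independently along each kernel eigenfunction, so components with large eigenvalues are learned first.

```latex
% Idealized kernel dynamics: K has eigendecomposition
K = \sum_i \lambda_i \, \varphi_i \otimes \varphi_i ,
% and the residual at time t decays coordinate-wise:
f_t - f^* = \sum_i e^{-\lambda_i t} \,
            \langle f_0 - f^*,\, \varphi_i \rangle_{L^2}\, \varphi_i .
```

Since the decay rate along $\varphi_i$ is $\lambda_i$, the top eigenfunctions of the NTK dominate early training, which is the bias the bounds transfer from the training set to the whole input space.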
A Vector Symbolic Approach to Multiple Instance Learning
Dhrubo, Ehsan Ahmed, Alam, Mohammad Mahmudul, Raff, Edward, Oates, Tim, Holt, James
Multiple Instance Learning (MIL) tasks impose a strict logical constraint: a bag is labeled positive if and only if at least one instance within it is positive. While this iff constraint aligns with many real-world applications, recent work has shown that most deep learning-based MIL approaches violate it, leading to inflated performance metrics and poor generalization. We propose a novel MIL framework based on Vector Symbolic Architectures (VSAs), which provide a differentiable mechanism for performing symbolic operations in high-dimensional space. Our method encodes the MIL assumption directly into the model's structure by representing instances and concepts as nearly orthogonal high-dimensional vectors and using algebraic operations to enforce the iff constraint during classification. To bridge the gap between raw data and VSA representations, we design a learned encoder that transforms input instances into VSA-compatible vectors while preserving key distributional properties. Our approach, which includes a VSA-driven MaxNetwork classifier, achieves state-of-the-art results for a valid MIL model on standard MIL benchmarks and medical imaging datasets, outperforming existing methods while maintaining strict adherence to the MIL formulation. This work offers a principled, interpretable, and effective alternative to existing MIL approaches that rely on learned heuristics.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Oceania > New Zealand > North Island > Waikato (0.04)
- North America > United States > Maryland > Baltimore County (0.04)
- (4 more...)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (0.88)
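The algebraic enforcement of the iff constraint can be illustrated with a minimal numpy sketch (the names `positive_concept`, `encode`, and `bag_score` are hypothetical stand-ins, not the paper's API): in high dimension, random vectors are nearly orthogonal, so an instance's similarity to a concept vector is near zero unless the instance actually encodes that concept, and taking the max over instances realizes "the bag is positive iff at least one instance is positive".

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4096  # high dimension: independent random vectors are nearly orthogonal

# hypothetical concept vector for "positive instance"
positive_concept = rng.standard_normal(D)

def encode(instance_seed):
    # stand-in for the learned encoder: a deterministic random vector
    return np.random.default_rng(instance_seed).standard_normal(D)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def bag_score(instances):
    # max over instance similarities: high iff at least one instance
    # matches the positive concept, which mirrors the MIL constraint
    return max(cosine(v, positive_concept) for v in instances)

# model a "positive" instance as the concept plus noise
pos = positive_concept + 0.3 * rng.standard_normal(D)
neg1, neg2 = encode(1), encode(2)

print(bag_score([neg1, pos, neg2]))  # close to 1
print(bag_score([neg1, neg2]))       # close to 0
```

The max acts as a differentiable-in-practice surrogate for the existential quantifier, which is what the paper's MaxNetwork classifier makes explicit.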
Trustworthiness Preservation by Copies of Machine Learning Systems
Ceragioli, Leonardo, Primiero, Giuseppe
A common practice in ML systems development concerns the training of the same model under different data sets, and the use of the same (training and test) sets for different learning models. The first case is a desirable practice for identifying high-quality and unbiased training conditions. The latter coincides with the search for optimal models under a common dataset for training. These differently obtained systems have been considered akin to copies. In the quest for responsible AI, a legitimate but hardly investigated question is how to verify that trustworthiness is preserved by copies. In this paper we introduce a calculus to model and verify probabilistic complex queries over data and define four distinct notions: Justifiably, Equally, Weakly and Almost Trustworthy, which can be checked by analysing the (partial) behaviour of the copy with respect to its original. We provide a study of the relations between these notions of trustworthiness, and of how they compose with each other and under logical operations. The aim is to offer a computational tool to check the trustworthiness of possibly complex systems copied from an original whose behaviour is known.
- Europe > Italy > Lazio > Rome (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Estonia > Harju County > Tallinn (0.04)
- Asia > Japan (0.04)
- Health & Medicine (0.73)
- Banking & Finance (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
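The flavor of checking a copy against its original can be illustrated with a toy tolerance test (an illustrative stand-in only, not the paper's formal calculus or definitions): compare the probability each system assigns to a shared set of queries and accept the copy when every query agrees within a bound.

```python
# illustrative stand-in: a "system" is a map from query id to probability
def within_tolerance(original, copy, eps):
    """True iff the copy's probability for every query of the
    original lies within eps of the original's probability."""
    return all(abs(original[q] - copy[q]) <= eps for q in original)

orig   = {"q1": 0.90, "q2": 0.75, "q3": 0.10}
copy_a = {"q1": 0.88, "q2": 0.77, "q3": 0.12}   # close on all queries
copy_b = {"q1": 0.60, "q2": 0.75, "q3": 0.10}   # deviates on q1

print(within_tolerance(orig, copy_a, 0.05))  # True
print(within_tolerance(orig, copy_b, 0.05))  # False
```

The paper's four notions refine this kind of comparison by distinguishing, among other things, which part of the copy's behaviour is observed and how deviations compose under logical operations.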
CLIMB: Data Foundations for Large Scale Multimodal Clinical Foundation Models
Dai, Wei, Chen, Peilin, Lu, Malinda, Li, Daniel, Wei, Haowen, Cui, Hejie, Liang, Paul Pu
Recent advances in clinical AI have enabled remarkable progress across many clinical domains. However, existing benchmarks and models are primarily limited to a small set of modalities and tasks, which hinders the development of large-scale multimodal methods that can make holistic assessments of patient health and well-being. To bridge this gap, we introduce Clinical Large-Scale Integrative Multimodal Benchmark (CLIMB), a comprehensive clinical benchmark unifying diverse clinical data across imaging, language, temporal, and graph modalities. CLIMB comprises 4.51 million patient samples totaling 19.01 terabytes distributed across 2D imaging, 3D video, time series, graphs, and multimodal data. Through extensive empirical evaluation, we demonstrate that multitask pretraining significantly improves performance on understudied domains, achieving up to 29% improvement in ultrasound and 23% in ECG analysis over single-task learning. Pretraining on CLIMB also effectively improves models' generalization capability to new tasks, and strong unimodal encoder performance translates well to multimodal performance when paired with task-appropriate fusion strategies. Our findings provide a foundation for new architecture designs and pretraining strategies to advance clinical AI research. Code is released at https://github.com/DDVD233/climb.
- South America > Brazil (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > Vietnam > Hanoi > Hanoi (0.04)
- (23 more...)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Therapeutic Area > Dermatology (1.00)
- (7 more...)
Feature Fusion Attention Network with CycleGAN for Image Dehazing, De-Snowing and De-Raining
This paper presents a novel approach to image dehazing by combining Feature Fusion Attention (FFA) networks with a CycleGAN architecture. Our method leverages both supervised and unsupervised learning techniques to effectively remove haze from images while preserving crucial image details. The proposed hybrid architecture demonstrates significant improvements in image quality metrics, achieving superior PSNR and SSIM scores compared to traditional dehazing methods. Through extensive experimentation on the RESIDE and Dense-Haze CVPR 2019 datasets, we show that our approach effectively handles both synthetic and real-world hazy images. CycleGAN handles the unpaired nature of hazy and clean images, enabling the model to learn mappings even without paired data.
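The mechanism that lets CycleGAN learn from unpaired data is the cycle-consistency loss: mapping hazy to clean and back should reproduce the input. A minimal numpy sketch (the toy generators `G` and `F` are stand-ins for the learned networks):

```python
import numpy as np

def cycle_consistency_loss(x, G, F):
    # L1 distance || F(G(x)) - x ||_1: penalizes the hazy->clean->hazy
    # round trip for drifting away from the original image
    return float(np.mean(np.abs(F(G(x)) - x)))

# toy generators: G maps hazy->clean, F maps clean->hazy (stand-ins)
G = lambda img: img * 1.2 - 0.1
F = lambda img: (img + 0.1) / 1.2

x = np.random.default_rng(0).random((8, 8))
print(cycle_consistency_loss(x, G, F))  # ~0, since F inverts G exactly
```

In training, this loss is added to the adversarial losses of both generators, which is what removes the need for paired hazy/clean images.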
Data Programming: Creating Large Training Sets, Quickly
Large labeled training sets are the critical building blocks of supervised learning methods and are key enablers of deep learning techniques. For some applications, creating labeled training sets is the most time-consuming and expensive part of applying machine learning. We therefore propose a paradigm for the programmatic creation of training sets called data programming in which users provide a set of labeling functions, which are programs that heuristically label subsets of the data, but that are noisy and may conflict. By viewing these labeling functions as implicitly describing a generative model for this noise, we show that we can recover the parameters of this model to "denoise" the generated training set, and establish theoretically that we can recover the parameters of these generative models in a handful of settings. We then show how to modify a discriminative loss function to make it noise-aware, and demonstrate our method over a range of discriminative models including logistic regression and LSTMs.
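The core idea can be sketched in a few lines (the labeling functions and the fixed accuracies below are hypothetical; the paper's generative model instead *estimates* accuracies from agreements and conflicts among the labeling functions):

```python
import numpy as np

# three hypothetical labeling functions: return +1, -1, or 0 (abstain)
def lf_keyword(x): return 1 if "refund" in x else 0
def lf_length(x):  return -1 if len(x) < 10 else 0
def lf_exclaim(x): return 1 if "!" in x else 0

LFS = [lf_keyword, lf_length, lf_exclaim]

def label_matrix(examples):
    # one row per example, one column per labeling function
    return np.array([[lf(x) for lf in LFS] for x in examples])

def probabilistic_labels(L, accuracies):
    # weighted vote with log-odds weights, squashed to a probability --
    # a simple stand-in for inference under the learned generative model
    w = np.log(accuracies / (1 - accuracies))
    return 1 / (1 + np.exp(-2 * (L @ w)))  # P(y = +1 | L)

docs = ["please issue a refund!", "ok", "great service!"]
L = label_matrix(docs)
probs = probabilistic_labels(L, np.array([0.9, 0.6, 0.7]))
```

These probabilistic labels then feed the noise-aware discriminative loss: instead of training on hard labels, the downstream model's loss is weighted by `probs`.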
Reviews: Data Programming: Creating Large Training Sets, Quickly
This is an interesting work. The author motivates the problem of how the limited availability of large labeled training sets can be a hindrance to many supervised ML systems and deep learning techniques, and data programming can be an interesting approach here. The user study, indicating that researchers find it easier to write labeling heuristics than to generate ground truth through crowdsourcing or otherwise, is a good indication of the utility of this technique. The writing is clear and easy to follow, and the experiments are thorough.