AITopics | self-supervised training

cfaea3a519edf73c3a0480ae8f00bc4e-Paper-Conference.pdf

Neural Information Processing Systems | Feb-17-2026, 05:17:04 GMT

artificial intelligence, gradient, machine learning, (15 more...)

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Self-Supervised Multi-Object Tracking with Cross-Input Consistency (Supplementary Material)

Favyen Bastani, Songtao He, Sam Madden

Neural Information Processing Systems | Feb-9-2026, 08:26:58 GMT

For each training sequence $\langle I_0, \ldots, I_n \rangle$, Only-Occlusion randomly selects four indexes $0 < k_1 \le k_2 < k_3 \le k_4 < n$ to construct two disjoint frame subsequences $\langle I_{k_1}, \ldots, I_{k_2} \rangle$ and $\langle I_{k_3}, \ldots, I_{k_4} \rangle$. Learning to merely compare detection features across consecutive frames would yield low accuracy since features in occluded frames are not observed. This strategy yields high consistency because it is unaffected by occluded intermediate frames. We select two indexes $0 < k_5, k_6 < n$. Then, we randomly pick $k_5$ and $k_6$ such that $k_3 \le k_5 \le k_4$ and $k_1 \le k_6 \le k_2$, i.e., the hand-off for one tracker occurs when the other tracker observes a simulated occlusion.
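
The index selection in this excerpt can be illustrated with a minimal sketch. It assumes the constraints read 0 < k1 <= k2 < k3 <= k4 < n, k3 <= k5 <= k4 and k1 <= k6 <= k2, and that the indexes are drawn uniformly; the excerpt does not state the sampling distribution. The function name `sample_occlusion_indexes` is hypothetical and not taken from the paper.

```python
import random


def sample_occlusion_indexes(n, rng=None):
    """Sample (k1, ..., k6) for a frame sequence I_0..I_n.

    Assumed constraints (hypothetical reading of the excerpt):
      0 < k1 <= k2 < k3 <= k4 < n,  k3 <= k5 <= k4,  k1 <= k6 <= k2.
    Uniform sampling is an assumption, not the authors' stated choice.
    """
    rng = rng or random.Random()
    if n < 3:
        # Two disjoint, non-empty index ranges must fit inside 1..n-1.
        raise ValueError("n must be at least 3")
    # Rejection sampling: draw four sorted indexes until the two
    # subsequences [k1, k2] and [k3, k4] are disjoint.
    while True:
        k1, k2, k3, k4 = sorted(rng.randint(1, n - 1) for _ in range(4))
        if k2 < k3:
            break
    k5 = rng.randint(k3, k4)  # hand-off index inside the second subsequence
    k6 = rng.randint(k1, k2)  # hand-off index inside the first subsequence
    return k1, k2, k3, k4, k5, k6


print(sample_occlusion_indexes(20, random.Random(0)))
```

Rejection sampling keeps the sketch short; any scheme that respects the same ordering would serve equally well for illustration.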

artificial intelligence, machine learning, supplementarymaterial, (14 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (0.35)
Information Technology > Artificial Intelligence > Machine Learning (0.32)

Analyzing the Sample Complexity of Self-Supervised Image Reconstruction Methods

Neural Information Processing Systems | Dec-26-2025, 20:14:28 GMT

Supervised training of deep neural networks on pairs of clean images and noisy measurements achieves state-of-the-art performance for many image reconstruction tasks, but such training pairs are difficult to collect. Self-supervised methods enable training based on noisy measurements only, without clean images. In this work, we investigate the cost of self-supervised training in terms of sample complexity for a class of self-supervised methods that enable the computation of unbiased estimates of gradients of the supervised loss, including noise2noise methods. We analytically show that a model trained with such self-supervised training is as good as the same model trained in a supervised fashion, but self-supervised training requires more examples than supervised training. We then study self-supervised denoising and accelerated MRI empirically and characterize the cost of self-supervised training in terms of the number of additional samples required. We find that the performance gap between self-supervised and supervised training vanishes as the number of training examples grows, at a problem-dependent rate, as predicted by our theory.
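
The claim that noise2noise-style training yields unbiased estimates of the supervised gradient can be checked numerically in a toy setting. The sketch below is an illustration under stated assumptions, not the paper's experiment: a scalar model f(y) = theta * y, a standard Gaussian signal x, and two independent zero-mean Gaussian noisy copies y1 and y2. The function name `mean_gradients` and all parameter values are hypothetical.

```python
import random


def mean_gradients(theta, num_samples=100_000, noise_std=0.5, seed=0):
    """Average squared-loss gradients w.r.t. theta for a toy scalar denoiser.

    Supervised loss:   (theta * y1 - x)^2   (needs the clean signal x)
    Noise2noise loss:  (theta * y1 - y2)^2  (needs only a second noisy copy)
    """
    rng = random.Random(seed)
    supervised = 0.0
    noise2noise = 0.0
    for _ in range(num_samples):
        x = rng.gauss(0.0, 1.0)             # clean signal
        y1 = x + rng.gauss(0.0, noise_std)  # noisy input
        y2 = x + rng.gauss(0.0, noise_std)  # independent noisy target
        supervised += 2.0 * (theta * y1 - x) * y1
        noise2noise += 2.0 * (theta * y1 - y2) * y1
    return supervised / num_samples, noise2noise / num_samples


sup_grad, n2n_grad = mean_gradients(theta=0.3)
print(f"supervised: {sup_grad:.4f}  noise2noise: {n2n_grad:.4f}")
```

The two per-sample gradients differ by -2 * (y2 - x) * y1, which has zero mean because the target noise is zero-mean and independent of y1. The averages therefore agree up to Monte Carlo error, while the extra noise term raises the variance of the self-supervised estimate, consistent with the abstract's point that more examples are needed.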

name change, sample complexity, self-supervised image reconstruction method, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.60)

Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations

Neural Information Processing Systems | Dec-24-2025, 10:16:22 GMT

We present a neural analysis and synthesis (NANSY) framework that can manipulate the voice, pitch, and speed of an arbitrary speech signal. Most previous work has focused on using an information bottleneck to disentangle analysis features for controllable synthesis, which usually results in poor reconstruction quality. We address this issue by proposing a novel training strategy based on information perturbation. The idea is to perturb information in the original input signal (e.g., formant, pitch, and frequency response), thereby letting synthesis networks selectively take essential attributes to reconstruct the input signal. Because NANSY does not need any bottleneck structures, it enjoys both high reconstruction quality and controllability. Furthermore, NANSY does not require any labels associated with speech data, such as text and speaker information, but rather uses a new set of analysis features, i.e., the wav2vec feature and a newly proposed pitch feature, Yingram, which allows for fully self-supervised training. Taking advantage of fully self-supervised training, NANSY can be easily extended to a multilingual setting by simply training it on a multilingual dataset. The experiments show that NANSY achieves significant performance improvements in several applications such as zero-shot voice conversion, pitch shift, and time-scale modification.

neural analysis and synthesis, reconstructing speech, self-supervised representation, (9 more...)

Technology: Information Technology > Artificial Intelligence (0.77)

cfaea3a519edf73c3a0480ae8f00bc4e-Paper-Conference.pdf

Neural Information Processing Systems | Oct-9-2025, 08:02:11 GMT

artificial intelligence, gradient, machine learning, (15 more...)

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

71e09b16e21f7b6919bbfc43f6a5b2f0-Supplemental.pdf

Neural Information Processing Systems | Aug-15-2025, 04:02:49 GMT

detection, detector, tracker model, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Analyzing the Sample Complexity of Self-Supervised Image Reconstruction Methods

Neural Information Processing Systems | Jan-19-2025, 22:48:41 GMT

Supervised training of deep neural networks on pairs of clean images and noisy measurements achieves state-of-the-art performance for many image reconstruction tasks, but such training pairs are difficult to collect. Self-supervised methods enable training based on noisy measurements only, without clean images. In this work, we investigate the cost of self-supervised training in terms of sample complexity for a class of self-supervised methods that enable the computation of unbiased estimates of gradients of the supervised loss, including noise2noise methods. We analytically show that a model trained with such self-supervised training is as good as the same model trained in a supervised fashion, but self-supervised training requires more examples than supervised training. We then study self-supervised denoising and accelerated MRI empirically and characterize the cost of self-supervised training in terms of the number of additional samples required. We find that the performance gap between self-supervised and supervised training vanishes as the number of training examples grows, at a problem-dependent rate, as predicted by our theory.

sample complexity, self-supervised image reconstruction method, self-supervised training, (2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.64)

Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations

Neural Information Processing Systems | Jan-13-2025, 20:17:13 GMT

We present a neural analysis and synthesis (NANSY) framework that can manipulate the voice, pitch, and speed of an arbitrary speech signal. Most previous work has focused on using an information bottleneck to disentangle analysis features for controllable synthesis, which usually results in poor reconstruction quality. We address this issue by proposing a novel training strategy based on information perturbation. The idea is to perturb information in the original input signal (e.g., formant, pitch, and frequency response), thereby letting synthesis networks selectively take essential attributes to reconstruct the input signal. Because NANSY does not need any bottleneck structures, it enjoys both high reconstruction quality and controllability.

analysis and synthesis, neural analysis and synthesis, self-supervised representation, (8 more...)

Technology: Information Technology > Artificial Intelligence (0.61)

Learned 3D volumetric recovery of clouds and its uncertainty for climate analysis

Ronen, Roi, Koren, Ilan, Levis, Aviad, Eytan, Eshkol, Holodovsky, Vadim, Schechner, Yoav Y.

arXiv.org Artificial Intelligence | Mar-9-2024

Significant uncertainty in climate prediction and cloud physics is tied to observational gaps relating to shallow scattered clouds. Addressing these challenges requires remote sensing of their three-dimensional (3D) heterogeneous volumetric scattering content. This calls for passive scattering computed tomography (CT). We design a learning-based model (ProbCT) to achieve CT of such clouds, based on noisy multi-view spaceborne images. ProbCT infers - for the first time - the posterior probability distribution of the heterogeneous extinction coefficient, per 3D location. This yields arbitrary valuable statistics, e.g., the 3D field of the most probable extinction and its uncertainty. ProbCT uses a neural-field representation, making essentially real-time inference. ProbCT undergoes supervised training by a new labeled multi-class database of physics-based volumetric fields of clouds and their corresponding images. To improve out-of-distribution inference, we incorporate self-supervised learning through differential rendering. We demonstrate the approach in simulations and on real-world data, and indicate the relevance of 3D recovery and uncertainty to precipitation and renewable energy.

cloud, probability distribution, voxel, (15 more...)

arXiv.org Artificial Intelligence

2403.05932

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Barbados (0.04)
North America > United States > Hawaii (0.04)
(6 more...)

Genre: Research Report (0.50)

Industry:

Energy > Renewable > Solar (1.00)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Certifiable 3D Object Pose Estimation: Foundations, Learning Models, and Self-Training

Talak, Rajat, Peng, Lisa, Carlone, Luca

arXiv.org Artificial Intelligence | Apr-28-2023

We consider a certifiable object pose estimation problem, where -- given a partial point cloud of an object -- the goal is to not only estimate the object pose, but also to provide a certificate of correctness for the resulting estimate. Our first contribution is a general theory of certification for end-to-end perception models. In particular, we introduce the notion of $\zeta$-correctness, which bounds the distance between an estimate and the ground truth. We show that $\zeta$-correctness can be assessed by implementing two certificates: (i) a certificate of observable correctness, that asserts if the model output is consistent with the input data and prior information, (ii) a certificate of non-degeneracy, that asserts whether the input data is sufficient to compute a unique estimate. Our second contribution is to apply this theory and design a new learning-based certifiable pose estimator. We propose C-3PO, a semantic-keypoint-based pose estimation model, augmented with the two certificates, to solve the certifiable pose estimation problem. C-3PO also includes a keypoint corrector, implemented as a differentiable optimization layer, that can correct large detection errors (e.g. due to the sim-to-real gap). Our third contribution is a novel self-supervised training approach that uses our certificate of observable correctness to provide the supervisory signal to C-3PO during training. In it, the model trains only on the observably correct input-output pairs, in each training iteration. As training progresses, we see that the observably correct input-output pairs grow, eventually reaching near 100% in many cases. Our experiments show that (i) standard semantic-keypoint-based methods outperform more recent alternatives, (ii) C-3PO further improves performance and significantly outperforms all the baselines, and (iii) C-3PO's certificates are able to discern correct pose estimates.
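
The self-training idea in this abstract, training only on input-output pairs that pass the certificate of observable correctness, can be sketched as a filtering step. The code below is a toy illustration and not C-3PO itself: `certified_pairs`, `toy_model`, `toy_certificate`, the 1-D template, and the tolerance are all hypothetical stand-ins chosen so the example runs on its own.

```python
from typing import Callable, List, Sequence, Tuple

Points = Sequence[float]


def certified_pairs(
    inputs: Sequence[Points],
    model: Callable[[Points], float],
    certificate: Callable[[Points, float], bool],
) -> List[Tuple[Points, float]]:
    """Keep only the (input, output) pairs that pass the certificate."""
    pairs = []
    for x in inputs:
        y = model(x)
        if certificate(x, y):
            pairs.append((x, y))
    return pairs


# Toy stand-ins (assumptions for illustration only): the "object" is a
# 1-D template and the "pose" is a single translation.
TEMPLATE = [-1.0, 0.0, 1.0]


def toy_model(points: Points) -> float:
    # The template centroid is 0, so the input centroid estimates the shift.
    return sum(points) / len(points)


def toy_certificate(points: Points, shift: float, tol: float = 0.1) -> bool:
    # Consistency check: the shifted template must match the input points.
    return all(abs(p - (t + shift)) <= tol for p, t in zip(points, TEMPLATE))


inputs = [[1.0, 2.0, 3.0], [0.0, 1.0, 5.0]]
selected = certified_pairs(inputs, toy_model, toy_certificate)
print(selected)
```

In a training loop, only the pairs in `selected` would contribute to the loss in a given iteration; the second input is rejected because the estimated shift does not reproduce it from the template.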

artificial intelligence, certificate, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2206.11215

Country:

North America > United States > Massachusetts (0.28)
North America > United States > California (0.27)

Genre:

Research Report (0.63)
Personal > Honors (0.46)

Industry:

Transportation (0.46)
Energy > Oil & Gas > Upstream (0.35)
Leisure & Entertainment (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision > Video Understanding (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
