AITopics | unet

XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation

Neural Information Processing Systems, Mar-21-2026, 11:33:26 GMT

Existing methods for open vocabulary 3D semantic segmentation primarily concentrate on establishing a unified feature space encompassing 3D, 2D, and textual modalities. However, traditional techniques such as global feature alignment or vision-language model distillation impose only approximate correspondence and struggle to delineate fine-grained segmentation boundaries. To address this gap, we propose XMask3D, a cross-modal mask reasoning framework that performs a finer, mask-level alignment between 3D features and the 2D-text embedding space. We develop a mask generator based on the denoising UNet of a pre-trained diffusion model, leveraging its capability for precise textual control over dense pixel representations and enhancing the open-world adaptability of the generated masks. We further integrate 3D global features as implicit conditions into the pre-trained 2D denoising UNet, enabling the generation of segmentation masks with additional 3D geometry awareness. The generated 2D masks are then used to align mask-level 3D representations with the vision-language feature space, thereby augmenting the open vocabulary capability of 3D geometry embeddings.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.99)
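
The mask-level alignment described in the XMask3D abstract above can be illustrated with a small sketch. It assumes each 2D mask has already been mapped to a binary selection over 3D points and that each mask comes with a target embedding in the 2D-text space. The pooling rule, the cosine loss, and all function names here are illustrative assumptions, not the paper's actual implementation.

```python
import math


def mask_pool(point_features, mask):
    """Average the feature vectors of the points selected by a binary mask."""
    selected = [f for f, m in zip(point_features, mask) if m]
    if not selected:
        raise ValueError("mask selects no points")
    dim = len(selected[0])
    return [sum(f[c] for f in selected) / len(selected) for c in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def mask_alignment_loss(point_features, masks, mask_embeddings):
    """Mean (1 - cosine) between pooled 3D features and each mask's target.

    Hypothetical loss: the abstract only states that mask-level 3D
    representations are aligned with the vision-language feature space.
    """
    losses = []
    for mask, target in zip(masks, mask_embeddings):
        pooled = mask_pool(point_features, mask)
        losses.append(1.0 - cosine_similarity(pooled, target))
    return sum(losses) / len(losses)
```

With this formulation the loss is zero when every pooled 3D feature points in the same direction as its mask embedding, and grows as they diverge.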

ded98d28f82342a39f371c013dfb3058-Paper-Conference.pdf

Neural Information Processing Systems, Feb-17-2026, 13:13:18 GMT

machine learning, natural language, unet, (18 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Israel (0.04)
Asia > China > Guangdong Province (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
(2 more...)

Keypoint-Augmented Self-Supervised Learning for Medical Image Segmentation with Limited Annotation

Neural Information Processing Systems, Feb-16-2026, 21:18:45 GMT

Pretraining CNN models (e.g., UNet) through self-supervision has become a powerful approach to facilitate medical image segmentation under low annotation regimes.

artificial intelligence, machine learning, segmentation, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Arizona (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.68)

SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds

Yanyu Li, Huan Wang, Qing Jin

Neural Information Processing Systems, Feb-10-2026, 23:47:27 GMT

Not surprisingly, there are emerging efforts to speed up the inference of text-to-image diffusion models on mobile devices.

artificial intelligence, distillation, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (0.46)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

2cb274e6ce940f47beb8011d8ecb1462-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-7-2026, 22:54:14 GMT

artificial intelligence, constraint, relaxation, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.31)

ScaleLong: Towards More Stable Training of Diffusion Model via Scaling Network Long Skip Connection

Neural Information Processing Systems, Dec-26-2025, 23:35:15 GMT

In diffusion models, UNet is the most popular network backbone, since its long skip connections (LSCs), which connect distant network blocks, can aggregate long-distance information and alleviate vanishing gradients. Unfortunately, UNet often suffers from unstable training in diffusion models, which can be alleviated by scaling its LSC coefficients to smaller values. However, a theoretical understanding of the instability of UNet in diffusion models, and of the performance improvement from LSC scaling, is still missing. To address this, we theoretically show that the coefficients of LSCs in UNet have a large effect on the stability of forward and backward propagation and on the robustness of UNet. Specifically, the hidden features and gradients of UNet at any layer can oscillate, and their oscillation ranges are large, which explains the instability of UNet training. Moreover, UNet is provably sensitive to perturbed input and predicts an output distant from the desired output, yielding an oscillatory loss and thus oscillatory gradients. We also show the theoretical benefits of LSC coefficient scaling for the stability of hidden features and gradients and for robustness. Finally, inspired by our theory, we propose ScaleLong, an effective coefficient scaling framework that scales the coefficients of LSCs in UNet and improves its training stability. Experimental results on CIFAR10, CelebA, ImageNet, and COCO show that our methods stabilize training better and yield about 1.5x training acceleration on different diffusion models with UNet or UViT backbones.

diffusion model, scalelong, unet, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
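
The idea of scaling long skip connection (LSC) coefficients in the ScaleLong abstract above can be illustrated with a toy sketch. The exponentially decaying schedule `1, kappa, kappa**2, ...`, the stand-in encoder and decoder blocks, and the names `lsc_coefficients` and `toy_unet_forward` are assumptions made for illustration. The abstract itself only states that the coefficients are scaled smaller.

```python
def lsc_coefficients(num_skips, kappa):
    """One coefficient per long skip connection: 1, kappa, kappa**2, ...

    The decaying schedule is an illustrative assumption.
    """
    if not 0.0 < kappa <= 1.0:
        raise ValueError("kappa must lie in (0, 1]")
    return [kappa ** i for i in range(num_skips)]


def toy_unet_forward(x, num_levels=3, kappa=1.0):
    """Toy UNet on a list of floats; kappa=1.0 reproduces unscaled skips."""
    coeffs = lsc_coefficients(num_levels, kappa)
    h = list(x)
    skips = []
    # Encoder: each stand-in block doubles the activations and stores them
    # so the matching decoder level can reuse them through a long skip.
    for _ in range(num_levels):
        h = [2.0 * v for v in h]
        skips.append(h)
    # Decoder: each stand-in block halves its input and adds the skip
    # feature from the same level, multiplied by that level's coefficient.
    for i in reversed(range(num_levels)):
        h = [0.5 * a + coeffs[i] * s for a, s in zip(h, skips[i])]
    return h


print(toy_unet_forward([1.0], num_levels=3, kappa=1.0))  # [7.0]
print(toy_unet_forward([1.0], num_levels=3, kappa=0.5))  # [4.5]
```

In this toy setting, a smaller `kappa` shrinks the contribution of the deeper skips and therefore the magnitude of the output, which is the kind of damping the abstract associates with more stable training.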

Keypoint-Augmented Self-Supervised Learning for Medical Image Segmentation with Limited Annotation

Neural Information Processing Systems, Dec-26-2025, 16:30:34 GMT

Pretraining CNN models (e.g., UNet) through self-supervision has become a powerful approach to facilitate medical image segmentation under low annotation regimes. Recent contrastive learning methods encourage similar global representations when the same image undergoes different transformations, or enforce invariance across different image/patch features that are intrinsically correlated. However, CNN-extracted global and local features are limited in capturing long-range spatial dependencies that are essential in biological anatomy. To this end, we present a keypoint-augmented fusion layer that extracts representations preserving both short- and long-range self-attention. In particular, we augment the CNN feature map at multiple scales by incorporating an additional input that learns long-range spatial self-attention among localized keypoint features.

keypoint-augmented self-supervised learning, medical image segmentation, name change, (6 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.65)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Time-aware UNet and super-resolution deep residual networks for spatial downscaling

Sipilä, Mika, Maggio, Sabrina, De Iaco, Sandra, Nordhausen, Klaus, Palma, Monica, Taskinen, Sara

arXiv.org Machine Learning, Dec-17-2025

Satellite data of atmospheric pollutants are often available only at coarse spatial resolution, limiting their applicability in local-scale environmental analysis and decision-making. Spatial downscaling methods aim to transform the coarse satellite data into high-resolution fields. In this work, two widely used deep learning architectures, the super-resolution deep residual network (SRDRN) and the encoder-decoder-based UNet, are considered for spatial downscaling of tropospheric ozone. Both methods are extended with a lightweight temporal module, which encodes observation time using either sinusoidal or radial basis function (RBF) encoding, and fuses the temporal features with the spatial representations in the networks. The proposed time-aware extensions are evaluated against their baseline counterparts in a case study on ozone downscaling over Italy. The results suggest that, while only slightly increasing computational complexity, the temporal modules significantly improve downscaling performance and convergence speed.

architecture, convolutional layer, information, (15 more...)

arXiv.org Machine Learning

arXiv:2512.13753

Country:

Europe > Austria > Vienna (0.14)
Asia > China (0.04)
Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
(7 more...)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
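
The two time encodings named in the Time-aware UNet abstract above, sinusoidal and radial basis function (RBF), can be sketched as follows. The number of frequencies and centers, the normalization of time to one period, the RBF width, and the concatenation-based `fuse` helper are all assumptions for illustration. The abstract does not specify how the temporal features are fused with the spatial representations.

```python
import math


def sinusoidal_encoding(t, num_freqs=4, period=1.0):
    """Encode scalar time t as sin/cos pairs at harmonics k = 1..num_freqs."""
    feats = []
    for k in range(1, num_freqs + 1):
        angle = 2.0 * math.pi * k * t / period
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats


def rbf_encoding(t, num_centers=4, period=1.0, width=None):
    """Gaussian RBF activations around centers evenly spaced over one period."""
    if width is None:
        width = period / num_centers  # assumed default bandwidth
    centers = [period * (j + 0.5) / num_centers for j in range(num_centers)]
    return [math.exp(-((t - c) ** 2) / (2.0 * width ** 2)) for c in centers]


def fuse(spatial_features, time_features):
    """Append the time code to every pixel's channel vector (H x W x C lists).

    Hypothetical fusion rule, chosen only because it is the simplest one.
    """
    return [[pixel + time_features for pixel in row] for row in spatial_features]
```

For example, `fuse(feature_map, sinusoidal_encoding(0.25))` appends eight time channels to every pixel of `feature_map`, so later layers can condition on the observation time.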

Axial-UNet: A Neural Weather Model for Precipitation Nowcasting

Mamtani, Sumit, Sonawane, Maitreya

arXiv.org Artificial Intelligence, Dec-1-2025

Accurately predicting short-term precipitation is critical for weather-sensitive applications such as disaster management, aviation, and urban planning. Traditional numerical weather prediction can be computationally intensive at high resolution and short lead times. In this work, we propose a lightweight UNet-based encoder-decoder augmented with axial-attention blocks that attend along image rows and columns to capture long-range spatial interactions, while temporal context is provided by conditioning on multiple past radar frames. Our hybrid architecture captures both local and long-range spatio-temporal dependencies from radar image sequences, enabling fixed lead-time precipitation nowcasting with modest compute. Experimental results on a preprocessed subset of the HKO-7 radar dataset demonstrate that our model outperforms ConvLSTM, pix2pix-style cGANs, and a plain UNet in pixel-fidelity metrics, reaching PSNR 47.67 and SSIM 0.9943. We report PSNR/SSIM here; extending evaluation to meteorology-oriented skill measures (e.g., CSI/FSS) is left to future work. The approach is simple, scalable, and effective for resource-constrained, real-time forecasting scenarios.

artificial intelligence, machine learning, sequence, (19 more...)

arXiv.org Artificial Intelligence

arXiv:2504.19408

Genre: Research Report (0.82)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
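
The axial-attention blocks mentioned in the Axial-UNet abstract above attend along image rows and then along columns. A minimal sketch of that pattern follows, assuming a feature map stored as nested `H x W x C` lists and identity query/key/value projections. A real block would use learned projections, multiple heads, and positional information, none of which are shown here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_1d(seq):
    """Scaled dot-product self-attention over a sequence of C-dim vectors.

    Queries, keys, and values are the inputs themselves (an assumption
    made to keep the sketch free of learned parameters).
    """
    dim = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, seq)) for c in range(dim)])
    return out


def axial_attention(fmap):
    """Attend along each row, then along each column, of an H x W x C map."""
    rows = [attend_1d(row) for row in fmap]  # mixes information along width
    height, width = len(rows), len(rows[0])
    cols = [attend_1d([rows[i][j] for i in range(height)]) for j in range(width)]
    # cols is indexed [column][row]; transpose back to [row][column].
    return [[cols[j][i] for j in range(width)] for i in range(height)]
```

Attending over one axis at a time costs on the order of `H * W * (H + W)` score computations instead of `(H * W) ** 2` for full 2D attention, which is consistent with the abstract's emphasis on modest compute.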
