AITopics | image content

Reconstructing the Image Stitching Pipeline: Integrating Fusion and Rectangling into a Unified Inpainting Model

Neural Information Processing SystemsApr-30-2026, 13:40:27 GMT

Deep learning-based image stitching pipelines are typically divided into three cascading stages: registration, fusion, and rectangling. Each stage requires its own network training and is tightly coupled to the others, leading to error propagation and posing significant challenges to parameter tuning and system stability. This paper proposes the Simple and Robust Stitcher (SRStitcher), which revolutionizes the image stitching pipeline by simplifying the fusion and rectangling stages into a unified inpainting model, requiring no model training or fine-tuning. We reformulate the problem definitions of the fusion and rectangling stages and demonstrate that they can be effectively integrated into an inpainting task. Furthermore, we design the weighted masks to guide the reverse process in a pre-trained largescale diffusion model, implementing this integrated inpainting task in a single inference. Through extensive experimentation, we verify the interpretability and generalization capabilities of this unified model, demonstrating that SRStitcher outperforms state-of-the-art methods in both performance and stability.

artificial intelligence, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: Asia (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

e393677793767624f2821cec8bdd02f1-Paper-Conference.pdf

Neural Information Processing SystemsApr-30-2026, 02:26:54 GMT

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)

Add feedback

Image Restoration Using Very Deep Convolutional Encoder-Decoder Networks with Symmetric Skip Connections

Xiaojiao Mao, Chunhua Shen, Yu-Bin Yang

Neural Information Processing SystemsMar-23-2026, 01:25:24 GMT

In this paper, we propose a very deep fully convolutional encoding-decoding framework for image restoration such as denoising and super-resolution. The network is composed of multiple layers of convolution and deconvolution operators, learning end-to-end mappings from corrupted images to the original ones. The convolutional layers act as the feature extractor, which capture the abstraction of image contents while eliminating noises/corruptions. Deconvolutional layers are then used to recover the image details. We propose to symmetrically link convolutional and deconvolutional layers with skip-layer connections, with which the training converges much faster and attains a higher-quality local optimum. First, the skip connections allow the signal to be back-propagated to bottom layers directly, and thus tackles the problem of gradient vanishing, making training deep networks easier and achieving restoration performance gains consequently. Second, these skip connections pass image details from convolutional layers to deconvolutional layers, which is beneficial in recovering the original image. Significantly, with the large capacity, we can handle different levels of noises using a single model. Experimental results show that our network achieves better performance than recent state-of-the-art methods.

artificial intelligence, machine learning, skip connection, (16 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Reconstructing the Image Stitching Pipeline: Integrating Fusion and Rectangling into a Unified Inpainting Model

Neural Information Processing SystemsFeb-18-2026, 10:23:09 GMT

To address the error propagation problem, we identify image fusion as the key point for improvement.

information, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > Thailand > Phuket > Phuket (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction

Neural Information Processing SystemsFeb-17-2026, 15:21:20 GMT

This paper aims to efficiently enable Large Language Models (LLMs) to use multi-modal tools. Advanced proprietary LLMs, such as ChatGPT and GPT -4, have shown great potential for tool usage through sophisticated prompt engineering.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Asia > China > Guangdong Province > Shenzhen (0.05)
Asia > Middle East > Jordan (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

ae2d574d2c309f3a45880e4460efd176-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 13:17:41 GMT

artificial intelligence, imagenet-vid, machine learning, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.96)

Add feedback

Explicitly disentangling image content from translation and rotation with spatial-VAE

Tristan Bepler, Ellen Zhong, Kotaro Kelley, Edward Brignole, Bonnie Berger

Neural Information Processing SystemsFeb-12-2026, 07:07:48 GMT

Neural Information Processing Systems http://nips.cc/

dataset, latent variable, rotation and translation, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)
North America > Canada (0.04)

Industry: Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Vision (0.88)
(2 more...)

Add feedback

Explicitly disentangling image content from translation and rotation with spatial-VAE

Neural Information Processing SystemsDec-25-2025, 10:32:59 GMT

Given an image dataset, we are often interested in finding data generative factors that encode semantic content independently from pose variables such as rotation and translation. However, current disentanglement approaches do not impose any specific structure on the learned latent representations. We propose a method for explicitly disentangling image rotation and translation from other unstructured latent factors in a variational autoencoder (VAE) framework. By formulating the generative model as a function of the spatial coordinate, we make the reconstruction error differentiable with respect to latent translation and rotation parameters. This formulation allows us to train a neural network to perform approximate inference on these latent variables while explicitly constraining them to only represent rotation and translation. We demonstrate that this framework, termed spatial-VAE, effectively learns latent representations that disentangle image rotation and translation from content and improves reconstruction over standard VAEs on several benchmark datasets, including applications to modeling continuous 2-D views of proteins from single particle electron microscopy and galaxies in astronomical images.

image content, rotation and translation, translation and rotation, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.60)

Add feedback

Interpreting Structured Perturbations in Image Protection Methods for Diffusion Models

Martin, Michael R., Chan, Garrick, Ma, Kwan-Liu

arXiv.org Artificial IntelligenceDec-10-2025

Recent image protection mechanisms such as Glaze and Nightshade introduce imperceptible, adversarially designed perturbations intended to disrupt downstream text-to-image generative models. While their empirical effectiveness has been demonstrated, the internal structure, detectability, and representational behavior of these perturbations remain poorly understood. In this study, we demonstrated a systematic explainable AI analysis of image protection perturbations using a unified framework that integrates white-box feature-space inspection and black-box signal-level probing. Through latent-space clustering, feature-channel activation analysis, occlusion-based spatial sensitivity mapping, and frequency-domain spectral characterization, we revealed that modern protection mechanisms operate as structured, low-entropy perturbations that remain tightly coupled to underlying image content across representational, spatial, and spectral domains in all evaluated cases. We showed that protected images preserve content-driven feature organization with protection-specific substructure rather than inducing global representational drift. Detectability is governed by interacting effects of perturbation entropy, spatial deployment, and frequency alignment as revealed through combined synthetic and spectral analyses, with sequential protection amplifying detectable structure rather than suppressing it. Frequency-domain analysis further demonstrated that Glaze and Nightshade redistribute energy along dominant image-aligned frequency axes rather than introducing spectrally diffuse noise. These results suggested that contemporary image protection operates through structured feature-level deformation rather than semantic dislocation, providing mechanistic insight into why protection signals remain visually subtle yet consistently detectable. This work advances the interpretability of adversarial image protection and informs the design of future defenses and detection strategies for generative AI systems.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2512.08329

Country: North America > United States (0.46)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space

Liwei Wang, Alexander Schwing, Svetlana Lazebnik

Neural Information Processing SystemsNov-21-2025, 08:28:27 GMT

This paper explores image caption generation using conditional variational auto-encoders (CV AEs).

ag-cv ae, caption, gmm-cv ae, (11 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Add feedback

Filters

Collaborating Authors

image content

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Reconstructing the Image Stitching Pipeline: Integrating Fusion and Rectangling into a Unified Inpainting Model

e393677793767624f2821cec8bdd02f1-Paper-Conference.pdf

Image Restoration Using Very Deep Convolutional Encoder-Decoder Networks with Symmetric Skip Connections

Reconstructing the Image Stitching Pipeline: Integrating Fusion and Rectangling into a Unified Inpainting Model

GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction

ae2d574d2c309f3a45880e4460efd176-Supplemental-Conference.pdf

Explicitly disentangling image content from translation and rotation with spatial-VAE

Explicitly disentangling image content from translation and rotation with spatial-VAE

Interpreting Structured Perturbations in Image Protection Methods for Diffusion Models

Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space