AITopics | image inpainting

Nakada, Hyakka, Kubota, Marika

What Shape Is Optimal for Masks in Text Removal?

arXiv.org Artificial Intelligence, Dec-1-2025

The advent of generative models has dramatically improved the accuracy of image inpainting. In particular, reconstructing the original image by removing specific text from a document image is extremely important for industrial applications. However, most existing text-removal methods focus on deleting simple scene text that appears in images captured by a camera in an outdoor environment. There is little research dedicated to complex, practical images with dense text. Therefore, we created benchmark data for text removal from images that include a large amount of text. From the data, we found that text-removal performance is vulnerable to perturbations of the mask profile. Thus, for practical text-removal tasks, precise tuning of the mask shape is essential. This study developed a method to model highly flexible mask profiles and learn their parameters using Bayesian optimization. The resulting profiles were found to be character-wise masks. It was also found that the minimum cover of a text region is not optimal. Our research is expected to pave the way for a user-friendly guideline for manual masking.
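The finding that the minimum cover of a text region is not optimal can be illustrated with a toy comparison between a tight box mask and a padded one. This is a minimal sketch in plain Python; the `box_mask` helper, the `pad` parameter, and the grid size are assumptions made for illustration, not the mask parameterization or the Bayesian optimization procedure used in the paper.

```python
def box_mask(height, width, box, pad=0):
    """Return a height x width binary mask (1 = pixel to inpaint).

    box is (top, left, bottom, right) with exclusive bottom/right.
    pad (hypothetical parameter) grows the box by that many pixels
    on every side, clipped to the image bounds.
    """
    top, left, bottom, right = box
    top, left = max(0, top - pad), max(0, left - pad)
    bottom, right = min(height, bottom + pad), min(width, right + pad)
    return [
        [1 if top <= r < bottom and left <= c < right else 0 for c in range(width)]
        for r in range(height)
    ]


def mask_area(mask):
    """Count the masked pixels."""
    return sum(sum(row) for row in mask)


# Hypothetical character box on an 8 x 8 image.
tight = box_mask(8, 8, (3, 2, 5, 6))          # minimum cover: 2 x 4 pixels
padded = box_mask(8, 8, (3, 2, 5, 6), pad=1)  # grown by one pixel: 4 x 6 pixels
print(mask_area(tight), mask_area(padded))    # 8 24
```

In a tuning loop, a parameter such as `pad` would be one of the quantities searched over, with inpainting quality as the objective.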

machine learning, natural language, text removal

2511.22499

Country:

Europe (0.67)
Asia > Japan > Honshū (0.28)
North America > United States > California (0.28)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Agarwal, Umang, Sangore, Rudraksh, Laddha, Sumit

From Diffusion to One-Step Generation: A Comparative Study of Flow-Based Models with Application to Image Inpainting

arXiv.org Artificial Intelligence, Nov-27-2025

We present a comprehensive comparative study of three generative modeling paradigms: Denoising Diffusion Probabilistic Models (DDPM), Conditional Flow Matching (CFM), and MeanFlow. While DDPM and CFM require iterative sampling, MeanFlow enables direct one-step generation by modeling the average velocity over time intervals. We implement all three methods using a unified TinyUNet architecture (<1.5M parameters) on CIFAR-10, demonstrating that CFM achieves an FID of 24.15 with 50 steps, significantly outperforming DDPM (FID 402.98). MeanFlow achieves FID 29.15 with single-step sampling -- a 50X reduction in inference time. We further extend CFM to image inpainting, implementing mask-guided sampling with four mask types (center, random bbox, irregular, half). Our fine-tuned inpainting model achieves substantial improvements: PSNR increases from 4.95 to 8.57 dB on center masks (+73%), and SSIM improves from 0.289 to 0.418 (+45%), demonstrating the effectiveness of inpainting-aware training.
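The reported percentage gains can be checked with a few lines of arithmetic. The sketch below also includes the standard PSNR definition for reference; the function names and the flat-list image representation are assumptions for illustration and are not taken from the paper's code. Note that the percentages are relative changes of the reported values themselves (for PSNR, of the dB figure).

```python
import math


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def relative_gain_percent(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Values quoted in the abstract for center masks.
print(round(relative_gain_percent(4.95, 8.57)))    # 73  (PSNR, dB)
print(round(relative_gain_percent(0.289, 0.418)))  # 45  (SSIM)
```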

artificial intelligence, machine learning, meanflow

2511.21215

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Nov-20-2025, 22:23:59 GMT

Image Inpainting via Generative Multi-column Convolutional Neural Networks

In this paper, we propose a generative multi-column network for image inpainting.

generative multi-column convolutional neural network, image inpainting, name change

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)

Neural Information Processing Systems, Oct-2-2025, 22:47:33 GMT

5301c4d888f5204274439e6dcf5fdb54-Paper.pdf

plane, plane segmentation, segmentation

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Industry: Media (0.30)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Modak, Sourav, Saltık, Ahmet Oğuz, Stein, Anthony

Exploring Model Quantization in GenAI-based Image Inpainting and Detection of Arable Plants

arXiv.org Artificial Intelligence, Mar-4-2025

Deep learning-based weed control systems often suffer from limited training data diversity and constrained on-board computation, impacting their real-world performance. To overcome these challenges, we propose a framework that leverages Stable Diffusion-based inpainting to augment training data progressively in 10% increments -- up to an additional 200%, thus enhancing both the volume and diversity of samples. Our approach is evaluated on two state-of-the-art object detection models, YOLO11(l) and RT-DETR(l), using the mAP50 metric to assess detection performance. We explore quantization strategies (FP16 and INT8) for both the generative inpainting and detection models to strike a balance between inference speed and accuracy. Deployment of the downstream models on the Jetson Orin Nano demonstrates the practical viability of our framework in resource-constrained environments, ultimately improving detection accuracy and computational efficiency in intelligent weed management systems.
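The progressive augmentation described above (10% increments up to an additional 200%) implies 20 augmentation levels. A minimal sketch of that schedule follows; the function name, the tuple layout, and the base dataset size of 1000 images are hypothetical and only serve to make the arithmetic concrete.

```python
def augmentation_schedule(base_size, step_pct=10, max_pct=200):
    """List (percent, synthetic_images, total_images) for each augmentation level.

    Integer arithmetic avoids floating-point rounding in the image counts.
    """
    schedule = []
    for pct in range(step_pct, max_pct + 1, step_pct):
        extra = base_size * pct // 100  # inpainted samples added at this level
        schedule.append((pct, extra, base_size + extra))
    return schedule


# Hypothetical base dataset of 1000 real images.
levels = augmentation_schedule(1000)
print(len(levels))   # 20
print(levels[0])     # (10, 100, 1100)
print(levels[-1])    # (200, 2000, 3000)
```

Each level would correspond to one training run of the detector, so detection accuracy (for example mAP50) can be compared against the amount of synthetic data added.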

augmentation, rt-detr, yolo11

2503.0242

Country: Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)

Genre: Research Report (1.00)

Industry: Food & Agriculture > Agriculture (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems, Jan-18-2025, 05:41:43 GMT

Visual Prompting via Image Inpainting

detection, image inpainting, visual prompting

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.44)

Neural Information Processing Systems, Oct-7-2024, 13:06:56 GMT

Reviews: Image Inpainting via Generative Multi-column Convolutional Neural Networks

Summary: This submission tackles the problem of image inpainting. It adapts the idea of multi-scale predictions by running three branches that predict features at different scales, which are concatenated before two convolutions create the final prediction. The loss consists of three components. The method is evaluated on five diverse datasets, with several ablation studies analyzing the influence of different parts. Strengths: The ablation study shows the contribution of the different components well.

generative multi-column convolutional neural network, image inpainting, prediction

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)

arXiv.org Artificial Intelligence, Feb-15-2024

HI-GAN: Hierarchical Inpainting GAN with Auxiliary Inputs for Combined RGB and Depth Inpainting

Dash, Ankan, Gu, Jingyi, Wang, Guiling

Inpainting involves filling in missing pixels or areas in an image, a crucial technique employed in Mixed Reality environments for various applications, particularly in Diminished Reality (DR) where content is removed from a user's visual environment. Existing methods rely on digital replacement techniques which necessitate multiple cameras and incur high costs. AR devices and smartphones use ToF depth sensors to capture scene depth maps aligned with RGB images. Despite speed and affordability, ToF cameras create imperfect depth maps with missing pixels. To address the above challenges, we propose Hierarchical Inpainting GAN (HI-GAN), a novel approach comprising three GANs in a hierarchical fashion for RGBD inpainting. EdgeGAN and LabelGAN inpaint masked edge and segmentation label images respectively, while CombinedRGBD-GAN combines their latent representation outputs and performs RGB and Depth inpainting. Edge images and particularly segmentation label images as auxiliary inputs significantly enhance inpainting performance by complementary context and hierarchical optimization. We believe we make the first attempt to incorporate label images into the inpainting process. Unlike previous approaches requiring multiple sequential models and separate outputs, our work operates in an end-to-end manner, training all three models simultaneously and hierarchically. Specifically, EdgeGAN and LabelGAN are first optimized separately and further optimized inside CombinedRGBD-GAN to enhance inpainting quality. Experiments demonstrate that HI-GAN works seamlessly and achieves overall superior performance compared with existing approaches.

depth image, hi-gan, labelgan

2402.10334

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey (0.04)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Gupta, Shubham, Ravishankar, Rahul Kunigal, Gangaraju, Madhoolika, Dwarkanath, Poojasree, Subramanyam, Natarajan

WSSL: Weighted Self-supervised Learning Framework For Image-inpainting

arXiv.org Artificial Intelligence, Aug-24-2023

Image inpainting is the process of regenerating lost parts of the image. Supervised algorithm-based methods have shown excellent results but have two significant drawbacks. They do not perform well when tested with unseen data. They fail to capture the global context of the image, resulting in a visually unappealing result. We propose a novel self-supervised learning framework for image-inpainting: Weighted Self-Supervised Learning (WSSL) to tackle these problems. We designed WSSL to learn features from multiple weighted pretext tasks. These features are then utilized for the downstream task, image-inpainting. To improve the performance of our framework and produce more visually appealing images, we also present a novel loss function for image inpainting. The loss function takes advantage of both reconstruction loss and perceptual loss functions to regenerate the image. Our experimentation shows WSSL outperforms previous methods, and our loss function helps produce better results.
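The two ideas in this abstract, weighting several pretext-task losses and combining reconstruction with perceptual loss, can be sketched in plain Python. Everything specific here is an assumption for illustration: the weights, the use of L1 distance for both terms, and the `toy_features` function, which stands in for the pretrained feature extractor a perceptual loss would normally use. The abstract does not give the actual formulation.

```python
def l1_loss(pred, target):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def toy_features(pixels):
    """Hypothetical stand-in for a feature extractor: neighbor differences."""
    return [b - a for a, b in zip(pixels, pixels[1:])]


def combined_loss(pred, target, features=toy_features, w_rec=1.0, w_perc=0.5):
    """Weighted sum of a pixel-space and a feature-space distance.

    w_rec and w_perc are illustrative weights, not values from the paper.
    """
    reconstruction = l1_loss(pred, target)
    perceptual = l1_loss(features(pred), features(target))
    return w_rec * reconstruction + w_perc * perceptual


def weighted_pretext_loss(losses, weights):
    """Weighted sum of per-task losses, keyed by (hypothetical) task name."""
    return sum(weights[name] * value for name, value in losses.items())


print(combined_loss([0.0, 0.0, 0.0], [0.0, 0.0, 0.0]))  # 0.0
print(weighted_pretext_loss({"task_a": 1.0, "task_b": 2.0},
                            {"task_a": 0.5, "task_b": 0.25}))  # 1.0
```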

loss function, pretext task, representation

2211.13856

Country:

North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
Europe > Germany (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.84)