
Training Image Estimators without Image Ground Truth

Neural Information Processing Systems

Deep neural networks have been very successful in compressive-sensing and image restoration applications, as a means to estimate images from partial, blurry, or otherwise degraded measurements. These networks are trained on a large number of corresponding pairs of measurements and ground-truth images, and thus implicitly learn to exploit domain-specific image statistics. But unlike measurement data, it is often expensive or impractical to collect a large training set of ground-truth images in many application settings. In this paper, we introduce an unsupervised framework for training image estimation networks from a training set that contains only measurements---with two varied measurements per image---but no ground truth for the full images desired as output. We demonstrate that our framework can be applied to both regular and blind image estimation tasks, where in the latter case the parameters of the measurement model (e.g., the blur kernel) are unknown during inference and, potentially, also during training. We evaluate our framework by training networks for compressive sensing and blind deconvolution, considering both non-blind and blind training for the latter. Our framework yields models that are nearly as accurate as those from fully supervised training, despite not having access to any ground-truth images.
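The two-measurements-per-image idea can be illustrated with a toy linear sketch (this is a hypothetical illustration, not the paper's actual architecture or training procedure): each latent image is observed through two different random compressive measurement matrices; the estimator reconstructs from the first measurement, the reconstruction is re-measured with the second matrix, and the loss compares that re-measurement to the second observation, so no ground-truth image ever enters the loss.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 32, 16                        # image size, measurements per shot (m < d)

def measure(x):
    """One compressive measurement (A, y) of image x with a fresh random A."""
    A = rng.standard_normal((m, d)) / np.sqrt(m)
    return A, A @ x

# Linear toy "network": x_hat = W @ (A.T @ y), i.e. a learned correction
# applied to the backprojected measurement. W is all we train.
W = np.eye(d)
lr = 5e-4
for _ in range(6000):
    x = rng.standard_normal(d)       # latent image: generates data, never in loss
    A1, y1 = measure(x)              # first measurement -> estimator input
    A2, y2 = measure(x)              # second, varied measurement -> supervision
    z = A1.T @ y1                    # backprojection of the first measurement
    r = A2 @ (W @ z) - y2            # "swap" residual: re-measure the estimate
    W -= lr * np.outer(A2.T @ r, z)  # gradient of 0.5*||r||^2 w.r.t. W
```

On held-out samples, the trained correction `W` should give a lower reconstruction error than raw backprojection, even though no ground-truth image was ever used during training.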


Reviews: Training Image Estimators without Image Ground Truth

Neural Information Processing Systems

Originality: The paper is mainly based on the idea presented in [14] and could be considered a generalization of it. Section 3.2 is the part that makes this paper's originality clear. Quality: Quality is the issue that makes the reviewer believe this paper is not ready for publication yet. Here are the issues: First of all, there are a few previous works on the exact same problem that are neither cited nor compared against in this manuscript. These papers do not even need ground-truth data or two sets of measurements (unlike the submitted paper) and have shown impressive results.


Reviews: Training Image Estimators without Image Ground Truth

Neural Information Processing Systems

This work introduces a new method to learn image restoration methods from only corrupted data sets. It is an exciting idea that could potentially open up new applications for deep learning methods in settings where it is not possible to obtain ground truth data. Three reviewers initially assessed the work as 5/9/6. Based on a strong author rebuttal all reviewers took part in a discussion and two reviewers revised their score upwards, for a final assessment of 6/9/7. Overall this paper contains an exciting idea and is likely to stimulate the NeurIPS community to further consider the setting of learning only from corrupted data.



Exploring Transfer Learning in Medical Image Segmentation using Vision-Language Models

Poudel, Kanchan, Dhakal, Manish, Bhandari, Prasiddha, Adhikari, Rabin, Thapaliya, Safal, Khanal, Bishesh

arXiv.org Artificial Intelligence

Medical image segmentation with deep learning is an important and widely studied topic because segmentation enables quantifying the size and shape of target structures, which can help in disease diagnosis, prognosis, surgery planning, and understanding. Recent advances in foundation Vision-Language Models (VLMs) and their adaptation to segmentation tasks in natural images with Vision-Language Segmentation Models (VLSMs) have opened up a unique opportunity to build potentially powerful segmentation models for medical images that can take helpful information as input via language prompts, leverage the extensive range of other medical imaging datasets through pooled-dataset training, adapt to new classes, and be robust against out-of-distribution data with human-in-the-loop prompting during inference. Although transfer learning from natural to medical images has been studied for image-only segmentation models, no studies have analyzed how the joint vision-language representation transfers to medical image segmentation problems or examined the gaps in leveraging its full potential. We present the first benchmark study on transfer learning of VLSMs to 2D medical images, using 11 thoughtfully collected existing 2D medical image datasets of diverse modalities and 9 carefully designed types of language prompts derived from 14 attributes. Our results indicate that VLSMs trained on natural image-text pairs transfer reasonably to the medical domain in zero-shot settings when prompted appropriately for non-radiology photographic modalities; when finetuned, they obtain performance comparable to conventional architectures, even in X-ray and ultrasound modalities. However, the additional benefit of language prompts during finetuning may be limited, with image features playing a more dominant role; VLSMs can better handle training on pooled datasets combining diverse modalities and are potentially more robust to domain shift than conventional segmentation models.
The code and datasets are released at https://github.com/naamiinepal/med


Training Image Estimators without Image Ground Truth

Xia, Zhihao, Chakrabarti, Ayan

Neural Information Processing Systems
