contour detection
Generative AI for Industrial Contour Detection: A Language-Guided Vision System
Gong, Liang; Tommy; Wang; Chaker, Sara; Dong, Yanchen; Bousetouane, Fouad; Morton, Brenden; Mendez, Mark
In this work, we present a language-guided generative vision system for remnant contour detection in manufacturing, designed to achieve CAD-level precision. The system is organized into three phases: (1) data preprocessing, (2) contour generation using a conditional GAN, and (3) multimodal contour refinement through vision-language modeling, where standardized prompts are crafted in a human-in-the-loop process and applied through image-text guided synthesis. On proprietary FabTrack datasets, the proposed system improved contour fidelity, enhancing edge continuity and geometric alignment while reducing manual tracing. For the refinement phase, we benchmarked several VLMs, including Google's Gemini 2.0 Flash, OpenAI's GPT-image-1 integrated within a VLM-guided workflow, and open-source baselines. Under standardized conditions, GPT-image-1 consistently outperformed Gemini 2.0 Flash in both structural accuracy and perceptual quality. These findings demonstrate the promise of VLM-guided generative workflows for advancing industrial CV beyond the limitations of classical pipelines and moving contour detection closer to CAD-level precision.
Reviews: Learning Deep Structured Multi-Scale Features using Attention-Gated CRFs for Contour Prediction
This paper proposes a gating mechanism to combine features from different levels in a CNN for the task of contour detection. The paper builds upon recent advances in using graphical models with CNN architectures [5,39] and augments these with attention. The paper also presents an ablation study analyzing the impact of different parts of the architecture. Cons: 1. Unclear relationship to past works which use CRFs with CNNs [5,39] and other works such as [A,B] which express CRF inference as CNNs. The paper says it is inspired by [5,39] but does not describe the points of difference from [5,39].
Learning Deep Structured Multi-Scale Features using Attention-Gated CRFs for Contour Prediction
Dan Xu, Wanli Ouyang, Xavier Alameda-Pineda, Elisa Ricci, Xiaogang Wang, Nicu Sebe
Recent works have shown that exploiting multi-scale representations deeply learned via convolutional neural networks (CNN) is of tremendous importance for accurate contour detection. This paper presents a novel approach for predicting contours which advances the state of the art in two fundamental aspects, i.e. multi-scale feature generation and fusion. Different from previous works directly considering multi-scale feature maps obtained from the inner layers of a primary CNN architecture, we introduce a hierarchical deep model which produces richer and more complementary representations. Furthermore, to refine and robustly fuse the representations learned at different scales, the novel Attention-Gated Conditional Random Fields (AG-CRFs) are proposed. Experiments run on two publicly available datasets (BSDS500 and NYUDv2) demonstrate the effectiveness of the latent AG-CRF model and of the overall hierarchical framework.
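The idea of gating features from different scales before fusing them can be illustrated with a toy example. The sketch below is not the AG-CRF model from the paper (there is no CRF inference and no learned parameters here); it only shows a sigmoid-gated weighted average at a single pixel. The function name `gated_fusion` and the gate logits are hypothetical, chosen for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping a real-valued score to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(features, gate_logits):
    """Fuse per-scale feature values at one pixel with sigmoid gates.

    features    -- one response per scale (hypothetical values)
    gate_logits -- one attention score per scale (hypothetical values)
    Returns the gate-weighted average of the features.
    """
    if len(features) != len(gate_logits):
        raise ValueError("need one gate logit per scale")
    gates = [sigmoid(g) for g in gate_logits]
    total = sum(gates)
    return sum(g * f for g, f in zip(gates, features)) / total


# Equal gates reduce to a plain average of the two scales.
print(gated_fusion([1.0, 3.0], [0.0, 0.0]))
# A strongly open first gate makes the fused value follow the first scale.
print(gated_fusion([1.0, 3.0], [10.0, -10.0]))
```

In a real network the gate scores would be predicted from the feature maps themselves and applied at every pixel; here they are fixed by hand to keep the arithmetic visible.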
Discriminatively Trained Sparse Code Gradients for Contour Detection
Finding contours in natural images is a fundamental problem that serves as the basis of many tasks such as image segmentation and object recognition. At the core of contour detection technologies are a set of hand-designed gradient features, used by most approaches including the state-of-the-art Global Pb (gPb) operator. In this work, we show that contour detection accuracy can be significantly improved by computing Sparse Code Gradients (SCG), which measure contrast using patch representations automatically learned through sparse coding. We use K-SVD for dictionary learning and Orthogonal Matching Pursuit for computing sparse codes on oriented local neighborhoods, and apply multi-scale pooling and power transforms before classifying them with linear SVMs. By extracting rich representations from pixels and avoiding collapsing them prematurely, Sparse Code Gradients effectively learn how to measure local contrasts and find contours. We improve the F-measure metric on the BSDS500 benchmark to 0.74 (up from 0.71 of gPb contours). Moreover, our learning approach can easily adapt to novel sensor data such as Kinect-style RGB-D cameras: Sparse Code Gradients on depth maps and surface normals lead to promising contour detection using depth and depth+color, as verified on the NYU Depth Dataset.
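The F-measure quoted above is the harmonic mean of precision and recall over boundary pixels. The sketch below shows only that arithmetic, assuming true-positive, false-positive, and false-negative counts are already available; the BSDS500 benchmark's procedure for matching predicted pixels to ground-truth boundaries is not reproduced, and the counts used are hypothetical.

```python
def precision_recall(tp: int, fp: int, fn: int):
    """Precision and recall from matched boundary-pixel counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall: 2PR / (P + R)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 74 correct boundary pixels, 26 spurious, 26 missed.
p, r = precision_recall(tp=74, fp=26, fn=26)
print(round(f_measure(p, r), 2))
```

Because it is a harmonic mean, the score is pulled toward the lower of the two values, so a detector cannot score well by trading all recall for precision or vice versa.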
Discriminatively Trained Sparse Code Gradients for Contour Detection
Finding contours in natural images is a fundamental problem that serves as the basis of many tasks such as image segmentation and object recognition. At the core of contour detection technologies are a set of hand-designed gradient features, used by most existing approaches including the state-of-the-art Global Pb (gPb) operator. In this work, we show that contour detection accuracy can be significantly improved by computing Sparse Code Gradients (SCG), which measure contrast using patch representations automatically learned through sparse coding. We use K-SVD and Orthogonal Matching Pursuit for efficient dictionary learning and encoding, and use multi-scale pooling and power transforms to code oriented local neighborhoods before computing gradients and applying linear SVM. By extracting rich representations from pixels and avoiding collapsing them prematurely, Sparse Code Gradients effectively learn how to measure local contrasts and find contours.
Contour Detection using OpenCV (Python/C++)
Knowing where an object is in an image is called localization in computer vision. Using contour detection, we can detect the borders of objects and therefore localize them easily. Contour detection is often the very first step for many interesting applications, such as image foreground extraction, simple image segmentation, detection, and recognition. The official OpenCV documentation says: "The contours are a useful tool for shape analysis and object detection and recognition." In this post, we are going to learn about contours and contour detection using OpenCV. Beyond the theory, we will also cover complete hands-on code in both Python and C++ to get first-hand experience of contour detection using OpenCV. You can build some really cool applications using contour detection and OpenCV. When we join all the points on the boundary of an object, we get a contour.
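To make "the points on the boundary of an object" concrete, here is a minimal dependency-free sketch that collects the boundary pixels of a binary mask. It is not OpenCV's algorithm: in practice `cv2.findContours` does this work and returns ordered contours, whereas the hypothetical helper `boundary_pixels` below only returns an unordered list of foreground pixels that touch the background or the image edge.

```python
def boundary_pixels(mask):
    """Return (row, col) of foreground pixels on the object boundary.

    mask is a list of rows of 0/1 values. A foreground pixel counts as
    boundary if any of its 4-connected neighbours is background or lies
    outside the image.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    boundary = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue  # background pixel, skip
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                outside = not (0 <= nr < rows and 0 <= nc < cols)
                if outside or not mask[nr][nc]:
                    boundary.append((r, c))
                    break  # one background neighbour is enough
    return boundary


# A 3x3 square inside a 5x5 image: every square pixel except the centre
# lies on the boundary.
square = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(boundary_pixels(square))
```

Joining these points in order around the object would give the contour; the ordering step is what a full contour-tracing routine adds on top of this check.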
Recurrent neural circuits for contour detection
Linsley, Drew, Kim, Junkyung, Ashok, Alekh, Serre, Thomas
We introduce a deep recurrent neural network architecture that approximates visual cortical circuits (Mély et al., 2018). We show that this architecture, which we refer to as the γ-Net, learns to solve contour detection tasks with better sample efficiency than state-of-the-art feedforward networks, while also exhibiting a classic perceptual illusion, known as the orientation-tilt illusion. Correcting this illusion significantly reduces γ-Net contour detection accuracy by driving it to prefer low-level edges over high-level object boundary contours. Overall, our study suggests that the orientation-tilt illusion is a byproduct of neural circuits that help biological visual systems achieve robust and efficient contour detection, and that incorporating these circuits in artificial neural networks can improve computer vision. An open debate since the inception of vision science concerns why we experience visual illusions. Consider the class of "contextual" illusions, where the perceived qualities of an image region, such as its orientation or color, are biased by the qualities of surrounding image regions. A well-studied contextual illusion is the orientation-tilt illusion depicted in Figure 1a, where perception of the central grating's orientation is influenced by the orientation of the surrounding grating (O'Toole & Wenderoth, 1977). When the two orientations are similar, the central grating appears tilted slightly away from the surround (Figure 1a, top). When the two orientations are dissimilar, the central grating appears tilted slightly towards the surround (Figure 1a, bottom). Is the contextual bias of the orientation-tilt illusion a bug of biology or a byproduct of optimized neural computations? Over the past 50 years, there have been a number of neural circuit mechanisms proposed to explain individual contextual illusions (reviewed in Mély et al., 2018). Recently, Mély et al. (2018) proposed a cortical circuit, constrained by physiology of primate visual cortex (V1), that offers a unified explanation for contextual illusions across visual domains - from the orientation-tilt illusion to color induction. These illusions arise in the circuit from recurrent interactions between neural populations with receptive fields that tile visual space, leading to contextual (center/surround) effects.
Face Detection with Firebase ML Kit on Android
With ML Kit's face detection API, you can detect faces in an image, identify key facial features, and get the contours of detected faces. In this post, you'll learn face detection with Firebase ML Kit on Android. With face detection, you can get the information you need to perform tasks like embellishing selfies and portraits, or generating avatars from a user's photo. Because ML Kit can perform face detection in real time, you can use it in applications like video chat or games that respond to the player's expressions. See the ML Kit quickstart sample on GitHub for an example of this API in use. To do so, add the following declaration to your app's AndroidManifest.xml