KnobGen: Controlling the Sophistication of Artwork in Sketch-Based Diffusion Models
Navard, Pouyan, Monsefi, Amin Karimi, Zhou, Mengxi, Chao, Wei-Lun, Yilmaz, Alper, Ramnath, Rajiv
Recent advances in diffusion models have significantly improved text-to-image (T2I) generation, but they often struggle to balance fine-grained precision with high-level control. Methods like ControlNet and T2I-Adapter excel at following sketches by seasoned artists but tend to be overly rigid, replicating unintentional flaws in sketches from novice users. Meanwhile, coarse-grained methods, such as sketch-based abstraction frameworks, offer more accessible input handling but lack the precise control needed for detailed, professional use. To address these limitations, we propose KnobGen, a dual-pathway framework that democratizes sketch-based image generation by seamlessly adapting to varying levels of sketch complexity and user skill. KnobGen uses a Coarse-Grained Controller (CGC) module for high-level semantics and a Fine-Grained Controller (FGC) module for detailed refinement. The relative strength of these two modules can be adjusted through our knob inference mechanism to align with the user's specific needs. These mechanisms ensure that KnobGen can flexibly generate images from both novice sketches and those drawn by seasoned artists. This maintains control over the final output while preserving the natural appearance of the image, as evidenced on the MultiGen-20M dataset and a newly collected sketch dataset.
- North America > United States > Ohio (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
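As a concrete illustration of the knob mechanism the KnobGen abstract describes, the sketch below blends a coarse-grained and a fine-grained conditioning pathway with a single user-set scalar. The module stubs and the linear blending rule are assumptions for illustration, not KnobGen's actual implementation.

```python
# Minimal sketch of a knob-style blend between two conditioning pathways.
# Illustrative only: module stubs and the blending rule are assumptions.
import torch
import torch.nn as nn

class KnobConditioner(nn.Module):
    def __init__(self, cgc: nn.Module, fgc: nn.Module):
        super().__init__()
        self.cgc = cgc  # coarse pathway: high-level semantics
        self.fgc = fgc  # fine pathway: detailed refinement

    def forward(self, sketch: torch.Tensor, knob: float) -> torch.Tensor:
        """knob in [0, 1]: 0 -> trust coarse semantics (novice sketch),
        1 -> follow fine detail strictly (professional sketch)."""
        coarse = self.cgc(sketch)
        fine = self.fgc(sketch)
        return (1.0 - knob) * coarse + knob * fine

# toy usage: both pathways stand in as small conv nets
pathway = lambda: nn.Conv2d(1, 8, kernel_size=3, padding=1)
cond = KnobConditioner(pathway(), pathway())
feat = cond(torch.randn(1, 1, 64, 64), knob=0.3)
print(feat.shape)  # torch.Size([1, 8, 64, 64])
```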
VisioBlend: Sketch and Stroke-Guided Denoising Diffusion Probabilistic Model for Realistic Image Generation
Devmurari, Harshkumar, Kuckian, Gautham, Vishwakarma, Prajjwal, Vartak, Krunali
Generating images from hand-drawn sketches is a crucial and fundamental task in content creation. The translation is challenging due to the infinite possibilities and the diverse expectations of users, and traditional methods are often limited by the availability of training data. Therefore, VisioBlend, a unified diffusion-based framework supporting three-dimensional control over image synthesis from sketches and strokes, is proposed. It enables users to decide the level of faithfulness to the input strokes and sketches. VisioBlend achieves state-of-the-art performance in terms of realism and flexibility, enabling various applications in image synthesis from sketches and strokes. It solves the problem of data availability by synthesizing new data points from hand-drawn sketches and strokes, enriching the dataset and enabling more robust and diverse image synthesis. This work showcases the power of diffusion models in image creation, offering a user-friendly and versatile approach for turning artistic visions into reality.
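One common way to realize the faithfulness control the VisioBlend abstract describes is partial noising: the guide image is noised up to an intermediate timestep and denoising starts from there. The sketch below is a minimal illustration of that idea under a standard DDPM schedule; the start-step rule is an assumption, not VisioBlend's actual procedure.

```python
# Faithfulness control via partial noising (illustrative assumption).
import torch

def noised_start(guide: torch.Tensor, faithfulness: float, num_steps: int = 1000):
    """High faithfulness -> little injected noise (stay close to the guide);
    low faithfulness -> start near pure noise (more freedom)."""
    t0 = int((1.0 - faithfulness) * (num_steps - 1))
    betas = torch.linspace(1e-4, 0.02, num_steps)   # simple DDPM-style schedule
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)[t0]
    noise = torch.randn_like(guide)
    x_t0 = alpha_bar.sqrt() * guide + (1.0 - alpha_bar).sqrt() * noise
    return x_t0, t0  # denoise from t0 down to 0 with any trained model

x, t = noised_start(torch.randn(1, 3, 64, 64), faithfulness=0.7)
print(x.shape, t)
```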
U-Sketch: An Efficient Approach for Sketch to Image Diffusion Models
Mitsouras, Ilias, Tsonis, Eleftherios, Tzouveli, Paraskevi, Voulodimos, Athanasios
Diffusion models have demonstrated remarkable performance in text-to-image synthesis, producing realistic and high-resolution images that faithfully adhere to the corresponding text prompts. Despite their great success, they still fall behind in sketch-to-image synthesis tasks, where, in addition to text prompts, the spatial layout of the generated images has to closely follow the outlines of certain reference sketches. Employing an MLP latent edge predictor to guide the spatial layout of the synthesized image by predicting edge maps at each denoising step has recently been proposed. Despite yielding promising results, the pixel-wise operation of the MLP does not take the spatial layout into account as a whole, and it demands numerous denoising iterations to produce satisfactory images, leading to time inefficiency. To this end, we introduce U-Sketch, a framework featuring a U-Net type latent edge predictor capable of efficiently capturing both local and global features, as well as spatial correlations between pixels. Moreover, we propose the addition of a sketch simplification network that offers the user the choice of preprocessing and simplifying input sketches for enhanced outputs. The experimental results, corroborated by user feedback, demonstrate that our proposed U-Net latent edge predictor leads to more realistic results that are better aligned with the spatial outlines of the reference sketches, while drastically reducing the number of required denoising steps and, consequently, the overall execution time.
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.75)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
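A minimal sketch of edge-map guidance as the U-Sketch abstract outlines it: a latent edge predictor scores the current latent against the reference sketch's edges, and the gradient of that loss nudges the latent at each denoising step. The stand-in convolutional predictor, the MSE loss, and the guidance scale are illustrative assumptions; the paper's predictor is a U-Net.

```python
# Edge-guided denoising step (hedged sketch; predictor is a stand-in stub).
import torch
import torch.nn.functional as F

def edge_guidance_step(latent, ref_edges, edge_predictor, scale=1.0):
    latent = latent.detach().requires_grad_(True)
    pred_edges = edge_predictor(latent)          # predict edges from latent
    loss = F.mse_loss(pred_edges, ref_edges)     # match the reference outline
    grad = torch.autograd.grad(loss, latent)[0]
    return latent.detach() - scale * grad        # steer latent toward the sketch

# stand-in predictor: a single conv; the real one is a small U-Net
predictor = torch.nn.Conv2d(4, 1, kernel_size=3, padding=1)
z = edge_guidance_step(torch.randn(1, 4, 32, 32),
                       torch.randn(1, 1, 32, 32), predictor, scale=0.5)
print(z.shape)  # torch.Size([1, 4, 32, 32])
```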
Breathing Life Into Sketches Using Text-to-Video Priors
Gal, Rinon, Vinker, Yael, Alaluf, Yuval, Bermano, Amit H., Cohen-Or, Daniel, Shamir, Ariel, Chechik, Gal
A sketch is one of the most intuitive and versatile tools humans use to convey their ideas visually. An animated sketch opens another dimension to the expression of ideas and is widely used by designers for a variety of purposes. Animating sketches is a laborious process, requiring extensive experience and professional design skills. In this work, we present a method that automatically adds motion to a single-subject sketch (hence, "breathing life into it"), merely by providing a text prompt indicating the desired motion. The output is a short animation provided in vector representation, which can be easily edited. Our method does not require extensive training, but instead leverages the motion prior of a large pretrained text-to-video diffusion model using a score-distillation loss to guide the placement of strokes. To promote natural and smooth motion and to better preserve the sketch's appearance, we model the learned motion through two components. The first governs small local deformations and the second controls global affine transformations. Surprisingly, we find that even models that struggle to generate sketch videos on their own can still serve as a useful backbone for animating abstract representations.
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
- Asia > Indonesia > Sulawesi (0.04)
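The two-component motion model in the abstract above can be pictured as a small learned displacement field composed with a per-frame affine transform over the sketch's stroke control points. The sketch below shows that decomposition only; parameter shapes are assumptions, and the actual method optimizes these parameters with a score-distillation loss from a pretrained text-to-video model.

```python
# Local-deformation + global-affine motion decomposition (illustrative shapes).
import torch

def animate_points(points, local_offsets, affine):
    """points: (F, N, 2) stroke control points per frame
    local_offsets: (F, N, 2) learned small deformations
    affine: (F, 2, 3) learned per-frame affine (rotation/scale/translation)."""
    ones = torch.ones(*points.shape[:2], 1)
    homog = torch.cat([points + local_offsets, ones], dim=-1)  # (F, N, 3)
    return torch.einsum('fij,fnj->fni', affine, homog)         # (F, N, 2)

F_, N = 8, 16
base = torch.randn(1, N, 2).repeat(F_, 1, 1)        # static sketch points
local = 0.01 * torch.randn(F_, N, 2, requires_grad=True)
aff = torch.eye(2, 3).repeat(F_, 1, 1).requires_grad_(True)
frames = animate_points(base, local, aff)           # would feed an SDS loss
print(frames.shape)  # torch.Size([8, 16, 2])
```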
SENS: Sketch-based Implicit Neural Shape Modeling
Binninger, Alexandre, Hertz, Amir, Sorkine-Hornung, Olga, Cohen-Or, Daniel, Giryes, Raja
We present SENS, a novel method for generating and editing 3D models from hand-drawn sketches, including those of an abstract nature. Our method allows users to quickly and easily sketch a shape, and then maps the sketch into the latent space of a part-aware neural implicit shape architecture. SENS analyzes the sketch and encodes its parts into a ViT patch encoding, then feeds them into a transformer decoder that converts them into shape embeddings suitable for editing 3D neural implicit shapes. SENS not only provides intuitive sketch-based generation and editing, but also excels in capturing the intent of the user's sketch to generate a variety of novel and expressive 3D shapes, even from abstract sketches. We demonstrate the effectiveness of our model compared to the state of the art using objective metric evaluation criteria and a decisive user study, both indicating strong performance on sketches with a medium level of abstraction. Furthermore, we showcase its intuitive sketch-based shape editing capabilities.
- Europe > Switzerland > Zürich > Zürich (0.14)
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
- North America > United States > New York > New York County > New York City (0.04)
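A minimal sketch of the encode-then-decode pipeline the SENS abstract describes: sketch patches pass through a ViT-style encoder, and a transformer decoder turns learned part queries into shape embeddings. All sizes, the query mechanism, and the use of PyTorch's stock transformer layers are illustrative assumptions, not the SENS architecture.

```python
# ViT-style patch encoder + transformer decoder -> part embeddings (assumed sizes).
import torch
import torch.nn as nn

class SketchToShapeEmbeddings(nn.Module):
    def __init__(self, d=256, num_parts=8, patch=16):
        super().__init__()
        self.patchify = nn.Conv2d(1, d, kernel_size=patch, stride=patch)
        enc_layer = nn.TransformerEncoderLayer(d, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=4)
        dec_layer = nn.TransformerDecoderLayer(d, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=4)
        self.part_queries = nn.Parameter(torch.randn(num_parts, d))

    def forward(self, sketch):
        tokens = self.patchify(sketch).flatten(2).transpose(1, 2)  # (B, P, d)
        memory = self.encoder(tokens)
        queries = self.part_queries.expand(sketch.size(0), -1, -1)
        return self.decoder(queries, memory)  # (B, num_parts, d) embeddings

model = SketchToShapeEmbeddings()
emb = model(torch.randn(2, 1, 128, 128))
print(emb.shape)  # torch.Size([2, 8, 256])
```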
DiffSketching: Sketch Control Image Synthesis with Diffusion Models
Wang, Qiang, Kong, Di, Lin, Fengyin, Qi, Yonggang
Creative sketching is a universal way of visual expression, but translating an abstract sketch into a realistic image is very challenging. Traditionally, a deep learning model for sketch-to-image synthesis needs to overcome distorted input sketches that lack visual detail, and requires large-scale sketch-image datasets. We study this task using diffusion models. Our model matches sketches through cross-domain constraints and uses a classifier to guide the image synthesis more accurately. Extensive experiments confirm that our method is not only faithful to the user's input sketches, but also maintains the diversity and imagination of the synthesized results. Our model beats GAN-based methods in terms of generation quality and human evaluation, and does not rely on massive sketch-image datasets. Additionally, we present applications of our method in image editing and interpolation.
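The classifier guidance mentioned in the abstract is commonly implemented by shifting the predicted sampling mean along the gradient of the classifier's log-probability (as in Dhariwal and Nichol's formulation). The sketch below shows that generic mechanism with a stub classifier; it is not DiffSketching's exact formulation and omits its cross-domain constraints.

```python
# Generic classifier-guided mean shift (illustrative stand-in, stub classifier).
import torch
import torch.nn.functional as F

def classifier_guided_mean(mean, variance, x_t, classifier, target, scale=2.0):
    x_in = x_t.detach().requires_grad_(True)
    logits = classifier(x_in)                          # class scores for x_t
    idx = torch.arange(len(target))
    log_prob = F.log_softmax(logits, dim=-1)[idx, target]
    grad = torch.autograd.grad(log_prob.sum(), x_in)[0]
    return mean + scale * variance * grad              # shifted sampling mean

clf = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x = torch.randn(2, 3, 32, 32)
mu = classifier_guided_mean(torch.zeros_like(x), 0.1, x,
                            classifier=clf, target=torch.tensor([3, 7]))
print(mu.shape)  # torch.Size([2, 3, 32, 32])
```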
DiffFaceSketch: High-Fidelity Face Image Synthesis with Sketch-Guided Latent Diffusion Model
Peng, Yichen, Zhao, Chunqi, Xie, Haoran, Fukusato, Tsukasa, Miyata, Kazunori
Synthesizing face images from monochrome sketches is one of the most fundamental tasks in the field of image-to-image translation. However, it is still challenging to (1) make models learn high-dimensional face features such as geometry and color, and (2) take into account the characteristics of input sketches. Existing methods often use sketches as indirect (or auxiliary) inputs to guide the models, resulting in the loss of sketch features or the alteration of geometry information. In this paper, we introduce a Sketch-Guided Latent Diffusion Model (SGLDM), an LDM-based network architecture trained on a paired sketch-face dataset. We apply a Multi-Auto-Encoder (AE) to encode input sketches of different face regions from pixel space into a feature map in latent space, which enables us to reduce the dimension of the sketch input while preserving the geometry-related information of local face details. We build a sketch-face paired dataset based on an existing method that extracts edge maps from images. We then introduce Stochastic Region Abstraction (SRA), an approach that augments our dataset to improve the robustness of SGLDM to sketch inputs with arbitrary levels of abstraction. The evaluation study shows that SGLDM can synthesize high-quality face images with different expressions, facial accessories, and hairstyles from sketches at various abstraction levels.
- North America > United States > Washington > King County > Seattle (0.05)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.05)
- North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
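A minimal sketch of the Multi-Auto-Encoder idea from the abstract: separate lightweight encoders compress sketch crops of different face regions, and their codes are written back into a single latent feature map that preserves spatial layout. The region boxes, sizes, and encoder design here are assumptions for illustration.

```python
# Region-wise sketch encoders composed into one latent map (assumed layout).
import torch
import torch.nn as nn

REGIONS = {"eyes": (0, 0, 32, 64), "nose": (24, 16, 56, 48),
           "mouth": (48, 16, 80, 48)}  # (top, left, bottom, right), assumed

class MultiAESketchEncoder(nn.Module):
    def __init__(self, ch=4):
        super().__init__()
        # one lightweight encoder per region, 4x spatial downsampling
        self.encoders = nn.ModuleDict({
            name: nn.Conv2d(1, ch, kernel_size=4, stride=4) for name in REGIONS
        })
        self.ch = ch

    def forward(self, sketch):  # sketch: (B, 1, 96, 96)
        latent = torch.zeros(sketch.size(0), self.ch, 24, 24)  # downsampled canvas
        for name, (t, l, b, r) in REGIONS.items():
            crop = sketch[:, :, t:b, l:r]
            latent[:, :, t // 4:b // 4, l // 4:r // 4] = self.encoders[name](crop)
        return latent  # geometry-preserving sketch condition for the LDM

enc = MultiAESketchEncoder()
z = enc(torch.randn(2, 1, 96, 96))
print(z.shape)  # torch.Size([2, 4, 24, 24])
```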
MaskSketch: Unpaired Structure-guided Masked Image Generation
Bashkirova, Dina, Lezama, Jose, Sohn, Kihyuk, Saenko, Kate, Essa, Irfan
Recent conditional image generation methods produce images of remarkable diversity, fidelity and realism. However, the majority of these methods allow conditioning only on labels or text prompts, which limits their level of control over the generation result. In this paper, we introduce MaskSketch, an image generation method that allows spatial conditioning of the generation result using a guiding sketch as an extra conditioning signal during sampling. MaskSketch utilizes a pre-trained masked generative transformer, requiring no model training or paired supervision, and works with input sketches of different levels of abstraction. We show that intermediate self-attention maps of a masked generative transformer encode important structural information of the input image, such as scene layout and object shape, and we propose a novel sampling method based on this observation to enable structure-guided generation. Our results show that MaskSketch achieves high image realism and fidelity to the guiding structure. Evaluated on standard benchmark datasets, MaskSketch outperforms state-of-the-art methods for sketch-to-image translation, as well as unpaired image-to-image translation approaches.
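The structural signal the MaskSketch abstract points to can be pictured as a distance between self-attention maps: candidates whose attention maps sit closest to the sketch's are structurally most similar. The sketch below shows that comparison in isolation, as a simplified stand-in for the paper's actual sampling procedure.

```python
# Structure distance between self-attention maps (simplified stand-in).
import torch

def structure_distance(attn_a, attn_b):
    """attn_*: (L, H, N, N) self-attention maps over L layers, H heads,
    N tokens. Lower distance = more similar spatial structure."""
    return (attn_a - attn_b).abs().mean()

def pick_most_structured(sketch_attn, candidate_attns):
    dists = torch.stack([structure_distance(sketch_attn, c)
                         for c in candidate_attns])
    return int(dists.argmin())  # index of the best-matching candidate

L, H, N = 2, 4, 16
sketch_attn = torch.softmax(torch.randn(L, H, N, N), dim=-1)
cands = [torch.softmax(torch.randn(L, H, N, N), dim=-1) for _ in range(5)]
print(pick_most_structured(sketch_attn, cands))
```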
TreeSketchNet: From Sketch To 3D Tree Parameters Generation
Manfredi, Gilda, Capece, Nicola, Erra, Ugo, Gruosso, Monica
3D modelling of non-linear objects from stylized sketches is a challenge even for experts in Computer Graphics (CG). Extrapolating object parameters from a stylized sketch is a very complex and cumbersome task. In the present study, we propose a broker system that mediates between the modeler and the 3D modelling software and can transform a stylized sketch of a tree into a complete 3D model. The input sketches do not need to be accurate or detailed; they only need to represent a rudimentary outline of the tree that the modeler wishes to 3D-model. Our approach is based on a well-defined, convolution-based Deep Neural Network (DNN) architecture, which we call TreeSketchNet (TSN), able to generate Weber-Penn parameters that can be interpreted by the modelling software to generate a 3D model of a tree starting from a simple sketch. The training dataset consists of Synthetically-Generated (SG) sketches that are associated with Weber-Penn parameters generated by a dedicated Blender modelling software add-on. The accuracy of the proposed method is demonstrated by testing the TSN with both synthetic and hand-made sketches. Finally, we provide a qualitative analysis of our results by evaluating the coherence of the predicted parameters with several distinguishing features.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
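At a high level, the abstract describes a convolutional network regressing a sketch to a vector of Weber-Penn parameters that the modelling software then consumes. The minimal regressor below illustrates that mapping; the layer sizes and parameter count are assumptions, and neither the real TSN nor the Blender add-on is reproduced here.

```python
# Sketch -> Weber-Penn parameter vector, as a minimal CNN regressor (assumed sizes).
import torch
import torch.nn as nn

class SketchToTreeParams(nn.Module):
    def __init__(self, num_params=24):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, num_params)  # Weber-Penn parameter vector

    def forward(self, sketch):
        return self.head(self.features(sketch))

net = SketchToTreeParams()
params = net(torch.randn(1, 1, 256, 256))
print(params.shape)  # torch.Size([1, 24]) -> handed to the modelling software
```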
A Sketch Is Worth a Thousand Words: Image Retrieval with Text and Sketch
Sangkloy, Patsorn, Jitkrittum, Wittawat, Yang, Diyi, Hays, James
We address the problem of retrieving images with both a sketch and a text query. We present TASK-former (Text And SKetch transformer), an end-to-end trainable model for image retrieval using a text description and a sketch as input. We argue that both input modalities complement each other in a manner that cannot be achieved easily by either one alone. TASK-former follows the late-fusion dual-encoder approach, similar to CLIP, which allows efficient and scalable retrieval since the retrieval set can be indexed independently of the queries. We empirically demonstrate that using an input sketch (even a poorly drawn one) in addition to text considerably increases retrieval recall compared to traditional text-based image retrieval. To evaluate our approach, we collect 5,000 hand-drawn sketches for images in the test set of the COCO dataset. The collected sketches are available at https://janesjanes.github.io/tsbir/.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- Asia > Taiwan > Taiwan Province > Taipei (0.04)
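A minimal sketch of late-fusion retrieval in the spirit of the TASK-former abstract: text and sketch embeddings are fused into one query and scored against a pre-indexed gallery of image embeddings, which is what makes retrieval efficient and scalable. The additive fusion rule and the random stand-in embeddings are assumptions, not TASK-former's architecture.

```python
# Late-fusion dual-encoder retrieval (fusion rule and embeddings are stand-ins).
import torch
import torch.nn.functional as F

def retrieve(text_emb, sketch_emb, gallery):
    """text_emb, sketch_emb: (d,); gallery: (M, d) precomputed image embeddings."""
    query = F.normalize(text_emb + sketch_emb, dim=-1)   # late fusion of modalities
    gallery = F.normalize(gallery, dim=-1)
    scores = gallery @ query                             # cosine similarity
    return scores.argsort(descending=True)               # ranked image indices

d, M = 128, 1000
ranking = retrieve(torch.randn(d), torch.randn(d), torch.randn(M, d))
print(ranking[:5])  # top-5 retrieved images
```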