An Inpainting-Infused Pipeline for Attire and Background Replacement

Perche-Mahlow, Felipe Rodrigues; Felipe-Zanella, André; Cruz-Castañeda, William Alberto; Amadeus, Marcellus

Feb-5-2024–arXiv.org Artificial Intelligence

Rapid advances in Generative Artificial Intelligence (GenAI) have transformed how we approach complex tasks across modalities such as text, audio, video, and image generation. As a broad category, GenAI excels at creating synthetic data that closely mimics real-world phenomena, which makes it useful in a wide range of creative applications. In text generation, models like OpenAI's GPT (Generative Pre-trained Transformer) [OpenAI, 2023] are changing how people write. Trained on massive text corpora, these models can track context, generate coherent paragraphs, and complete sentences consistently [Roumeliotis and Tselikas, 2023]. This ability to produce fluent, relevant text has found applications in natural language processing, content creation, and automated writing [Huang and Tan, 2023]. Audio generation models, exemplified by Tacotron [Wang et al., 2017] and WaveNet [Oord et al., 2016], have significantly advanced our ability to synthesize realistic speech. These models use deep neural networks to capture the intricacies of human speech, producing natural-sounding voices and musical compositions with nuanced variations in tone, pitch, and rhythm [Ning et al., 2019]. Image generation, the focal point of our discussion, has seen the emergence of models such as DALL-E [Betker et al., 2023; Ramesh et al., 2021], MidJourney [mid, 2022], and Stable Diffusion [Rombach et al., 2022], which can generate diverse and intricate images from textual prompts.

background, diffusion model, pipeline


Country:
- South America > Brazil
  - São Paulo (0.05)
- Europe
  - Switzerland > Zürich
    - Zürich (0.04)
  - Italy > Calabria
    - Catanzaro Province > Catanzaro (0.04)
- Africa > Middle East
  - Egypt (0.04)

Genre:
- Research Report (1.00)
- Overview > Innovation (0.48)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning > Generative AI (1.00)