Arar, Moab
PALP: Prompt Aligned Personalization of Text-to-Image Models
Arar, Moab, Voynov, Andrey, Hertz, Amir, Avrahami, Omri, Fruchter, Shlomi, Pritch, Yael, Cohen-Or, Daniel, Shamir, Ariel
Content creators often aim to create personalized images using personal subjects that go beyond the capabilities of conventional text-to-image models. Additionally, they may want the resulting image to encompass a specific location, style, ambiance, and more. Existing personalization methods may compromise personalization ability or alignment with complex textual prompts. This trade-off can impede the fulfillment of user prompts and subject fidelity. To address this issue, we propose a new approach that focuses personalization on a \emph{single} prompt, which we term prompt-aligned personalization. While this may seem restrictive, our method excels at improving text alignment, enabling the creation of images with complex and intricate prompts that may pose a challenge for current techniques. In particular, our method keeps the personalized model aligned with a target prompt using an additional score distillation sampling term. We demonstrate the versatility of our method in multi- and single-shot settings and further show that it can compose multiple subjects or draw inspiration from reference images, such as artworks. We compare our approach quantitatively and qualitatively against existing baselines and state-of-the-art techniques.
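The abstract does not describe the alignment term concretely; the following is a minimal PyTorch sketch of how a score-distillation-style prompt-alignment loss might be combined with a standard personalization loss. The denoiser `eps_model`, the embeddings, and the loss structure are placeholders for illustration, not the paper's implementation.

```python
import torch

# Placeholder denoiser: in practice this would be a pretrained text-conditioned
# UNet from a latent diffusion model; here it is a toy stand-in.
def eps_model(x_t, t, prompt_emb):
    return torch.randn_like(x_t)

def palp_style_losses(x0, t, alphas_cumprod, subject_emb, target_prompt_emb):
    """Sketch: combine a personalization (denoising) loss with a
    score-distillation-style term that keeps predictions aligned with a
    single target prompt. All names here are illustrative assumptions."""
    a_t = alphas_cumprod[t].view(-1, 1, 1, 1)
    noise = torch.randn_like(x0)
    x_t = a_t.sqrt() * x0 + (1 - a_t).sqrt() * noise

    # Standard personalization objective: reconstruct the injected noise
    # for images of the personal subject.
    personalization_loss = ((eps_model(x_t, t, subject_emb) - noise) ** 2).mean()

    # Score-distillation-style alignment term: nudge the personalized
    # prediction toward the (frozen) target-prompt prediction without
    # back-propagating through the frozen branch.
    with torch.no_grad():
        eps_target = eps_model(x_t, t, target_prompt_emb)
    eps_pred = eps_model(x_t, t, subject_emb)
    alignment_loss = ((eps_pred - eps_target) ** 2).mean()

    return personalization_loss, alignment_loss
```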
AnyLens: A Generative Diffusion Model with Any Rendering Lens
Voynov, Andrey, Hertz, Amir, Arar, Moab, Fruchter, Shlomi, Cohen-Or, Daniel
State-of-the-art diffusion models can generate highly realistic images based on various conditioning signals such as text, segmentation, and depth. However, an essential aspect is often overlooked: the specific camera geometry used during image capture and the influence of different optical systems on the final scene appearance. This study introduces a framework that tightly integrates a text-to-image diffusion model with the particular lens geometry used in image rendering. Our approach is based on per-pixel coordinate conditioning, which enables control over the rendering geometry. Notably, we demonstrate the manipulation of curvature properties, achieving diverse visual effects such as fish-eye, panoramic views, and spherical texturing using a single diffusion model.
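As an illustration of per-pixel coordinate conditioning, the sketch below builds a coordinate map warped by a simple equidistant fisheye-style mapping and concatenates it to a latent tensor as extra channels. The lens parameterization and the way the map enters the network are assumptions made for illustration, not the paper's design.

```python
import torch

def fisheye_coordinate_map(height, width, strength=1.0):
    """Build a per-pixel coordinate map (2, H, W) in [-1, 1], warped by a
    simple equidistant fisheye-style mapping. Illustrative only: the actual
    lens parameterization is not specified in the abstract."""
    ys = torch.linspace(-1.0, 1.0, height)
    xs = torch.linspace(-1.0, 1.0, width)
    yy, xx = torch.meshgrid(ys, xs, indexing="ij")

    r = torch.sqrt(xx ** 2 + yy ** 2).clamp(min=1e-6)
    theta = torch.atan(r * strength)                 # angle from optical axis
    r_warped = theta / torch.atan(r.max() * strength)  # keep coords in [-1, 1]

    scale = r_warped / r
    coords = torch.stack([xx * scale, yy * scale], dim=0)  # (2, H, W)
    return coords

# Usage sketch: concatenate the coordinate map to the model input as extra
# channels, so the denoiser sees the lens geometry at every pixel.
coords = fisheye_coordinate_map(64, 64, strength=2.0)
latents = torch.randn(1, 4, 64, 64)
conditioned_input = torch.cat([latents, coords.unsqueeze(0)], dim=1)  # (1, 6, 64, 64)
```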
The Chosen One: Consistent Characters in Text-to-Image Diffusion Models
Avrahami, Omri, Hertz, Amir, Vinker, Yael, Arar, Moab, Fruchter, Shlomi, Fried, Ohad, Cohen-Or, Daniel, Lischinski, Dani
Recent advances in text-to-image generation models have unlocked vast potential for visual creativity. However, these models struggle to generate consistent characters, a crucial aspect for numerous real-world applications such as story visualization, game development asset design, advertising, and more. Current methods typically rely on multiple pre-existing images of the target character or involve labor-intensive manual processes. In this work, we propose a fully automated solution for consistent character generation, with the sole input being a text prompt. We introduce an iterative procedure that, at each stage, identifies a coherent set of images sharing a similar identity and extracts a more consistent identity from this set. Our quantitative analysis demonstrates that our method strikes a better balance between prompt alignment and identity consistency than the baseline methods, and these findings are reinforced by a user study. Finally, we showcase several practical applications of our approach. The project page is available at https://omriavrahami.com/the-chosen-one
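The loop below is a rough sketch of the iterative procedure described in the abstract: generate a batch of candidates, group them by identity, keep the most cohesive group, and re-personalize on it. The generation, identity-embedding, and personalization hooks are placeholders, and k-means is one possible grouping choice rather than the paper's.

```python
import numpy as np
from sklearn.cluster import KMeans

# Placeholder hooks: in practice these would call a text-to-image model,
# an identity/feature encoder, and a personalization (fine-tuning) routine.
def generate_images(model, prompt, n):
    return [object() for _ in range(n)]

def embed_identity(images):
    return np.random.randn(len(images), 512)

def personalize(model, images):
    return model  # fine-tune on the selected set and return the updated model

def consistent_character(model, prompt, n_images=32, n_clusters=4, n_iters=5):
    """Iteratively extract a more consistent identity from generated images."""
    for _ in range(n_iters):
        images = generate_images(model, prompt, n_images)
        feats = embed_identity(images)

        # Group images by identity and keep the most cohesive cluster,
        # i.e. the one whose members are closest to their centroid.
        km = KMeans(n_clusters=n_clusters, n_init=10).fit(feats)
        dists = np.linalg.norm(feats - km.cluster_centers_[km.labels_], axis=1)
        cohesion = [dists[km.labels_ == c].mean() for c in range(n_clusters)]
        best = int(np.argmin(cohesion))
        chosen = [im for im, label in zip(images, km.labels_) if label == best]

        # Re-personalize the model on the chosen set and repeat.
        model = personalize(model, chosen)
    return model
```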
Domain-Agnostic Tuning-Encoder for Fast Personalization of Text-To-Image Models
Arar, Moab, Gal, Rinon, Atzmon, Yuval, Chechik, Gal, Cohen-Or, Daniel, Shamir, Ariel, Bermano, Amit H.
Text-to-image (T2I) personalization allows users to guide the creative image generation process by combining their own visual concepts in natural language prompts. Recently, encoder-based techniques have emerged as an effective new approach for T2I personalization, reducing the need for multiple images and long training times. However, most existing encoders are limited to a single-class domain, which hinders their ability to handle diverse concepts. In this work, we propose a domain-agnostic method that does not require any specialized dataset or prior information about the personalized concepts. We introduce a novel contrastive-based regularization technique that maintains high fidelity to the target concept's characteristics while keeping the predicted embeddings close to editable regions of the latent space, by pushing the predicted tokens toward their nearest existing CLIP tokens. Our experimental results demonstrate the effectiveness of our approach and show that the learned tokens are more semantically meaningful than those predicted by unregularized models. This leads to a better representation that achieves state-of-the-art performance while being more flexible than previous methods.
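A minimal sketch of the nearest-token regularization idea: pull each predicted embedding toward its closest entry in an existing token-embedding table via an InfoNCE-style objective. The vocabulary matrix, temperature, and exact loss form are assumptions; the abstract only states that predicted tokens are pushed toward their nearest CLIP tokens.

```python
import torch
import torch.nn.functional as F

def nearest_token_regularizer(pred_emb, vocab_emb, temperature=0.07):
    """Sketch of a regularizer that pulls predicted word embeddings toward
    their nearest existing token embeddings. `vocab_emb` stands in for the
    CLIP text-encoder vocabulary; the contrastive formulation is assumed."""
    pred = F.normalize(pred_emb, dim=-1)    # (B, D) predicted tokens
    vocab = F.normalize(vocab_emb, dim=-1)  # (V, D) existing tokens

    sims = pred @ vocab.t() / temperature   # (B, V) scaled cosine similarities
    nearest = sims.argmax(dim=-1)           # index of each nearest token

    # InfoNCE-style objective: the nearest existing token acts as the
    # positive, all other vocabulary entries as negatives.
    return F.cross_entropy(sims, nearest)

# Usage sketch with random stand-ins for predicted and vocabulary embeddings.
pred_emb = torch.randn(8, 768, requires_grad=True)
vocab_emb = torch.randn(49408, 768)
loss = nearest_token_regularizer(pred_emb, vocab_emb)
loss.backward()
```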
Single Motion Diffusion
Raab, Sigal, Leibovitch, Inbal, Tevet, Guy, Arar, Moab, Bermano, Amit H., Cohen-Or, Daniel
Synthesizing realistic animations of humans, animals, and even imaginary creatures has long been a goal for artists and computer graphics professionals. Compared to the imaging domain, which is rich with large available datasets, the number of data instances for the motion domain is limited, particularly for the animation of animals and exotic creatures (e.g., dragons), which have unique skeletons and motion patterns. In this work, we present a Single Motion Diffusion Model, dubbed SinMDM, designed to learn the internal motifs of a single motion sequence with arbitrary topology and synthesize motions of arbitrary length that are faithful to them. We harness the power of diffusion models and present a denoising network explicitly designed for the task of learning from a single input motion. SinMDM has a lightweight architecture that avoids overfitting by using a shallow network with local attention layers that narrow the receptive field and encourage motion diversity. SinMDM can be applied in various contexts, including spatial and temporal in-betweening, motion expansion, style transfer, and crowd animation. Our results show that SinMDM outperforms existing methods in both quality and time-space efficiency. Moreover, while current approaches require additional training for different applications, our work facilitates these applications at inference time. Our code and trained models are available at https://sinmdm.github.io/SinMDM-page.
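As an illustration of attention with a narrowed receptive field, the module below restricts self-attention over a motion sequence to a local temporal window. The window size, feature dimensions, and masking scheme are illustrative assumptions, not SinMDM's actual architecture.

```python
import torch
import torch.nn as nn

class LocalTemporalAttention(nn.Module):
    """Self-attention restricted to a local temporal window, sketching the
    narrow-receptive-field attention described in the abstract."""

    def __init__(self, dim, n_heads=4, window=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.window = window

    def forward(self, x):
        # x: (batch, frames, dim) motion features.
        T = x.shape[1]
        idx = torch.arange(T, device=x.device)
        # Disallow attention between frames more than `window` steps apart.
        mask = (idx[None, :] - idx[:, None]).abs() > self.window
        out, _ = self.attn(x, x, x, attn_mask=mask)
        return out

# Usage sketch on a random single motion sequence.
layer = LocalTemporalAttention(dim=64)
motion = torch.randn(1, 120, 64)   # 120 frames, 64 features per frame
print(layer(motion).shape)         # torch.Size([1, 120, 64])
```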
Focus-and-Expand: Training Guidance Through Gradual Manipulation of Input Features
Arar, Moab, Fish, Noa, Daniel, Dani, Tenetov, Evgeny, Shamir, Ariel, Bermano, Amit
We present a simple and intuitive Focus-and-eXpand (\fax) method to guide the training process of a neural network towards a specific solution. Optimizing a neural network is a highly non-convex problem. Typically, the space of solutions is large, with numerous possible local minima, and reaching a specific minimum depends on many factors. In many cases, however, a solution that considers specific aspects, or features, of the input is desired. For example, in the presence of bias, a solution that disregards the biased feature is a more robust and accurate one. Drawing inspiration from parameter continuation methods, we propose steering the training process to consider specific features in the input more than others, through gradual shifts in the input domain. \fax extracts a subset of features from each input data point and exposes the learner to these features first, Focusing the solution on them. Then, using a blending/mixing parameter $\alpha$, it gradually eXpands the learning process to include all features of the input. This process encourages the consideration of the desired features more than others. Though not restricted to this field, we quantitatively evaluate the effectiveness of our approach on various computer vision tasks, achieving state-of-the-art bias removal, improvements to an established augmentation method, and improvements on two image classification tasks. Through these examples, we demonstrate the impact this approach can carry for a wide variety of problems that stand to gain from understanding the solution landscape.
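A minimal sketch of the focus-to-expand blending: early in training the network sees only a focused subset of input features, and the blending parameter $\alpha$ ramps toward 1 until the full input is used. The channel-masking rule and linear schedule below are assumptions made for illustration; the actual feature subset and schedule are task-specific.

```python
import torch

def focus_mask(x, focus_fraction=0.5):
    """Placeholder feature-extraction step: keep only a subset of input
    features (here, the lowest-index channels) and zero the rest. The
    actual feature subset used by the method is task-specific."""
    mask = torch.zeros_like(x)
    keep = int(x.shape[1] * focus_fraction)
    mask[:, :keep] = 1.0
    return x * mask

def focus_and_expand_input(x, step, total_steps):
    """Blend between the focused input and the full input as training
    progresses: alpha goes from 0 (focused features only) to 1 (all features)."""
    alpha = min(1.0, step / total_steps)
    return alpha * x + (1.0 - alpha) * focus_mask(x)

# Usage sketch: the blended input replaces the raw input during training.
x = torch.randn(16, 3, 32, 32)
x_early = focus_and_expand_input(x, step=0, total_steps=1000)     # focused subset
x_late = focus_and_expand_input(x, step=1000, total_steps=1000)   # full input
```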