kandinsky
Uncovering Conceptual Blindspots in Generative Image Models Using Sparse Autoencoders
Bohacek, Matyas, Fel, Thomas, Agrawala, Maneesh, Lubana, Ekdeep Singh
Despite their impressive performance, generative image models trained on large-scale datasets frequently fail to produce images with seemingly simple concepts -- e.g., human hands or objects appearing in groups of four -- that are reasonably expected to appear in the training data. These failure modes have largely been documented anecdotally, leaving open the question of whether they reflect idiosyncratic anomalies or more structural limitations of these models. To address this, we introduce a systematic approach for identifying and characterizing "conceptual blindspots" -- concepts present in the training data but absent or misrepresented in a model's generations. Our method leverages sparse autoencoders (SAEs) to extract interpretable concept embeddings, enabling a quantitative comparison of concept prevalence between real and generated images. We train an archetypal SAE (RA-SAE) on DINOv2 features with 32,000 concepts -- the largest such SAE to date -- enabling fine-grained analysis of conceptual disparities. Applied to four popular generative models (Stable Diffusion 1.5/2.1, PixArt, and Kandinsky), our approach reveals specific suppressed blindspots (e.g., bird feeders, DVD discs, and whitespaces on documents) and exaggerated blindspots (e.g., wood background texture and palm trees). At the individual datapoint level, we further isolate memorization artifacts -- instances where models reproduce highly specific visual templates seen during training. Overall, we propose a theoretically grounded framework for systematically identifying conceptual blindspots in generative models by assessing their conceptual fidelity with respect to the underlying data-generating process.
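The comparison of concept prevalence between real and generated images can be pictured with a small sketch. It is not the paper's metric, which the abstract does not specify: the thresholded firing rate, the smoothed log-ratio score, the `margin` cutoff, and all function names below are assumptions chosen for illustration. The activation matrices stand in for SAE concept codes, with one row per image and one column per concept.

```python
import math


def concept_prevalence(activations, threshold=0.0):
    """Fraction of images in which each concept fires above `threshold`."""
    n_images = len(activations)
    n_concepts = len(activations[0])
    counts = [0] * n_concepts
    for row in activations:
        for k, value in enumerate(row):
            if value > threshold:
                counts[k] += 1
    return [c / n_images for c in counts]


def blindspot_scores(real_acts, gen_acts, eps=1e-6):
    """Per-concept log-ratio of generated to real prevalence (hypothetical score).

    Negative values mean the concept is rarer in generations than in real data;
    positive values mean it is more common. `eps` avoids division by zero.
    """
    real = concept_prevalence(real_acts)
    gen = concept_prevalence(gen_acts)
    return [math.log((g + eps) / (r + eps)) for r, g in zip(real, gen)]


def classify(scores, margin=1.0):
    """Label each concept using an assumed symmetric cutoff on the score."""
    labels = []
    for s in scores:
        if s < -margin:
            labels.append("suppressed")
        elif s > margin:
            labels.append("exaggerated")
        else:
            labels.append("balanced")
    return labels


# Toy data: 4 real images and 4 generated images over 3 concepts.
real_acts = [[1, 0, 1], [1, 0, 1], [1, 1, 1], [1, 0, 1]]
gen_acts = [[0, 1, 1], [0, 1, 1], [0, 1, 1], [1, 1, 1]]
labels = classify(blindspot_scores(real_acts, gen_acts))
```

In this toy setup the first concept fires in every real image but only a quarter of the generated ones, so it is labeled suppressed; the second shows the reverse pattern and is labeled exaggerated.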
Human Aesthetic Preference-Based Large Text-to-Image Model Personalization: Kandinsky Generation as an Example
Zhou, Aven-Le, Wang, Yu-Ao, Wu, Wei, Zhang, Kang
With the advancement of neural generative capabilities, the art community has actively embraced GenAI (generative artificial intelligence) for creating painterly content. Large text-to-image models can quickly generate aesthetically pleasing outcomes. However, the process can be non-deterministic and often involves tedious trial-and-error, as users struggle to formulate effective prompts that achieve their desired results. This paper introduces a prompting-free generative approach that lets users automatically generate personalized painterly content that incorporates their aesthetic preferences in a customized artistic style. The approach uses "semantic injection" to customize an artist model in a specific artistic style, and then applies a genetic algorithm to optimize the prompt generation process through real-time iterative human feedback. By relying solely on the user's aesthetic evaluation of, and preference for, the images generated by the artist model, this approach gives the user a personalized model that encompasses their aesthetic preferences and the customized artistic style.
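A genetic algorithm driven by human ratings can be sketched roughly as follows. This is a generic illustration, not the authors' procedure: the token-list prompt representation, the `rate` callback (a stand-in for a person scoring the image generated from a prompt), the single-point crossover, the elitism step, and every parameter value are assumptions.

```python
import random


def evolve_prompts(vocabulary, rate, population_size=8, prompt_length=4,
                   generations=5, mutation_rate=0.2, seed=0):
    """Evolve token-list prompts toward higher scores from `rate`.

    `rate(prompt)` is a hypothetical callback returning a numeric preference
    score; in an interactive setting it would be a human judgment.
    """
    rng = random.Random(seed)
    population = [[rng.choice(vocabulary) for _ in range(prompt_length)]
                  for _ in range(population_size)]
    best_history = []
    for _ in range(generations):
        # Score each candidate once, then rank from best to worst.
        scored = sorted(((rate(p), p) for p in population),
                        key=lambda pair: pair[0], reverse=True)
        best_history.append(scored[0][0])
        parents = [p for _, p in scored[: population_size // 2]]
        # Elitism: carry the top-rated prompt over unchanged.
        children = [list(scored[0][1])]
        while len(children) < population_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, prompt_length)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: occasionally swap a token for a random one.
            child = [rng.choice(vocabulary) if rng.random() < mutation_rate else t
                     for t in child]
            children.append(child)
        population = children
    best = max(population, key=rate)
    return best, best_history
```

Because the top-rated prompt is always carried into the next generation, the best score recorded per generation cannot decrease as long as `rate` is consistent. Real human ratings are noisy, so that guarantee would only hold approximately in practice.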
Brain implants turn imagined handwriting into text on a screen / Humans + Tech - #80
If you've never heard colours, you can now do so. Researchers planted tiny electrodes on the surface of the brain of a man paralysed from the neck down. As he imagined writing letters with his hand, the researchers analysed the neural patterns for each letter. They created an algorithm that transformed these neural patterns into words on a screen [Anushree Dave, ScienceNews]. From his brain activity alone, the participant produced 90 characters, or 15 words, per minute, Krishna Shenoy, a Howard Hughes Medical Institute investigator at Stanford University, and colleagues report May 12 in Nature.
Google tries to replicate synesthesia with its latest experiment
Google Arts & Culture has teamed up with the Centre Pompidou, a cultural complex in Paris, to pay tribute to Vassily Kandinsky with a virtual exhibition of the artist's works and other documents. You can view some of Kandinsky's pieces in an augmented reality gallery. At the heart of the exhibit is a machine learning experiment that tries to replicate synesthesia, a condition the abstract art pioneer had. In a nutshell, synesthesia turns information that stimulates one of your senses into a multi-sensory experience. For some people (including Kandinsky, Billie Eilish and Pharrell Williams), the condition deepens the association between colors and sounds or moods.
The new tool in the art of spotting forgeries: artificial intelligence
In late March, a judge in Wiesbaden, Germany, found herself playing the uncomfortable role of art critic. On trial before her were two men accused of forging paintings by artists including Kazimir Malevich and Wassily Kandinsky, whose angular, abstract compositions can now go for eight-figure prices. The case had been in progress for three and a half years and was seen by many as a test. A successful prosecution could help end an epidemic of forgeries – so-called miracle pictures that appear from nowhere – that have been plaguing the market in avant-garde Russian art. But as the trial reached its climax, it disintegrated into farce.